Wu, Sherry (2015) Comprehensive Bioinformatic Analysis of Glycoside Hydrolase Family 10 Proteins. Masters thesis, Concordia University.
Text (application/pdf): Wu_MSc_S2015.pdf (4MB), Accepted Version
Other (application/vnd.openxmlformats-officedocument.spreadsheetml.sheet): Wu_MSc_Thesis_Supplementary_files.xlsx (99kB), Accepted Version
Abstract
Glycoside Hydrolase Family 10 (GH10) contains endo-1,4-β-xylanases, which catalyze the hydrolysis of xylan, the most abundant hemicellulose in lignocellulosic biomass. In this study, different bioinformatic approaches were used to comprehensively analyze the distribution, phylogeny, function, and evolutionary origin of a large GH10 protein dataset. The goal was to explore the correlation between sequence similarity and function of GH10 proteins to better understand xylan utilization patterns within the family.
Predicted GH10 sequences from fungal, bacterial, archaeal, and non-fungal eukaryotic genomes, together with biochemically characterized proteins, were used to perform a phylogenetic analysis. Based on the tree topology, 626 GH10 sequences were classified into 50 well-supported subfamilies. Among the analyzed sequences, 42 remained unclustered. The complex topology of the family tree suggests multiple duplication events followed by lineage-specific gene loss during evolution. In addition, the Maximum Likelihood phylogeny of GH10 proteins does not mirror the previously established species taxonomic tree, suggesting that the divergence of the GH10 family ancestral gene preceded the appearance of the eukaryotic lineages.
A set of non-fungal GH10 proteins was manually curated using the criteria of mycoCLAP, a database of biochemically characterized fungal lignocellulose-active enzymes. Experimental data for biochemically characterized GH10 proteins were mapped onto the phylogenetic tree to establish relationships, if any, between biochemical properties and sequence similarity. Only 24 of the 50 subfamilies contain biochemically characterized members, demonstrating that 26 phylogenetically diverse subfamilies remain uncharacterized. Among the subfamilies with experimental data, a distantly related subfamily with tomatinase activity was identified. By comparing the tertiary structures of well-characterized subfamilies, I have identified subfamilies that display different xylan substrate preferences and hydrolysis patterns. Correlations were also observed between sequence similarity and the pH and/or temperature optimum in the GH10 family. The accumulation of mutations within subfamilies reflects how they have diverged over time. Subfamily-discriminating residue analyses were performed to identify subfamily-specific polymorphisms, and detailed lists of these residues are provided. Based on alignment to 3D structures, the majority of these residues are involved in secondary structure formation, suggesting they might be functionally and structurally important.
Divisions: | Concordia University > Faculty of Arts and Science > Biology |
---|---|
Item Type: | Thesis (Masters) |
Authors: | Wu, Sherry |
Institution: | Concordia University |
Degree Name: | M. Sc. |
Program: | Biology |
Date: | January 2015 |
Thesis Supervisor(s): | Tsang, Adrian |
ID Code: | 979804 |
Deposited By: | SHERRY WU |
Deposited On: | 13 Jul 2015 16:00 |
Last Modified: | 18 Jan 2018 17:50 |