
Comprehensive Bioinformatic Analysis of Glycoside Hydrolase Family 10 Proteins



Wu, Sherry (2015) Comprehensive Bioinformatic Analysis of Glycoside Hydrolase Family 10 Proteins. Masters thesis, Concordia University.

Text (application/pdf)
Wu_MSc_S2015.pdf - Accepted Version
Other (application/vnd.openxmlformats-officedocument.spreadsheetml.sheet)
Wu_MSc_Thesis_Supplementary_files.xlsx - Accepted Version


Glycoside Hydrolase Family 10 (GH10) contains endo-1,4-β-xylanases, which catalyze the hydrolysis of xylan, the most abundant hemicellulose in lignocellulosic biomass. In this study, different bioinformatic approaches were used to comprehensively analyze the distribution, phylogeny, function, and evolutionary origin of a large GH10 protein dataset. The goal was to explore the correlation between sequence similarity and function of GH10 proteins to better understand xylan utilization patterns within the family.
Predicted GH10 sequences from fungal, bacterial, archaeal, and non-fungal eukaryotic genomes, as well as biochemically characterized proteins, were used to perform a phylogenetic analysis. Based on the tree topology, 626 GH10 sequences were classified into 50 well-supported subfamilies. Among the analyzed sequences, 42 remained unclustered. The complex topology of the family tree suggests multiple duplication events followed by lineage-specific gene loss during evolution. In addition, the Maximum Likelihood phylogeny of GH10 proteins does not mirror the previously established species taxonomic tree, suggesting that the divergence of the GH10 family ancestral gene preceded the appearance of the eukaryotic lineages.
A set of non-fungal GH10 proteins was manually curated using the criteria employed in mycoCLAP, a database of biochemically characterized fungal lignocellulose-active enzymes. Experimental data for biochemically characterized GH10 proteins were mapped onto the phylogenetic tree to establish relationships, if any, between biochemical properties and sequence similarity. Only 24 subfamilies contain biochemically characterized members, indicating that 26 phylogenetically diverse subfamilies remain uncharacterized. Among the subfamilies with experimental data, a distantly related subfamily with tomatinase activity was identified. By comparing the tertiary structures of well-characterized subfamilies, I have identified subfamilies that display different xylan substrate preferences and hydrolysis patterns. Correlations were also observed between sequence similarity and the pH and/or temperature optimum in the GH10 family. The accumulation of mutations within subfamilies reflects how they have diverged over time. Subfamily-discriminating residue analyses were performed to identify subfamily-specific polymorphisms. Detailed lists of subfamily-discriminating residues are provided. Based on alignment to 3D structures, the majority of these residues are involved in secondary structure formation, suggesting that they may be functionally and structurally important.
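As an illustration of what a subfamily-discriminating residue analysis can look like, the sketch below scans a multiple sequence alignment column by column. It uses a deliberately simple definition that is an assumption made here, not the criterion used in the thesis: a position is reported for a subfamily when every member carries the same residue there and that residue occurs in no other subfamily at that position. The function name, the input dictionaries, and the toy sequences are all hypothetical.

```python
from collections import defaultdict


def discriminating_residues(alignment, subfamily_of):
    """Return {subfamily: [(column_index, residue), ...]}.

    alignment:    dict mapping sequence id -> aligned sequence (equal lengths)
    subfamily_of: dict mapping sequence id -> subfamily label
    Simplified rule (an assumption, not the thesis method): a column counts
    when one subfamily is fully conserved for a non-gap residue that is
    absent from every other subfamily in that column.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    n_cols = lengths.pop()

    # Group the aligned sequences by their subfamily label.
    groups = defaultdict(list)
    for seq_id, seq in alignment.items():
        groups[subfamily_of[seq_id]].append(seq)

    result = {label: [] for label in groups}
    for col in range(n_cols):
        # Set of residues observed in each subfamily at this column.
        seen = {label: {s[col] for s in seqs} for label, seqs in groups.items()}
        for label, residues in seen.items():
            if len(residues) != 1:
                continue  # not conserved within this subfamily
            (residue,) = residues
            if residue == "-":
                continue  # ignore all-gap columns
            others = set().union(*(r for l, r in seen.items() if l != label))
            if residue not in others:
                result[label].append((col, residue))
    return result


# Toy alignment with two made-up subfamilies, A and B.
toy_alignment = {"a1": "MKWE", "a2": "MKWD", "b1": "MRFE", "b2": "MRFE"}
toy_subfamilies = {"a1": "A", "a2": "A", "b1": "B", "b2": "B"}
toy_result = discriminating_residues(toy_alignment, toy_subfamilies)
print(toy_result)
```

In the toy data, columns 1 and 2 separate the two groups, column 0 is shared by both, and column 3 is variable within subfamily A. A real analysis would typically tolerate some within-subfamily variation and map the reported columns onto a reference 3D structure.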

Divisions: Concordia University > Faculty of Arts and Science > Biology
Item Type: Thesis (Masters)
Authors: Wu, Sherry
Institution: Concordia University
Degree Name: M. Sc.
Date: January 2015
Thesis Supervisor(s): Tsang, Adrian
ID Code: 979804
Deposited By: SHERRY WU
Deposited On: 13 Jul 2015 16:00
Last Modified: 18 Jan 2018 17:50
All items in Spectrum are protected by copyright, with all rights reserved. The use of items is governed by Spectrum's terms of access.
