Strasser, Kimchi-Audrey, McDonnell, Erin, Nyaga, Carol, Wu, Min, Wu, Sherry, Almeida, Hayda, Meurs, Marie-Jean, Kosseim, Leila, Powlowski, Justin, Butler, Greg and Tsang, Adrian (2015) mycoCLAP, the database for characterized lignocellulose-active proteins of fungal origin: resource and text mining curation support. Database, 2015. bav008. ISSN 1758-0463
Text (application/pdf): butler-database-2015.pdf (1MB), Published Version. Available under License Spectrum Terms of Access.
Official URL: http://dx.doi.org/10.1093/database/bav008
Abstract
Enzymes active on components of lignocellulosic biomass are used for industrial applications ranging from food processing to biofuels production. These include a diverse array of glycoside hydrolases, carbohydrate esterases, polysaccharide lyases and oxidoreductases. Fungi are prolific producers of these enzymes, spurring fungal genome sequencing efforts to identify and catalogue the genes that encode them. To facilitate the functional annotation of these genes, biochemical data on over 800 fungal lignocellulose-degrading enzymes have been collected from the literature and organized into the searchable database, mycoCLAP (http://mycoclap.fungalgenomics.ca). First implemented in 2011, and updated as described here, mycoCLAP is capable of ranking search results according to closest biochemically characterized homologues: this improves the quality of the annotation, and significantly decreases the time required to annotate novel sequences. The database is freely available to the scientific community, as are the open source applications based on natural language processing developed to support the manual curation of mycoCLAP. Database URL: http://mycoclap.fungalgenomics.ca.
Divisions: | Concordia University > Research Units > Centre for Structural and Functional Genomics |
---|---|
Item Type: | Article |
Refereed: | Yes |
Authors: | Strasser, Kimchi-Audrey and McDonnell, Erin and Nyaga, Carol and Wu, Min and Wu, Sherry and Almeida, Hayda and Meurs, Marie-Jean and Kosseim, Leila and Powlowski, Justin and Butler, Greg and Tsang, Adrian |
Journal or Publication: | Database |
Date: | 2015 |
Funders: | |
Digital Object Identifier (DOI): | 10.1093/database/bav008 |
ID Code: | 982241 |
Deposited By: | Danielle Dennie |
Deposited On: | 17 Mar 2017 20:53 |
Last Modified: | 18 Jan 2018 17:54 |