Almeida, Hayda, Meurs, Marie-Jean, Kosseim, Leila, Butler, Greg and Tsang, Adrian (2014) Machine Learning for Biomedical Literature Triage. PLoS ONE.
Text (application/pdf): journal.pone.0115892.pdf (593kB), Published Version. Available under License Creative Commons Attribution.
Official URL: http://dx.doi.org/10.1371/journal.pone.0115892
Abstract
This paper presents a machine learning system to support triage, the first task in the manual curation of biological literature. We compare the performance of various classification models by experimenting with dataset sampling factors, a set of features, and three machine learning algorithms (Naive Bayes, Support Vector Machine, and Logistic Model Trees). The results show that the model best suited to the imbalanced datasets of the triage classification task combines domain-relevant features, an under-sampling technique, and the Logistic Model Trees algorithm.
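As a rough illustration of the under-sampling idea mentioned in the abstract, the sketch below randomly discards majority-class examples until the class ratio reaches a chosen factor. It is a generic random under-sampling routine, not the authors' procedure: the function name `undersample`, the `factor` parameter, the fixed seed, and the toy labels are all assumptions made for this example.

```python
import random
from collections import Counter


def undersample(labels, factor=1.0, seed=0):
    """Return sorted indices that keep every minority-class item and at most
    factor * (minority count) randomly chosen items from the other class(es).

    Hypothetical helper for illustration; not taken from the paper.
    """
    counts = Counter(labels)
    minority = min(counts, key=counts.get)  # least frequent label
    rng = random.Random(seed)  # fixed seed so the sample is reproducible

    keep = [i for i, y in enumerate(labels) if y == minority]
    majority_idx = [i for i, y in enumerate(labels) if y != minority]

    # Never ask for more majority items than actually exist.
    n_keep = min(len(majority_idx), int(factor * len(keep)))
    keep.extend(rng.sample(majority_idx, n_keep))
    return sorted(keep)


# Toy, made-up triage labels: 2 relevant documents among 10.
labels = ["relevant"] * 2 + ["irrelevant"] * 8
balanced = undersample(labels, factor=1.0)
balanced_counts = Counter(labels[i] for i in balanced)
```

With `factor=1.0` the retained set has as many irrelevant documents as relevant ones; a larger factor keeps more of the majority class. The resulting indices would then be used to select the training instances passed to whichever classifier is being evaluated.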
Divisions: | Concordia University > Gina Cody School of Engineering and Computer Science > Computer Science and Software Engineering; Concordia University > Research Units > Centre for Structural and Functional Genomics |
---|---|
Item Type: | Article |
Refereed: | Yes |
Authors: | Almeida, Hayda and Meurs, Marie-Jean and Kosseim, Leila and Butler, Greg and Tsang, Adrian |
Journal or Publication: | PLoS ONE |
Date: | 2014 |
Digital Object Identifier (DOI): | 10.1371/journal.pone.0115892 |
Keywords: | support vector machines, machine learning algorithms, logistic model trees, fungi, database searching, enzyme, machine learning, triage, biocuration, imbalanced dataset |
ID Code: | 979710 |
Deposited By: | MARIE-JEAN MEURS |
Deposited On: | 24 Feb 2015 17:43 |
Last Modified: | 18 Jan 2018 17:49 |