Ye, Qing (2019) Classifying Transport Proteins Using Profile Hidden Markov Models and Specificity Determining Sites. Masters thesis, Concordia University.
Text (application/pdf): YeQing_MCompSc_S2019.pdf (1MB), Accepted Version
Abstract
This thesis develops methods to classify the substrates transported across a membrane by a given transmembrane protein. Our methods compute a multiple sequence alignment (MSA), use tools that predict specificity determining sites (SDS) from it, and then build a profile Hidden Markov Model (HMM) using HMMER. In bioinformatics, HMMER is a widely used set of applications for sequence analysis based on profile HMMs. SDS are the key positions in a protein sequence that play a crucial role in functional variation within a protein family during the course of evolution.
We have established a classification pipeline that integrates data processing, model building, and model evaluation. The pipeline comprises similarity search, multiple sequence alignment, specificity determining site prediction, and construction of a profile Hidden Markov Model.
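As a rough illustration of the alignment and model-building steps of the pipeline described above, the sketch below assembles the external commands for one substrate class without running them. It assumes MUSCLE 3.x (`-in`/`-out`) and HMMER's `hmmbuild` are installed; the function name `plan_pipeline`, the file layout, and the `.afa`/`.hmm` extensions are hypothetical. The similarity search and SDS prediction steps are left out because the abstract does not say how those tools are invoked.

```python
from pathlib import Path
from typing import List


def plan_pipeline(family_fasta: str, workdir: str = "work") -> List[List[str]]:
    """Return the commands for one substrate class, in execution order.

    Nothing is executed here; each list can be passed to subprocess.run.
    File names are illustrative assumptions, not taken from the thesis.
    """
    work = Path(workdir)
    stem = Path(family_fasta).stem
    msa = (work / f"{stem}.afa").as_posix()   # aligned FASTA from MUSCLE
    hmm = (work / f"{stem}.hmm").as_posix()   # profile HMM from hmmbuild
    return [
        # Multiple sequence alignment (MUSCLE 3.x syntax;
        # MUSCLE 5 uses -align/-output instead).
        ["muscle", "-in", family_fasta, "-out", msa],
        # Profile HMM construction: hmmbuild <hmmfile_out> <msafile>.
        ["hmmbuild", hmm, msa],
    ]
```

Returning the commands as data rather than running them keeps the sketch easy to inspect and lets a different MSA tool be swapped in by changing a single entry.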
We comprehensively tested and analyzed different combinations of MSA and SDS tools in our pipeline. The best-performing combination was MUSCLE with Xdet; its overall average Matthews Correlation Coefficient (MCC) across the seven substrate classes of the dataset was 0.71, which outperforms the state of the art.
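For readers unfamiliar with the metric, the sketch below shows one common way an average MCC over several classes can be computed: each class is scored one-vs-rest and the per-class values are averaged with equal weight. Whether the thesis averages in exactly this way is an assumption, and the function names and toy labels are hypothetical.

```python
import math
from typing import List, Sequence


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews Correlation Coefficient from binary confusion counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal count is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def average_mcc(y_true: Sequence[str], y_pred: Sequence[str],
                classes: List[str]) -> float:
    """Unweighted mean of one-vs-rest MCC over the given classes (assumed scheme)."""
    scores = []
    for c in classes:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == c and p == c for t, p in pairs)
        tn = sum(t != c and p != c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        scores.append(mcc(tp, tn, fp, fn))
    return sum(scores) / len(scores)
```

MCC ranges from -1 to 1, with 0 corresponding to chance-level prediction, which makes it less sensitive to class imbalance than plain accuracy.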
Divisions: | Concordia University > Gina Cody School of Engineering and Computer Science > Computer Science and Software Engineering |
---|---|
Item Type: | Thesis (Masters) |
Authors: | Ye, Qing |
Institution: | Concordia University |
Degree Name: | M.A. Sc. |
Program: | Individualized Program |
Date: | April 2019 |
Thesis Supervisor(s): | Butler, Gregory |
ID Code: | 985327 |
Deposited By: | QING YE |
Deposited On: | 27 Oct 2022 13:49 |
Last Modified: | 27 Oct 2022 13:49 |