Alballa, Munira (2020) Predicting Transporter Proteins and Their Substrate Specificity. PhD thesis, Concordia University.
Text (application/pdf), 5MB: Alballa_PhD_F2020.pdf - Accepted Version. Available under License Spectrum Terms of Access.
Abstract
The publication of numerous genome projects has resulted in an abundance of protein sequences, a significant number of which are still unannotated. Membrane proteins such as transporters, receptors, and enzymes are among the least characterized proteins due to their hydrophobic surfaces and lack of conformational stability. This research aims to build a proteome-wide system to determine transporter substrate specificity, which involves three phases: 1) distinguishing membrane proteins, 2) differentiating transporters from other functional types of membrane proteins, and 3) detecting the substrate specificity of the transporters.
To distinguish membrane from non-membrane proteins, we propose a novel tool, TooT-M, that combines the predictions from transmembrane topology prediction tools and a selective set of classifiers where protein samples are represented by pseudo position-specific scoring matrix (Pse-PSSM) vectors. The results suggest that the proposed tool outperforms all state-of-the-art methods in terms of the overall accuracy and Matthews correlation coefficient (MCC).
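As an illustration of the Pse-PSSM representation mentioned above, the following minimal sketch builds a fixed-length vector from a position-specific scoring matrix using the commonly cited formulation: the 20 column means, followed by, for each lag, the mean squared difference between rows that are that many positions apart. The function name `pse_pssm` and the parameter `max_lag` are hypothetical, and the abstract does not state the normalization or lag values used by TooT-M, so this is not the thesis implementation.

```python
from typing import List


def pse_pssm(pssm: List[List[float]], max_lag: int = 1) -> List[float]:
    """Build a Pse-PSSM-style vector from an L x 20 scoring matrix.

    Assumption: the matrix is already normalized; the normalization
    used in the thesis is not given in the abstract.
    """
    length = len(pssm)
    if length <= max_lag:
        raise ValueError("sequence must be longer than max_lag")
    width = len(pssm[0])

    # Part 1: average score of each amino-acid column over the sequence.
    features = [sum(row[j] for row in pssm) / length for j in range(width)]

    # Part 2: for each lag, the mean squared difference between the
    # scores of positions that are `lag` residues apart, per column.
    for lag in range(1, max_lag + 1):
        for j in range(width):
            total = sum(
                (pssm[i][j] - pssm[i + lag][j]) ** 2
                for i in range(length - lag)
            )
            features.append(total / (length - lag))
    return features
```

For a 20-column matrix the result has 20 + 20 × `max_lag` components regardless of sequence length, which is what allows proteins of different lengths to be fed to the same classifier.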
To distinguish transporters from other proteins, we propose an ensemble classifier, TooT-T, that is trained to optimally combine the predictions from homology annotation transfer and machine learning methods. The homology annotation transfer components detect transporters by searching against the Transporter Classification Database (TCDB) using different thresholds. The machine learning methods include three models in which the protein sequences are encoded using a novel encoding, psi-composition. The results show that TooT-T outperforms all state-of-the-art de novo transporter predictors in terms of overall accuracy and MCC.
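For reference, the two evaluation measures named above can be computed from a binary confusion matrix as in the sketch below. The function names `accuracy` and `mcc` are illustrative, and returning 0 when the MCC denominator is zero is a common convention assumed here, not something stated in the abstract.

```python
import math


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all samples that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient for a binary confusion matrix."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    # Assumed convention: report 0 when any marginal total is empty.
    if denominator == 0:
        return 0.0
    return numerator / denominator
```

MCC ranges from -1 (total disagreement) to 1 (perfect prediction) and, unlike accuracy, stays informative when the two classes are of very different sizes.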
To detect the substrate specificity of a transporter, we propose a novel tool, TooT-SC, that combines compositional, evolutionary, and positional information to represent protein samples. TooT-SC can efficiently classify transport proteins into eleven classes according to their transported substrate, which is the highest number of predicted substrates offered by any de novo prediction tool. Our results indicate that TooT-SC significantly outperforms all of the state-of-the-art methods. Further analysis of the locations of the informative positions reveals that there are more statistically significant informative positions in the transmembrane segments (TMSs) than in the non-TMSs, and that more statistically significant informative positions occur close to the TMSs than in regions far from them.
Divisions: | Concordia University > Gina Cody School of Engineering and Computer Science > Computer Science and Software Engineering |
---|---|
Item Type: | Thesis (PhD) |
Authors: | Alballa, Munira |
Institution: | Concordia University |
Degree Name: | Ph. D. |
Program: | Computer Science |
Date: | 30 April 2020 |
Thesis Supervisor(s): | Butler, Gregory |
ID Code: | 986941 |
Deposited By: | MUNIRA AL-BALLA |
Deposited On: | 25 Nov 2020 16:15 |
Last Modified: | 25 Nov 2020 16:15 |