Jamali, Saeedeh (2024) A Deep Few-Shot Network for Protein Family Classification. Masters thesis, Concordia University.
Text (application/pdf): Jamali_MSc_S2024.pdf (1MB) - Accepted Version. Available under License Spectrum Terms of Access.
Abstract
Protein sequence analysis is a challenging problem in modern bioinformatics, with applications in disease research, precision medicine, and therapeutics. Given the emergence of sequencing technologies and the resulting large-scale databases, protein family classification remains an open problem in bioinformatics. Recent advances in computer science have opened new doors for researchers in various scientific domains. Bioinformatics, as an interdisciplinary research field, takes advantage of these advances, from conventional machine learning methods to large language models and biostatistics. The machine learning techniques used for protein family classification depend on domain experts to generate features, which can be time-consuming and challenging. Deep learning algorithms have shown promising results in proteomics; however, their application is limited by the availability of massive data sets for training. Since the required data comes from experiments, it can be highly complex or incomplete. As an alternative, few-shot models can learn and generalize from a few observations. To address these limitations, in this research we designed and implemented a deep few-shot network for protein family classification, and our results showed that it outperforms state-of-the-art baseline models. To the best of our knowledge, this is the first deep network tailored for primary sequence family classification that can perform well with a very limited number of observations.
Divisions: | Concordia University > Faculty of Arts and Science > Mathematics and Statistics |
---|---|
Item Type: | Thesis (Masters) |
Authors: | Jamali, Saeedeh |
Institution: | Concordia University |
Degree Name: | M. Sc. |
Program: | Mathematics |
Date: | 25 March 2024 |
Thesis Supervisor(s): | Chaubey, Yogendra P. and Ebadi, Ashkan |
ID Code: | 993700 |
Deposited By: | Saeedeh Jamali |
Deposited On: | 05 Jun 2024 16:27 |
Last Modified: | 05 Jun 2024 16:27 |