Davari, MohammadReza (2020) Neural Network Approaches to Medical Toponym Recognition. Masters thesis, Concordia University.
Text (application/pdf): Davari_MCompSc_S2020.pdf (16MB) - Accepted Version. Available under License Spectrum Terms of Access.
Abstract
Toponym identification, or place name recognition, within epidemiology articles is a crucial task for phylogeographers, as it allows them to analyze the development, spread, and migration of viruses. Although public databases such as GenBank (Benson et al., November 2012) contain geographical information, this information is typically restricted to the country and state levels. To identify more fine-grained location information, epidemiologists need to read relevant scientific articles and manually extract place name mentions.
In this thesis, we investigate the use of various neural network architectures and language representations to automatically segment and label toponyms within biomedical texts. We demonstrate how our language-model-based toponym recognizer, which relies on the transformer architecture, can achieve state-of-the-art performance. The model uses pre-trained BERT as its backbone and is fine-tuned on datasets from two domains (general articles and medical articles) in order to measure the generalizability of the approach and the effect of cross-domain transfer learning.
Using BERT as the backbone resulted in a large, highly parameterized model (340M parameters). To obtain a lighter model architecture, we experimented with parameter pruning techniques, specifically the Lottery Ticket Hypothesis (LTH) (Frankle and Carbin, May 2019). However, as indicated by Frankle and Carbin (May 2019), their pruning technique does not scale well to highly parameterized models and loses stability. We proposed a novel technique that augments LTH to increase its scalability and stability on highly parameterized models such as BERT, and we tested it on the toponym identification task.
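As a rough illustration of the baseline pruning procedure mentioned above, the following is a minimal sketch of iterative magnitude pruning in the style of Frankle and Carbin: train, remove the smallest-magnitude surviving weights, reset the survivors to their initial values, and repeat. It is not the augmented technique proposed in the thesis, which this abstract does not describe. The names `prune_smallest`, `find_ticket`, and `toy_train` are hypothetical, and `toy_train` is a stand-in for real training that simply pulls each unmasked weight toward a fixed target.

```python
from typing import Callable, List, Sequence

Weights = List[float]
Mask = List[int]  # 1 = weight kept, 0 = weight pruned


def prune_smallest(weights: Weights, mask: Mask, fraction: float) -> Mask:
    """Prune the given fraction of the still-active weights, smallest magnitude first."""
    active = [i for i, m in enumerate(mask) if m == 1]
    n_prune = int(len(active) * fraction)
    to_prune = sorted(active, key=lambda i: abs(weights[i]))[:n_prune]
    new_mask = list(mask)
    for i in to_prune:
        new_mask[i] = 0
    return new_mask


def find_ticket(
    init: Weights,
    train: Callable[[Weights, Mask], Weights],
    rounds: int,
    fraction: float,
) -> Mask:
    """Run `rounds` of train -> prune -> reset-to-initialization."""
    mask = [1] * len(init)
    for _ in range(rounds):
        # Every round restarts from the ORIGINAL initial weights, with the mask applied.
        start = [w * m for w, m in zip(init, mask)]
        trained = train(start, mask)
        mask = prune_smallest(trained, mask, fraction)
    return mask


def toy_train(
    start: Weights,
    mask: Mask,
    target: Sequence[float] = (0.1, -2.0, 0.05, 1.5, -0.3, 3.0, 0.0, -1.0),
    lr: float = 0.5,
    steps: int = 50,
) -> Weights:
    """Hypothetical stand-in for training: gradient descent on 0.5 * (w - target)^2."""
    w = list(start)
    for _ in range(steps):
        # Multiplying by the mask keeps pruned weights at exactly zero.
        w = [(wi - lr * (wi - ti)) * m for wi, ti, m in zip(w, target, mask)]
    return w


if __name__ == "__main__":
    init_weights = [0.5] * 8
    ticket = find_ticket(init_weights, toy_train, rounds=2, fraction=0.25)
    print(ticket)
```

In a real setting the mask would be applied per layer (or globally) to the tensors of a network such as BERT, and `train` would be a full fine-tuning run; the sketch only shows the control flow of the prune-and-reset loop.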
The model was evaluated on a collection of 105 epidemiology articles from PubMed Central (Weissenbacher et al., June 2015). Our proposed model significantly improves on the previous state of the art, achieving an F-measure of 90.85% compared to 89.13%.
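For readers unfamiliar with the metric, the sketch below shows one common way an F-measure is computed for a segment-and-label task: predicted and gold BIO tag sequences are converted to token spans, and precision and recall are computed over exactly matching spans. The strict-match criterion, the single-class `B`/`I`/`O` tag set, the function names, and the example sentence are all assumptions made for illustration; the abstract does not specify the matching protocol used in the thesis.

```python
from typing import List, Set, Tuple

Span = Tuple[int, int]  # (start, end) token indices, end exclusive


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Convert a B/I/O tag sequence into a set of toponym spans."""
    spans: Set[Span] = set()
    start = None
    for i, tag in enumerate(tags):
        if tag == "B" or (tag == "I" and start is None):
            # "B" closes any open span and opens a new one;
            # a stray "I" with no open span is leniently treated as a start.
            if start is not None:
                spans.add((start, i))
            start = i
        elif tag == "O":
            if start is not None:
                spans.add((start, i))
            start = None
    if start is not None:
        spans.add((start, len(tags)))
    return spans


def f_measure(gold: Set[Span], pred: Set[Span]) -> float:
    """Balanced F-measure (F1) over exactly matching spans."""
    tp = len(gold & pred)
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Made-up sentence: "Samples were collected in Ho Chi Minh City and Hanoi ."
    gold_tags = ["O", "O", "O", "O", "B", "I", "I", "I", "O", "B", "O"]
    pred_tags = ["O", "O", "O", "O", "B", "I", "I", "O", "O", "B", "O"]
    print(f_measure(bio_to_spans(gold_tags), bio_to_spans(pred_tags)))
```

In this made-up example the prediction truncates "Ho Chi Minh City" by one token, so under strict matching only "Hanoi" counts as correct, giving precision and recall of 0.5 each.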
Divisions: | Concordia University > Gina Cody School of Engineering and Computer Science > Computer Science and Software Engineering |
---|---|
Item Type: | Thesis (Masters) |
Authors: | Davari, MohammadReza |
Institution: | Concordia University |
Degree Name: | M. Comp. Sc. |
Program: | Computer Science |
Date: | 3 April 2020 |
Thesis Supervisor(s): | Bui, Tien D. and Kosseim, Leila |
Keywords: | Toponym Identification, Deep Learning, BERT, Attention, Transformers, Transfer Learning, Cross-domain Learning, Parameter Pruning |
ID Code: | 986657 |
Deposited By: | MohammadReza Davari |
Deposited On: | 26 Jun 2020 13:04 |
Last Modified: | 26 Jun 2020 13:04 |