Lathiff, Fathima Nihatha Abdul (2022) Dependency Encoding for Relation Extraction. Masters thesis, Concordia University.
Text (application/pdf): Lathiff_MCompSc_F2022.pdf - Accepted Version (2MB)
Abstract
The surge of information in the form of textual data demands automated systems that extract structured information from unstructured data. Relation extraction plays a key role in this process, with the aim of extracting semantic relations between entities in a text. Since dependency parse trees capture the grammatical structure of sentences, this thesis experiments with different encodings of the dependency parse tree to distinguish different semantic relationships. Experiments are conducted on three data sets that vary in domain and complexity, using encoding schemas that fall into two groups. The first group focuses on encoding the structure of the dependency parse tree with a Deep Graph Convolution Neural Network (DGCNN). The second group focuses on encoding the linguistic features obtained from the dependency parse tree with classical machine learning models such as Random Forest, Support Vector Machine, and Feed-Forward Network, and with deep models such as BERT and a Transformer encoder stack. The objective of this thesis is not to achieve state-of-the-art (SOTA) performance, but rather to evaluate how dependency parse tree based linguistic features perform under different encoding schemas, including deep transformer-based models, on the relation extraction task. The results of the experiments show that, on certain data sets, these features are competitive with complex language models such as BERT while being less computationally demanding, and that incorporating them externally into BERT improves performance rather than confounding it.
Divisions: | Concordia University > Gina Cody School of Engineering and Computer Science > Computer Science and Software Engineering |
---|---|
Item Type: | Thesis (Masters) |
Authors: | Lathiff, Fathima Nihatha Abdul |
Institution: | Concordia University |
Degree Name: | M. Comp. Sc. |
Program: | Computer Science |
Date: | June 2022 |
Thesis Supervisor(s): | Bergler, Sabine |
ID Code: | 990694 |
Deposited By: | Fathima Lathiff |
Deposited On: | 27 Oct 2022 14:13 |
Last Modified: | 27 Oct 2022 14:13 |