Argument Labeling of Discourse Relations using LSTM Neural Networks

Hooda, Sohail (2019) Argument Labeling of Discourse Relations using LSTM Neural Networks. Masters thesis, Concordia University.

Text (application/pdf)
Hooda_MCOMPSc_S2019.pdf - Accepted Version
Available under License Spectrum Terms of Access.
1MB

Abstract

A discourse relation can be described as a linguistic unit composed of sub-units that, when combined, convey more information than the sum of their parts. A discourse relation usually consists of two arguments that relate to each other in a given form. It may also contain an optional sub-unit, the discourse connective, which connects the two arguments and makes the relationship between them more explicit. Such a relation is called an explicit discourse relation. Extracting or labeling the arguments of an explicit discourse relation is a challenging task. In recent years, driven by the CoNLL competitions, feature engineering has allowed various machine learning models to achieve an F-measure of about 55%. However, feature engineering is brittle and hand-crafted, and it requires advanced knowledge of linguistics as well as of the dataset in question. In this thesis, we propose an approach for segmenting (that is, identifying the boundaries of) Arg1 and Arg2 without feature engineering. We introduce a model for argument labeling based on a Bidirectional Long Short-Term Memory (LSTM) network, and we experimented with multiple configurations of this model. On the Penn Discourse Treebank (PDTB) dataset, our best model achieved an F1 measure of 23.05% without any feature engineering. This is significantly higher than the 20.52% achieved by the state-of-the-art Recurrent Neural Network (RNN) approach, but significantly lower than the feature-based state-of-the-art systems. On the other hand, because our approach learns only from the raw dataset, it is more widely applicable to multiple textual genres and languages.
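
To make the task concrete, argument labeling can be viewed as assigning a label to each token and then reading off the contiguous Arg1 and Arg2 spans. The following is a minimal sketch under stated assumptions, not the thesis's implementation: the label names (`Arg1`, `Arg2`, `Conn`, `O`), the example sentence, the helper names, and the exact-match span scoring are all illustrative choices that the abstract does not specify.

```python
from typing import List, Set, Tuple

# (label, start index, end index exclusive); a hypothetical span representation
Span = Tuple[str, int, int]


def spans_from_labels(labels: List[str]) -> Set[Span]:
    """Group runs of identical token labels into spans, skipping 'O' (outside)."""
    spans: Set[Span] = set()
    start = 0
    for i in range(1, len(labels) + 1):
        # Close the current run at the end of the sequence or on a label change.
        if i == len(labels) or labels[i] != labels[start]:
            if labels[start] != "O":
                spans.add((labels[start], start, i))
            start = i
    return spans


def exact_match_f1(gold: Set[Span], pred: Set[Span]) -> float:
    """F1 over spans; a predicted span counts only if label and both boundaries match."""
    true_positives = len(gold & pred)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Illustrative sentence (an assumption, not taken from the PDTB).
tokens = "He stayed home because it was raining".split()
gold_labels = ["Arg1", "Arg1", "Arg1", "Conn", "Arg2", "Arg2", "Arg2"]
# A hypothetical prediction that misses the last token of Arg1.
pred_labels = ["Arg1", "Arg1", "O", "Conn", "Arg2", "Arg2", "Arg2"]

gold_spans = spans_from_labels(gold_labels)
pred_spans = spans_from_labels(pred_labels)
f1 = exact_match_f1(gold_spans, pred_spans)
```

In this sketch a single boundary error makes the whole Arg1 span count as wrong, which illustrates why boundary identification is hard to score well under strict matching. A sequence model such as a bidirectional LSTM would be the component that produces `pred_labels`; it is omitted here.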

Divisions: Concordia University > Gina Cody School of Engineering and Computer Science > Computer Science and Software Engineering
Item Type: Thesis (Masters)
Authors: Hooda, Sohail
Institution: Concordia University
Degree Name: M. Comp. Sc.
Program: Computer Science
Date: 24 January 2019
Thesis Supervisor(s): Kosseim, Leila
ID Code: 985003
Deposited By: SOHAIL HOODA
Deposited On: 20 Dec 2019 16:05
Last Modified: 20 Dec 2019 16:05
All items in Spectrum are protected by copyright, with all rights reserved. The use of items is governed by Spectrum's terms of access.
