An Empirical Evaluation Of Attention And Pointer Networks For Paraphrase Generation

Gupta, Varun (2019) An Empirical Evaluation Of Attention And Pointer Networks For Paraphrase Generation. Masters thesis, Concordia University.

Text (application/pdf)
Gupta_MCompSc_F2019.pdf - Accepted Version
Available under License Spectrum Terms of Access.
4MB

Abstract

In computer vision, a common practice for augmenting an image dataset is to
create new images using geometric transformations that preserve similarity.
This data augmentation was one of the most significant factors in winning the ImageNet
competition in 2012 with large neural networks. Similarly, in speech recognition,
comparable results have been obtained by adding noise to the signal, slowing it down
or speeding it up, and modifying the spectrogram.
Unlike in computer vision and speech, few techniques have been
explored for augmenting data in natural language processing (NLP). The only technique
explored for text data is lexical substitution, which focuses solely on replacing
words with synonyms.
In this thesis, we investigate the use of different pointer networks with
sequence-to-sequence models, which have shown excellent results in neural machine translation
(NMT) and text simplification tasks, for generating similar sentences from
the paraphrase dataset (PPDB). These paraphrases are evaluated
by augmenting the training set of the IMDb movie
review dataset and comparing the resulting performance with that of the baseline model. We show
how these paraphrases can affect downstream tasks. Furthermore, we train different
classifiers to create a stable baseline for evaluation on the IMDb movie dataset. To the
best of our knowledge, this is the first study on generating paraphrases using these models
with the help of the PPDB dataset and evaluating these paraphrases in a downstream
task.

Divisions: Concordia University > Faculty of Arts and Science
Item Type: Thesis (Masters)
Authors: Gupta, Varun
Institution: Concordia University
Degree Name: M. Comp. Sc.
Program: Computer Science
Date: 27 June 2019
Thesis Supervisor(s): Krzyzak, Adam
ID Code: 985554
Deposited By: Varun Gupta
Deposited On: 06 Feb 2020 02:43
Last Modified: 06 Feb 2020 02:43
All items in Spectrum are protected by copyright, with all rights reserved. The use of items is governed by Spectrum's terms of access.
