Syntactic Sentence Compression for Text Summarization

Perera, Paththamestrige (2013) Syntactic Sentence Compression for Text Summarization. Masters thesis, Concordia University.

Text (application/pdf)
Paththamestrige_MSc_F2013.pdf - Accepted Version
Available under License Spectrum Terms of Access.
741kB

Abstract

Automatic text summarization is a dynamic area in Natural Language Processing that has gained much attention in the past few decades. As a vast amount of data accumulates and becomes available online, providing automatic summaries of specific subjects or topics has become an important user requirement. To encourage the growth of this research area, several shared tasks are held annually and different types of benchmarks are made available. Early work on automatic text summarization focused on improving the relevance of the summary content, but the trend is now towards generating more abstractive and coherent summaries. As a result, sentence simplification has become a prominent requirement in automatic summarization. This thesis presents our work on sentence compression using syntactic pruning methods in order to improve automatic text summarization. Sentence compression has several applications in Natural Language Processing, such as text simplification, topic and subtitle generation, removal of redundant information, and text summarization. Effective sentence compression techniques can contribute to text summarization by simplifying texts, avoiding redundant and irrelevant information, and allowing more space for useful information. In our work, we have focused on pruning individual sentences using their phrase structure grammar representations. We have implemented several types of pruning techniques and evaluated the results in the context of automatic summarization, using standard evaluation metrics. In addition, we have performed a series of human evaluations and a comparison with other sentence compression techniques used in automatic summarization.

Our results show that our syntactic pruning techniques achieve compression rates similar to those of previous work and to what humans achieve. However, the automatic evaluation using ROUGE shows that any type of sentence compression causes a decrease in content compared to the original summary, and that adding extra content does not yield a significant improvement in ROUGE. The human evaluation shows that our syntactic pruning techniques remove syntactic structures that are similar to what humans remove, and inter-annotator content evaluation using ROUGE shows that our techniques perform well compared to other baseline techniques. However, when we evaluate our techniques with a grammar structure based F-measure, the results show that our pruning techniques perform better and seem to approximate human techniques better than baseline techniques.
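As a rough illustration of what pruning a sentence through its phrase structure representation can look like, the sketch below removes whole subtrees by phrase label and reports the resulting compression rate. It is not the thesis's implementation: the nested-tuple tree encoding, the helper names (`leaves`, `prune`, `compression_rate`), the choice of pruned labels (`PP`, `ADVP`), the example sentence, and the definition of compression rate as compressed length divided by original length are all assumptions made for the example.

```python
from typing import List, Tuple, Union

# Assumed encoding: a tree is either a leaf word (str)
# or a tuple (label, child, child, ...).
Tree = Union[str, Tuple]


def leaves(tree: Tree) -> List[str]:
    """Return the words of a tree in left-to-right order."""
    if isinstance(tree, str):
        return [tree]
    words: List[str] = []
    for child in tree[1:]:
        words.extend(leaves(child))
    return words


def prune(tree: Tree, labels: set) -> Tree:
    """Return a copy of the tree without any subtree whose label is in `labels`."""
    if isinstance(tree, str):
        return tree
    kept = []
    for child in tree[1:]:
        # Drop the whole constituent if its label is marked for pruning.
        if not isinstance(child, str) and child[0] in labels:
            continue
        kept.append(prune(child, labels))
    return (tree[0], *kept)


def compression_rate(original: Tree, compressed: Tree) -> float:
    """Assumed definition: compressed length / original length, in words."""
    return len(leaves(compressed)) / len(leaves(original))


# Hypothetical example sentence with a hand-written parse.
sentence = (
    "S",
    ("NP", ("DT", "The"), ("NN", "committee")),
    ("VP",
     ("ADVP", ("RB", "finally")),
     ("VBD", "approved"),
     ("NP", ("DT", "the"), ("NN", "budget")),
     ("PP", ("IN", "after"), ("NP", ("JJ", "long"), ("NNS", "debates")))),
)

# Hypothetical pruning rule: drop prepositional and adverbial phrases.
compressed = prune(sentence, {"PP", "ADVP"})
print(" ".join(leaves(compressed)))
print(round(compression_rate(sentence, compressed), 3))
```

Pruning by label alone is the simplest possible rule; it ignores whether the removed constituent carries content that the summary needs, which is the kind of loss the ROUGE results above point to.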

Divisions: Concordia University > Gina Cody School of Engineering and Computer Science > Computer Science and Software Engineering
Item Type: Thesis (Masters)
Authors: Perera, Paththamestrige
Institution: Concordia University
Degree Name: M. Comp. Sc.
Program: Computer Science
Date: September 2013
Thesis Supervisor(s): Kosseim, Leila
ID Code: 977725
Deposited By: PRASAD PERERA
Deposited On: 26 Nov 2013 15:35
Last Modified: 18 Jan 2018 17:45
All items in Spectrum are protected by copyright, with all rights reserved. The use of items is governed by Spectrum's terms of access.
