Using text classification to automate ambiguity detection in SRS documents

Hussain, H. M. Ishrar (2007) Using text classification to automate ambiguity detection in SRS documents. Masters thesis, Concordia University.

Text (application/pdf)
MR34442.pdf - Accepted Version

5MB

Abstract

The Software Requirements Specification (SRS) is one of the most important artifacts produced during the software development lifecycle. In practice, requirements specifications are initially written in natural language, which leaves them open to different forms of ambiguity that, if not detected at the time of requirements validation, may eventually contribute to critical failures in subsequent phases of the system's development. The objective of this work is to study the possible automation of ambiguity detection in SRS documents by means of a text classification system. The work is part of a larger project aimed at applying Natural Language Processing (NLP) techniques to assess the quality of SRS documents. In the absence of a standard annotated corpus, we collected SRS samples and carried out a corpus annotation process to build corpora of our own, one at the sentence level and the other at the discourse level. The annotators were trained with an annotation guideline, which was written based on the quality model that we had developed. The process showed a substantial level of inter-annotator agreement, indicating the possibility of automating the task with a tool that can accurately emulate the human annotation process. The resultant corpus was then used for training and testing our text classification system. We set the scope of this thesis to detecting ambiguity at the level of surface understanding only, since the indicators of its possible existence in text can be realistically extracted by currently available NLP tools. We developed two different decision-tree-based text classification systems, one working at the sentence level and the other at the discourse level, and conducted a series of experiments training and testing the classifiers with different sets of features. Finally, merging the two classifiers yielded the best results, with an accuracy of 86.67% in detecting ambiguity at the level of surface understanding.
To our knowledge, no previous work in the field of Requirements Engineering (RE) has tested the applicability or performance of a text classification system for automating the detection of textual ambiguities. Our work thus provides significant evidence on the prospect and feasibility of using a text classifier to automate ambiguity detection in SRS documents.

Keywords: Software Requirements Specification, Text Classification, Natural Language Processing
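
The abstract reports a substantial level of inter-annotator agreement but does not name the statistic used. As a rough illustration only, the sketch below computes Cohen's kappa, one common agreement measure for two annotators, which is assumed here and not confirmed by the abstract. The function name `cohen_kappa`, the label strings, and the ten sample annotations are all hypothetical and do not come from the thesis corpus.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labelled the same items.

    Assumption: the thesis may have used a different agreement statistic;
    kappa is shown only as a common choice.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(labels_a)

    # Observed agreement: fraction of items given the same label by both.
    p_observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: for each label, the product of the two annotators'
    # marginal proportions, summed over all labels.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    all_labels = count_a.keys() | count_b.keys()
    p_expected = sum(count_a[c] * count_b[c] for c in all_labels) / (n * n)

    if p_expected == 1.0:
        # Both annotators used one identical label throughout.
        return 1.0
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical sentence-level annotations (made up for illustration).
annotator_1 = ["ambiguous", "unambiguous", "ambiguous", "unambiguous", "ambiguous",
               "unambiguous", "unambiguous", "ambiguous", "unambiguous", "unambiguous"]
annotator_2 = ["ambiguous", "unambiguous", "ambiguous", "unambiguous", "unambiguous",
               "unambiguous", "unambiguous", "ambiguous", "unambiguous", "ambiguous"]

kappa = cohen_kappa(annotator_1, annotator_2)
print(f"Cohen's kappa: {kappa:.3f}")
```

Kappa corrects raw percentage agreement for the agreement expected by chance, which matters when one class (for example, unambiguous sentences) dominates the sample.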

Divisions:	Concordia University > Gina Cody School of Engineering and Computer Science > Computer Science and Software Engineering
Item Type:	Thesis (Masters)
Authors:	Hussain, H. M. Ishrar
Pagination:	xv, 130 leaves : ill. ; 29 cm.
Institution:	Concordia University
Degree Name:	M. Comp. Sc.
Program:	Computer Science and Software Engineering
Date:	2007
Thesis Supervisor(s):	Ormandjieva, O
Identification Number:	LE 3 C66C67M 2007 H87
ID Code:	975462
Deposited By:	lib-batchimporter
Deposited On:	22 Jan 2013 16:08
Last Modified:	13 Jul 2020 20:07
Related URLs:	https://concordiauniversity.on.worldcat....

Spectrum Research Repository
