
Using text classification to automate ambiguity detection in SRS documents



Hussain, H. M. Ishrar (2007) Using text classification to automate ambiguity detection in SRS documents. Masters thesis, Concordia University.

Text (application/pdf)
MR34442.pdf - Accepted Version


The Software Requirements Specification (SRS) is one of the most important artifacts produced during the software development lifecycle. In practice, requirements specifications are initially written in natural language, which exposes them to different forms of ambiguity. If these ambiguities are not detected during requirements validation, they may eventually contribute to critical failures in the subsequent phases of the system's development. The objective of this work is to study the possible automation of ambiguity detection in SRS documents by means of a text classification system. The work is part of a larger project aimed at applying Natural Language Processing (NLP) techniques to assess the quality of SRS documents. In the absence of a standard annotated corpus, we collected SRS samples and carried out a corpus annotation process to build corpora of our own, one at the sentence level and the other at the discourse level. The annotators were trained with an annotation guideline, which was written based on the quality model that we had developed. The process showed a substantial level of inter-annotator agreement, indicating the possibility of automating the task with a tool that can accurately emulate the human annotation process. The resultant corpus was then used for training and testing our text classification system. We set the scope of this thesis to detecting ambiguity at the level of surface understanding only, since the indicators of its possible existence in text can be realistically extracted by currently available NLP tools. We developed two different decision-tree-based text classification systems, one working at the sentence level and the other at the discourse level, and conducted a series of experiments training and testing the classifiers with different sets of features. Finally, merging the two classifiers yielded the best results, with an accuracy of 86.67% in detecting ambiguity at the level of surface understanding.
To our knowledge, no previous work in the field of Requirements Engineering (RE) has tested the applicability or performance of a text classification system for automating the detection of textual ambiguities. Our work thus provides significant evidence on the prospect and feasibility of using a text classifier to automate ambiguity detection in SRS documents.

Keywords: Software Requirements Specification, Text Classification, Natural Language Processing
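The abstract reports a substantial level of inter-annotator agreement but does not state which statistic was used. Purely as an illustration, assuming two annotators and Cohen's kappa (an assumption, not something the abstract confirms), agreement over sentence-level labels could be computed along the lines of the minimal Python sketch below. The function name `cohen_kappa`, the label strings, and the toy data are hypothetical and are not taken from the thesis.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items.

    Hypothetical helper for illustration; the thesis may have used a
    different agreement measure.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")

    n = len(labels_a)

    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: probability that both annotators pick the same
    # label if each labels at random according to their own label frequencies.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    categories = count_a.keys() | count_b.keys()
    expected = sum(count_a[c] * count_b[c] for c in categories) / (n * n)

    if expected == 1.0:
        # Both annotators used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Toy data (invented): four sentences labelled by two annotators.
annotator_1 = ["ambiguous", "ambiguous", "unambiguous", "unambiguous"]
annotator_2 = ["ambiguous", "ambiguous", "unambiguous", "ambiguous"]

print(cohen_kappa(annotator_1, annotator_2))  # 0.5
```

On the commonly cited Landis and Koch scale, kappa values between 0.61 and 0.80 are described as "substantial" agreement; the toy data above falls below that range and is meant only to show the arithmetic.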

Divisions: Concordia University > Gina Cody School of Engineering and Computer Science > Computer Science and Software Engineering
Item Type: Thesis (Masters)
Authors: Hussain, H. M. Ishrar
Pagination: xv, 130 leaves : ill. ; 29 cm.
Institution: Concordia University
Degree Name: M. Comp. Sc.
Program: Computer Science and Software Engineering
Thesis Supervisor(s): Ormandjieva, O
Identification Number: LE 3 C66C67M 2007 H87
ID Code: 975462
Deposited By: Concordia University Library
Deposited On: 22 Jan 2013 16:08
Last Modified: 13 Jul 2020 20:07
All items in Spectrum are protected by copyright, with all rights reserved. The use of items is governed by Spectrum's terms of access.
