Linguistic Approaches for Early Measurement of Functional Size from Software Requirements

Hussain, H M Ishrar (2014) Linguistic Approaches for Early Measurement of Functional Size from Software Requirements. PhD thesis, Concordia University.

Full text: Hussain_PhD_F2014.pdf (Doctoral Thesis, application/pdf, 7MB), Accepted Version.
Available under License Spectrum Terms of Access.

Abstract

The importance of early effort estimation, resource allocation and overall quality control in a software project has led the industry to formulate several functional size measurement (FSM) methods that are based on the knowledge gathered from software requirements documents. The main objective of this research is to develop a comprehensive methodology to facilitate and automate early measurement of a software system's functional size from its requirements document written in unrestricted natural language. For the purpose of this research, we have chosen to use the FSM method developed by the Common Software Measurement International Consortium (COSMIC) and adopted as an international standard by the International Organization for Standardization (ISO). This thesis presents a methodology to measure the COSMIC size objectively from various textual forms of functional requirements, and it builds conceptual measurement models to establish traceability links between the output measurements and the input requirements. Our research investigates the feasibility of automating every major phase of this methodology with natural language processing and machine learning approaches. The thesis provides a step-by-step validation and demonstration of the implementation of this methodology. It describes the details of empirical experiments conducted to validate the methodology with practical samples of textual requirements collected from both industry and academia. Analysis of the results shows that each phase of our methodology can successfully be automated and, in most cases, leads to an accurate measurement of functional size.

Divisions:	Concordia University > Gina Cody School of Engineering and Computer Science > Computer Science and Software Engineering
Item Type:	Thesis (PhD)
Authors:	Hussain, H M Ishrar
Institution:	Concordia University
Degree Name:	Ph. D.
Program:	Computer Science
Date:	27 August 2014
Thesis Supervisor(s):	Ormandjieva, Olga and Kosseim, Leila
Keywords:	Functional Size Measurement, Software Requirements Specification, Effort Estimation, Natural Language Processing, Text Mining
ID Code:	978960
Deposited By:	H M ISHRAR HUSSAIN
Deposited On:	20 Nov 2014 19:26
Last Modified:	18 Jan 2018 17:48

