Quach, Sophia (2021) Studying the Use of SZZ with Non-functional bugs. Masters thesis, Concordia University.
Text (application/pdf): Quach_MASc_S2021.pdf (992kB) - Accepted Version
Abstract
Non-functional bugs impose a heavy cost on both software developers and end-users. Tools that reduce the occurrence, impact, and repair time of non-functional bugs can therefore provide key assistance for software developers racing to fix these issues.
Classification models that focus on identifying defect-prone commits, referred to as Just-In-Time (JIT) Quality Assurance, are known to be useful in allowing developers to review risky commits. JIT models, however, rely on the SZZ approach to identify whether or not a past change is bug-inducing.
Due to the nature of non-functional bugs, their fixes may be scattered and separate from their bug-inducing locations in the source code. Yet prior studies that leverage or evaluate the SZZ approach do not consider non-functional bugs, which may bias their results.
In this thesis, we conduct an empirical study of the results of the SZZ approach on the non-functional bugs in the NFBugs dataset and on the performance bugs in Cassandra and Hadoop. We manually examine whether each identified bug-inducing change is indeed the correct bug-inducing change. Our manual study shows that a large portion of non-functional bugs cannot be properly identified by the SZZ approach. We uncover root causes of false detection that have not been previously found. We also evaluate the identified bug-inducing changes based on criteria from prior research. Our results may assist future research on non-functional bugs, and they highlight the need to complement SZZ to accommodate the unique characteristics of non-functional bugs.
Furthermore, we conduct an empirical study to evaluate the performance of JIT models by using them to identify bug-inducing code commits for performance-related bugs.
Our findings show that JIT defect prediction classifies non-performance bug-inducing commits better than performance bug-inducing commits. However, we find that manually correcting errors in the training data only slightly improves the models. In the absence of a large number of correctly labelled performance bug-inducing commits, our findings show that combining all available training data yields the best classification results.
Divisions: | Concordia University > Gina Cody School of Engineering and Computer Science > Computer Science and Software Engineering |
---|---|
Item Type: | Thesis (Masters) |
Authors: | Quach, Sophia |
Institution: | Concordia University |
Degree Name: | M.A. Sc. |
Program: | Software Engineering |
Date: | 24 April 2021 |
Thesis Supervisor(s): | Shang, Weiyi |
ID Code: | 988408 |
Deposited By: | Sophia Quach |
Deposited On: | 29 Nov 2021 17:04 |
Last Modified: | 29 Nov 2021 17:04 |