Hajari, Fahimeh (2022) Balance Expertise, Workload and Turnover into Code Review Recommendation. Masters thesis, Concordia University.
Text (application/pdf)
567kB, Hajari_MC_W2021.pdf - Accepted Version. Available under License Spectrum Terms of Access.
Abstract
Developer turnover is inevitable on software projects and leads to knowledge loss, a reduction
in productivity, and an increase in defects. Mitigation strategies to deal with turnover tend to disrupt and increase workloads for developers. In this work, we suggest that through code review
recommendation, we can distribute knowledge and mitigate turnover while more evenly distributing review workload. We conduct empirical investigations to understand the natural concentration
of review workload and the degree of knowledge spreading that is inherent in code review. Even
though the review workload is highly concentrated, with the top 20% of reviewers doing 80.19%
of reviews, code review naturally spreads knowledge and reduces the files at risk to turnover from
79.79% to 32.59%. To balance the review workload, reduce the Files at Risk to turnover, FaR,
and maintain high levels of Expertise during the review, we evaluate existing code review recommenders and develop novel recommenders. We find that prior work that assigns reviewers based
on file ownership concentrates knowledge on a small group of core developers, increasing the risk
of knowledge loss from turnover by up to 65.19%. Recent work, WhoDo, that considers developer workload, assigns developers that are not sufficiently committed to the project and increases
FaR by 40.97%. We propose learning- and retention-aware review recommenders that, when combined, are effective at reducing the risk of turnover by -29.54%, but they unacceptably reduce the
overall expertise during reviews by -25.30%. Combining recommenders, we develop the SofaWL
recommender that suggests experts with low active review workload when none of the files under
review are hoarded by developers, but distributes knowledge when files are at risk to turnover. In this
way, we can simultaneously increase expertise during review with a ∆Expertise of 3.20%, reduce
workload concentration, ∆GiniWork by -12.00%, and reduce the files at risk, ∆FaR, by -23.92%.
We then focus on the Risky Files that have zero or one knowledgeable developers. We randomly
replace one of the actual reviewers with a suggested developer using TurnoverRec when we have a
risky file in the pull request. In this approach, we can increase Expertise substantially in comparison
to TurnoverRec and reduce FaR by -25.14%.
For the FaR++, we add a learner to the actual reviewers when we have a risky file in the pull
request. We reduce FaR by -83.88% but increase the number of reviews by 13.14%. To reduce
the additional workload in AwareFaR, we only add reviewers when there are abandoned files; this
decreases FaR by -37.51% and only increases the number of reviews by 34.24%.
Our data, results, and scripts are available in our replication package.
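The abstract reports two measures, workload concentration (GiniWork) and files at risk to turnover (FaR), without giving their formulas. The following is a minimal sketch of how such measures could be computed, not the thesis's implementation. It assumes that GiniWork is a standard Gini coefficient over per-reviewer review counts, and that a file is at risk when zero or one of its knowledgeable developers is still active, following the abstract's description of risky files. The function names, the `knowledge` mapping, and the sample data are all hypothetical.

```python
def gini(counts):
    """Gini coefficient of non-negative counts (0 = even, near 1 = concentrated).

    Assumption: review workload is summarized as one review count per reviewer.
    """
    values = sorted(counts)
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        return 0.0
    # Standard closed form on ascending values x_1..x_n:
    # G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n
    weighted = sum(i * x for i, x in enumerate(values, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


def files_at_risk(knowledge, active_developers):
    """Fraction of files known by at most one active developer.

    Assumption: `knowledge` maps each file path to the set of developers
    who have authored or reviewed it (hypothetical data model).
    """
    if not knowledge:
        return 0.0
    risky = sum(
        1 for devs in knowledge.values()
        if len(devs & active_developers) <= 1
    )
    return risky / len(knowledge)


# Hypothetical example data, not taken from the thesis.
reviews_per_reviewer = [0, 0, 0, 10]
knowledge = {
    "a.py": {"alice", "bob"},
    "b.py": {"alice"},
    "c.py": {"carol"},
    "d.py": {"bob", "carol"},
}
active = {"alice", "bob"}  # carol has left the project

print(gini(reviews_per_reviewer))
print(files_at_risk(knowledge, active))
```

Under these assumptions, a recommender that spreads reviews across more developers would lower both values: the review counts become more even, and more files gain a second knowledgeable active developer.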
Divisions: | Concordia University > Gina Cody School of Engineering and Computer Science > Computer Science and Software Engineering |
---|---|
Item Type: | Thesis (Masters) |
Authors: | Hajari, Fahimeh |
Institution: | Concordia University |
Degree Name: | M.A. Sc. |
Program: | Computer Science |
Date: | 7 January 2022 |
Thesis Supervisor(s): | Rigby, Peter C |
ID Code: | 990669 |
Deposited By: | Fahimeh Hajari |
Deposited On: | 27 Oct 2022 14:02 |
Last Modified: | 27 Oct 2022 14:02 |