Farhour, Farbod ORCID: https://orcid.org/0000-0003-1637-9898 (2022) A Weak Supervision-based Approach to Improve Chatbots for Code Repositories. Masters thesis, Concordia University.
Text (application/pdf): Farhour_MSc_S2022.pdf (928kB), Accepted Version. Available under License Spectrum Terms of Access.
Abstract
Software chatbots are growing in popularity and are increasingly used in software projects because they save time, cost, and effort. Users communicate with chatbots in natural language to perform various tasks (e.g., monitoring and controlling services). The Natural Language Understanding (NLU) component is vital for chatbots, as it enables them to understand users' queries. NLUs need to be trained on the various ways a user may formulate a query (typically different paraphrases of the same intent). However, when implementing a chatbot using an NLU, chatbot practitioners face a challenge in training the NLU because labeled training data is scarce or unavailable. Typically, such training data is labeled manually, which is prohibitively expensive.
In this thesis, we propose a weak supervision-based approach to automate the query annotation and chatbot retraining process. Specifically, we leverage weak supervision to label users' queries posted to a software repository-based chatbot. To evaluate the proposed approach, we perform a case study that assesses its impact on the NLU's performance. Our evaluation uses a software repository-based chatbot dataset that contains 749 queries across 52 intents. The results show that our approach yields an average increase of 17.16% in the NLU's performance in terms of F1-score. We also find that our approach labels, on average, 99% of users' queries correctly. Finally, our results show that applying more labeling functions improves the NLU's performance in classifying users' queries. Our work helps software engineering (SE) practitioners improve their chatbots' performance with minimal training effort by automating the labeling of users' queries.
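To illustrate the general idea of labeling queries with weak supervision, the following is a minimal sketch and not the thesis's actual implementation. It assumes keyword-based labeling functions combined by a simple majority vote; the function names (`lf_commit_count`, `lf_bug_fixer`, `lf_commit_keyword`, `weak_label`) and the intent names (`count_commits`, `who_fixed_bug`) are hypothetical and chosen only for illustration.

```python
from collections import Counter
from typing import Callable, List, Optional

# A labeling function returns an intent name, or None to abstain.
ABSTAIN = None
LabelingFunction = Callable[[str], Optional[str]]


def lf_commit_count(query: str) -> Optional[str]:
    """Hypothetical rule: questions asking how many commits exist."""
    q = query.lower()
    return "count_commits" if "how many" in q and "commit" in q else ABSTAIN


def lf_bug_fixer(query: str) -> Optional[str]:
    """Hypothetical rule: questions asking who fixed something."""
    q = query.lower()
    return "who_fixed_bug" if "who" in q and "fix" in q else ABSTAIN


def lf_commit_keyword(query: str) -> Optional[str]:
    """Hypothetical rule: an explicit 'number of commits' phrase."""
    return "count_commits" if "number of commits" in query.lower() else ABSTAIN


LFS: List[LabelingFunction] = [lf_commit_count, lf_bug_fixer, lf_commit_keyword]


def weak_label(query: str, lfs: List[LabelingFunction]) -> Optional[str]:
    """Combine labeling-function votes by majority; abstain on no votes or a tie."""
    votes = []
    for lf in lfs:
        vote = lf(query)
        if vote is not ABSTAIN:
            votes.append(vote)
    if not votes:
        return ABSTAIN
    ranked = Counter(votes).most_common()
    # A tie between the two most frequent intents is treated as "no label".
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return ABSTAIN
    return ranked[0][0]
```

Queries that receive a label in this way could then be added to the NLU's training set, while queries on which the functions abstain or disagree would be left unlabeled. Majority voting is only one possible way to combine votes; weak supervision frameworks often use a learned label model instead.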
Divisions: | Concordia University > Gina Cody School of Engineering and Computer Science > Computer Science and Software Engineering |
---|---|
Item Type: | Thesis (Masters) |
Authors: | Farhour, Farbod |
Institution: | Concordia University |
Degree Name: | M. Sc. |
Program: | Computer Science |
Date: | 11 May 2022 |
Thesis Supervisor(s): | Shihab, Emad and Mansour, Essam |
Keywords: | Software Chatbots, Weak Supervision, Natural Language Understanding Platforms, Empirical Software Engineering. |
ID Code: | 990601 |
Deposited By: | Farbod Farhour |
Deposited On: | 27 Oct 2022 14:49 |
Last Modified: | 01 Jun 2024 00:00 |