A Weak Supervision-based Approach to Improve Chatbots for Code Repositories


Farhour, Farbod ORCID: https://orcid.org/0000-0003-1637-9898 (2022) A Weak Supervision-based Approach to Improve Chatbots for Code Repositories. Masters thesis, Concordia University.

Text (application/pdf)
Farhour_MSc_S2022.pdf - Accepted Version
Restricted to Repository staff only until 1 June 2024.
Available under License Spectrum Terms of Access.


Software chatbots are growing in popularity and are increasingly used in software projects because they save time, cost, and effort. Users communicate with chatbots in natural language to perform various tasks (e.g., monitoring and controlling services). The Natural Language Understanding (NLU) component is vital for chatbots, as it enables them to understand users' queries. An NLU needs to be trained on the various ways a user may formulate a query (typically different paraphrases of the same intent). However, when implementing a chatbot with an NLU, practitioners face a challenge in training it because labeled training data is scarce or unavailable. Typically, such labeling is done manually and is prohibitively expensive.

In this thesis, we propose a weak supervision-based approach to automate the query annotation and chatbot retraining process. Specifically, we leverage weak supervision to label users' queries posted to a software repository-based chatbot. To evaluate the proposed approach, we perform a case study that assesses its impact on the NLU's performance. Our evaluation uses a software repository-based chatbot dataset that contains 749 queries covering 52 intents. The results show that our approach yields an average increase of 17.16% in the NLU's performance in terms of F1-score. We also find that our approach labels, on average, 99% of users' queries correctly. Finally, our results show that applying more labeling functions improves the NLU's performance in classifying users' queries. Our work helps software engineering (SE) practitioners improve their chatbot's performance while requiring minimal training effort, by automating the labeling of users' queries.
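
The abstract does not describe the labeling functions or how their outputs are combined, so the following is only a minimal sketch of the general weak supervision idea. It assumes keyword-based labeling functions and a simple majority vote as the aggregation step; the intent names, keyword rules, and function names are all hypothetical and are not taken from the thesis.

```python
from collections import Counter
from typing import Callable, List, Optional

ABSTAIN = None  # a labeling function returns this when it has no opinion


# Hypothetical labeling functions: each one inspects a query and either
# votes for one intent or abstains. Intents and keywords are illustrative.
def lf_commit_count(query: str) -> Optional[str]:
    q = query.lower()
    return "count_commits" if "how many" in q and "commit" in q else ABSTAIN


def lf_last_commit(query: str) -> Optional[str]:
    q = query.lower()
    has_recency = "last" in q or "latest" in q
    return "latest_commit" if has_recency and "commit" in q else ABSTAIN


def lf_bug_fixer(query: str) -> Optional[str]:
    q = query.lower()
    return "bug_fixer" if "who" in q and "fix" in q else ABSTAIN


LABELING_FUNCTIONS: List[Callable[[str], Optional[str]]] = [
    lf_commit_count,
    lf_last_commit,
    lf_bug_fixer,
]


def weak_label(query: str, lfs=LABELING_FUNCTIONS) -> Optional[str]:
    """Combine labeling-function votes by majority (an assumed strategy)."""
    votes = [label for lf in lfs if (label := lf(query)) is not ABSTAIN]
    if not votes:
        return ABSTAIN  # no function fired: leave the query unlabeled
    ranked = Counter(votes).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return ABSTAIN  # tie between intents: too ambiguous to label
    return ranked[0][0]


queries = [
    "How many commits are in the repository?",
    "What is the latest commit?",
    "Hello there",
]
# Only queries that receive a label would be added to the NLU training set.
labeled = [(q, weak_label(q)) for q in queries if weak_label(q) is not ABSTAIN]
```

In this sketch, queries that no function matches, or on which functions disagree evenly, stay unlabeled rather than being guessed, which trades coverage for label quality.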

Divisions: Concordia University > Gina Cody School of Engineering and Computer Science > Computer Science and Software Engineering
Item Type: Thesis (Masters)
Authors: Farhour, Farbod
Institution: Concordia University
Degree Name: M. Sc.
Program: Computer Science
Date: 11 May 2022
Thesis Supervisor(s): Shihab, Emad and Mansour, Essam
Keywords: Software Chatbots, Weak Supervision, Natural Language Understanding Platforms, Empirical Software Engineering.
ID Code: 990601
Deposited By: Farbod Farhour
Deposited On: 27 Oct 2022 14:49
Last Modified: 27 Oct 2022 14:49
All items in Spectrum are protected by copyright, with all rights reserved. The use of items is governed by Spectrum's terms of access.
