Fitzgibbons, Megan (ORCID: https://orcid.org/0000-0003-0409-6321), Berrizbeitia, Francisco (ORCID: https://orcid.org/0000-0002-1542-8435), Chalifour, Joshua (ORCID: https://orcid.org/0000-0001-7663-0509), Stouhi, Yara, Charbonneau, Olivier and Majerczyk, Aviva (2026) Deploying and Evaluating a Conversational Agent Using LLMs for Academic Library Reference. Reference Services Review. ISSN 2054-1716
Text (Author Accepted Manuscript) (application/pdf), 1MB: ConvAI-Library-Reference_ReferenceServicesReview_AAM_2026.pdf - Accepted Version. Available under License Creative Commons Attribution Non-commercial.
Official URL: https://doi.org/10.1108/RSR-05-2025-0030
Abstract
This study has two aims. First, we sought to implement a RAG-based GenAI system capable of answering reference questions. Second, we aimed to develop an evaluation protocol for assessing the chatbot by comparing implementations that use three different LLMs. An evaluation rubric was piloted to gauge its viability as an assessment tool.
The RAG-based chatbot uses a two-step approach. First, in response to a query, the system retrieves relevant documents from a knowledge base. Documents are vectorized and matched to the query by relevance. Second, the retrieved data is combined with an LLM's generative capabilities to produce a context-aware response.
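The two-step approach can be illustrated with a minimal sketch. It is not the authors' implementation: the bag-of-words vectors stand in for whatever embedding model the system actually uses, the knowledge-base sentences are invented for illustration, and the names `vectorize`, `retrieve`, and `build_prompt` are hypothetical. The call to an LLM is left out; the sketch stops at the prompt that would be passed to it.

```python
import math
import re
from collections import Counter

def vectorize(text):
    """Toy bag-of-words vector (assumption: a real system would use an embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, documents, k=2):
    """Step 1: rank knowledge-base documents by similarity to the query, keep the top k."""
    q = vectorize(query)
    ranked = sorted(documents, key=lambda d: cosine(q, vectorize(d)), reverse=True)
    return ranked[:k]

def build_prompt(query, passages):
    """Step 2: combine the retrieved passages with the question for the LLM."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below. "
        "If the context does not contain the answer, say so.\n"
        f"Context:\n{context}\n"
        f"Question: {query}"
    )

# Hypothetical knowledge base, invented for this example.
KNOWLEDGE_BASE = [
    "Course reserves can be borrowed for three hours at the loans desk.",
    "Group study rooms can be booked online up to two weeks in advance.",
    "Interlibrary loan requests are submitted through the online form.",
]

query = "How do I book a group study room?"
passages = retrieve(query, KNOWLEDGE_BASE, k=1)
prompt = build_prompt(query, passages)
print(prompt)
```

Instructing the model to answer only from the supplied context is one common way to keep responses restricted to the knowledge base; how the study's system enforces this is not stated in the abstract.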
Fourteen common questions representing different areas of the knowledge base were tested with the chatbot versions. The research team developed and then used an evaluation rubric to score the chatbots’ responses on five dimensions: accuracy, groundedness, elicitation, completeness, and further assistance. The rubric itself was also evaluated by calculating the standard deviation among reviewers’ scores.
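The agreement check described above can be sketched as follows. The scores, the number of reviewers, and the rating scale are all hypothetical, since the abstract does not report them, and the choice of the sample standard deviation (`statistics.stdev`) rather than the population form is likewise an assumption.

```python
from statistics import mean, stdev

# Hypothetical ratings for one chatbot response: three reviewers per dimension.
# The scale and the values are invented for illustration only.
scores = {
    "accuracy": [4, 4, 4],
    "groundedness": [4, 3, 4],
    "elicitation": [1, 3, 4],
    "completeness": [3, 3, 4],
    "further assistance": [2, 2, 3],
}

def reviewer_spread(scores_by_dimension):
    """Return (mean, sample standard deviation) of reviewer scores per dimension."""
    return {
        dim: (mean(vals), stdev(vals))
        for dim, vals in scores_by_dimension.items()
    }

spread = reviewer_spread(scores)

# The dimension with the largest standard deviation is where reviewers disagree most.
most_disputed = max(spread, key=lambda dim: spread[dim][1])

for dim, (avg, sd) in spread.items():
    print(f"{dim}: mean={avg:.2f}, sd={sd:.2f}")
print("Greatest variation:", most_disputed)
```

A low standard deviation on a dimension suggests the rubric wording leads reviewers to similar judgments; a high one flags a criterion that may need clearer definitions.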
The RAG implementations were largely successful in restricting the chatbot’s responses to the knowledge base. The evaluation rubric was effective for assessing the models, highlighting the strengths and weaknesses of each. Although the evaluation was subjective, the evaluators gave similar scores, with the greatest variation in the elicitation dimension.
This study offers a technical description of a practical way to implement a RAG-based chatbot in a library setting, as well as a multidimensional protocol for evaluating such chatbots, which has not been discussed in previous literature.
| Divisions: | Concordia University > Library |
|---|---|
| Item Type: | Article |
| Refereed: | Yes |
| Authors: | Fitzgibbons, Megan and Berrizbeitia, Francisco and Chalifour, Joshua and Stouhi, Yara and Charbonneau, Olivier and Majerczyk, Aviva |
| Journal or Publication: | Reference Services Review |
| Date: | 22 January 2026 |
| Funders: | |
| Digital Object Identifier (DOI): | 10.1108/RSR-05-2025-0030 |
| Keywords: | reference services, artificial intelligence, evaluation protocol, technological change, service delivery, generative AI, chatbot |
| ID Code: | 996676 |
| Deposited By: | Joshua Chalifour |
| Deposited On: | 23 Jan 2026 21:41 |
| Last Modified: | 23 Jan 2026 21:41 |
| Related URLs: | |