Deploying and Evaluating a Conversational Agent Using LLMs for Academic Library Reference

Fitzgibbons, Megan ORCID: https://orcid.org/0000-0003-0409-6321, Berrizbeitia, Francisco ORCID: https://orcid.org/0000-0002-1542-8435, Chalifour, Joshua ORCID: https://orcid.org/0000-0001-7663-0509, Stouhi, Yara, Charbonneau, Olivier and Majerczyk, Aviva (2026) Deploying and Evaluating a Conversational Agent Using LLMs for Academic Library Reference. Reference Services Review. ISSN 2054-1716

Text (Author Accepted Manuscript) (application/pdf)
ConvAI-Library-Reference_ReferenceServicesReview_AAM_2026.pdf - Accepted Version
Available under License Creative Commons Attribution Non-commercial.
1MB

Official URL: https://doi.org/10.1108/RSR-05-2025-0030

Abstract

This study had two aims. First, we sought to implement a retrieval-augmented generation (RAG) system, built on generative AI (GenAI), capable of answering reference questions. Second, we aimed to develop a protocol for evaluating the chatbot by comparing implementations that use three different large language models (LLMs). We piloted an evaluation rubric to gauge its viability as an assessment tool.

The RAG-based chatbot uses a two-step approach. First, in response to a query, the system retrieves relevant documents from a knowledge base. Each document is vectorized and matched to the query by relevance. Second, the retrieved data is combined with an LLM's generative capabilities to produce a context-aware response.
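
The two-step approach described above can be illustrated with a minimal sketch. It is not the study's implementation: the knowledge-base entries are invented, the function names (`vectorize`, `cosine`, `retrieve`, `build_prompt`) are hypothetical, and simple word counts with cosine similarity stand in for whatever embedding model and vector store the deployed system uses. The final prompt is what would be handed to an LLM; no model is called here.

```python
import math
import re
from collections import Counter

# Invented knowledge-base entries for illustration only; these are not
# the library's actual policies or the study's documents.
KNOWLEDGE_BASE = [
    "Course reserves can be borrowed at the circulation desk for three hours.",
    "Group study rooms can be booked online up to two weeks in advance.",
    "Interlibrary loan requests are submitted through the online request form.",
]


def vectorize(text):
    """Turn text into a sparse word-count vector (a stand-in for a learned embedding)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, k=2):
    """Step 1: rank documents by similarity to the query and keep the top k."""
    query_vec = vectorize(query)
    ranked = sorted(documents, key=lambda d: cosine(query_vec, vectorize(d)), reverse=True)
    return ranked[:k]


def build_prompt(query, passages):
    """Step 2: combine the retrieved passages with the question for the LLM."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below. "
        "If the context does not contain the answer, say so.\n"
        f"Context:\n{context}\n"
        f"Question: {query}"
    )


if __name__ == "__main__":
    question = "How do I book a group study room?"
    print(build_prompt(question, retrieve(question, KNOWLEDGE_BASE, k=1)))
```

Instructing the model to answer only from the supplied context is one common way to keep responses restricted to the knowledge base; the wording of that instruction here is an assumption.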

Fourteen common questions representing different areas of the knowledge base were tested with the chatbot versions. The research team developed an evaluation rubric and used it to score the chatbots’ responses on five dimensions: accuracy, groundedness, elicitation, completeness, and further assistance. The rubric itself was evaluated by calculating the standard deviation among reviewers’ scores.
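
As a rough illustration of the rubric check described above, the sketch below computes the mean and standard deviation of reviewers' scores for each rubric dimension on a single response. The scores are invented and are not the study's data; the 1 to 5 scale, the reviewer labels, the `dimension_spread` helper, and the choice of the sample standard deviation (rather than the population form) are all assumptions.

```python
from statistics import mean, stdev

# The five rubric dimensions named in the abstract.
DIMENSIONS = [
    "accuracy",
    "groundedness",
    "elicitation",
    "completeness",
    "further_assistance",
]

# Invented scores on an assumed 1-5 scale for one chatbot response,
# from three hypothetical reviewers. Not the study's data.
SCORES = {
    "reviewer_a": {"accuracy": 4, "groundedness": 4, "elicitation": 2,
                   "completeness": 3, "further_assistance": 4},
    "reviewer_b": {"accuracy": 4, "groundedness": 4, "elicitation": 4,
                   "completeness": 3, "further_assistance": 3},
    "reviewer_c": {"accuracy": 4, "groundedness": 3, "elicitation": 1,
                   "completeness": 4, "further_assistance": 4},
}


def dimension_spread(scores):
    """Return the mean and sample standard deviation of scores per dimension."""
    spread = {}
    for dim in DIMENSIONS:
        values = [reviewer[dim] for reviewer in scores.values()]
        spread[dim] = {"mean": mean(values), "stdev": stdev(values)}
    return spread


if __name__ == "__main__":
    for dim, stats in dimension_spread(SCORES).items():
        print(f"{dim:20s} mean={stats['mean']:.2f} stdev={stats['stdev']:.2f}")
```

A low standard deviation on a dimension suggests reviewers interpreted that part of the rubric similarly; a high one flags a dimension whose wording may need refinement.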

The RAG implementations were largely successful in restricting the chatbot’s responses to the knowledge base. The evaluation rubric was effective for assessing the models, highlighting the strengths and weaknesses of each. Although the evaluation was subjective, the evaluators gave similar scores; variation was greatest in the elicitation dimension.

This study offers a technical description of a practical way to implement a RAG-based chatbot in a library setting, together with a protocol, not discussed in previous literature, for evaluating such chatbots along multiple dimensions.

Divisions: Concordia University > Library
Item Type: Article
Refereed: Yes
Authors: Fitzgibbons, Megan and Berrizbeitia, Francisco and Chalifour, Joshua and Stouhi, Yara and Charbonneau, Olivier and Majerczyk, Aviva
Journal or Publication: Reference Services Review
Date: 22 January 2026
Funders:
  • Concordia Applied AI Institute: Collaborations with Industry Grant
  • Concordia Library Research Grant
Digital Object Identifier (DOI): 10.1108/RSR-05-2025-0030
Keywords: reference services, artificial intelligence, evaluation protocol, technological change, service delivery, generative AI, chatbot
ID Code: 996676
Deposited By: Joshua Chalifour
Deposited On: 23 Jan 2026 21:41
Last Modified: 23 Jan 2026 21:41
