Berrizbeitia, Francisco ORCID: https://orcid.org/0000-0002-1542-8435 and Chalifour, Joshua ORCID: https://orcid.org/0000-0001-7663-0509 (2024) Beyond the Hype. Deploying and Evaluating a Conversational Agent Using LLMs in an Academic Setting. In: Access 2024, 21 Oct - 23 Oct, Montreal. (Unpublished)
Slideshow (PDF, 855 kB): beyond-the-hype.pdf - Presentation. Available under License: Spectrum Terms of Access.
Abstract
This presentation will cover our ongoing work investigating and deploying generative AI technology in the context of libraries and memory institutions. Libraries have long offered online chat services, whether staffed by people or machines, but taking advantage of generative AI requires new technical approaches and new considerations around the ethics and usefulness of conversational agents. We will discuss our development of a chatbot configured for delivering academic library information services, including the definition of a protocol for assessing and guiding implementation decisions as well as for evaluating the tool’s utility.
Our initial step in developing the chatbot involved building a knowledge base (stored in an in-house metadata management system) that could be connected to generative AI technology. Next, we experimented with a variety of open source and proprietary language models to understand how each performs. We are testing the following approaches: a closed source large language model (Bing Chat / Gemini / ChatGPT) prompted to act as reference personnel; a context-aware closed source LLM (OpenAI GPT); and a context-aware open source LLM (Llama). We are testing with questions that a useful chatbot should be able to answer, and the responses produced under each approach are evaluated comparatively; a minimal sketch of the context-aware approach appears below.
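The sketch below illustrates what a "context-aware" configuration can look like in general terms: passages are retrieved from a local knowledge base and prepended to the user's question before the model is called. The function names, the naive keyword-overlap retrieval, the system prompt, and the choice of the OpenAI chat-completions client are illustrative assumptions for this example, not the authors' actual implementation.

```python
# Minimal sketch of a context-aware (retrieval-augmented) library chatbot turn.
# Retrieval here is a naive keyword-overlap stand-in for the metadata-backed knowledge base.
from openai import OpenAI

client = OpenAI()  # assumes an API key is configured in the environment


def retrieve_context(question: str, knowledge_base: list[str], k: int = 3) -> list[str]:
    """Return the k knowledge-base passages sharing the most words with the question."""
    q_words = set(question.lower().split())
    scored = sorted(
        knowledge_base,
        key=lambda doc: len(q_words & set(doc.lower().split())),
        reverse=True,
    )
    return scored[:k]


def answer(question: str, knowledge_base: list[str]) -> str:
    """Assemble retrieved context plus the question into a single grounded prompt."""
    context = "\n\n".join(retrieve_context(question, knowledge_base))
    messages = [
        {
            "role": "system",
            "content": (
                "You are a reference assistant for an academic library. "
                "Answer only from the provided context; if the answer is not there, say so."
            ),
        },
        {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
    ]
    resp = client.chat.completions.create(model="gpt-4o-mini", messages=messages)
    return resp.choices[0].message.content
```

The same prompt-assembly pattern applies when the underlying model is an open source LLM such as Llama; only the client call changes.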
A key objective of this project is developing the testing protocol and evaluation framework. Reference questions often require a dynamic conversation that iterates on the direction of inquiry, which makes it difficult to evaluate outputs as simply accurate or inaccurate. Our study builds on Lai (2023) to develop a testing protocol that incorporates multiple dimensions of user interaction. The protocol will also support interrogating ethical concerns around these technologies and their application, and we are operationalizing aspects of the LC Labs AI Planning Framework (Library of Congress, 2023) to define use cases and ethical criteria for generative AI in information services.
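As an illustration of what a multi-dimensional comparative evaluation record might look like, a minimal sketch follows. The dimension names, the 1-5 scale, and the aggregation are hypothetical placeholders for this example; they are not the dimensions defined in our protocol or in Lai (2023).

```python
# Hypothetical scoring record for one chatbot response under one approach.
# Dimensions and scale are illustrative placeholders, not the protocol's actual criteria.
from dataclasses import dataclass
from statistics import mean


@dataclass
class ResponseAssessment:
    question_id: str
    approach: str              # e.g. "prompted closed-source", "context-aware GPT", "context-aware Llama"
    accuracy: int              # 1-5: factual correctness against the knowledge base
    helpfulness: int           # 1-5: does it move the reference conversation forward?
    appropriate_referral: int  # 1-5: refers the user to a librarian when it should
    ethical_flags: list[str]   # free-text notes, e.g. "hallucinated source", "privacy concern"

    def overall(self) -> float:
        """Unweighted mean of the numeric dimensions."""
        return mean([self.accuracy, self.helpfulness, self.appropriate_referral])


# Comparative records for a single test question across two approaches
assessments = [
    ResponseAssessment("Q01", "context-aware GPT", 4, 5, 4, []),
    ResponseAssessment("Q01", "context-aware Llama", 3, 4, 5, ["cited nonexistent guide"]),
]
for a in assessments:
    print(a.approach, round(a.overall(), 2), a.ethical_flags)
```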
| Divisions: | Concordia University > Library |
|---|---|
| Item Type: | Conference or Workshop Item (Lecture) |
| Refereed: | No |
| Authors: | Berrizbeitia, Francisco and Chalifour, Joshua |
| Date: | 22 October 2024 |
| ID Code: | 994745 |
| Deposited By: | Francisco Berrizbeitia |
| Deposited On: | 04 Nov 2024 19:52 |
| Last Modified: | 04 Nov 2024 19:52 |