
A Personal Research Agent for Semantic Knowledge Management of Scientific Literature



Sateli, Bahar ORCID: https://orcid.org/0000-0001-6863-2037 (2018) A Personal Research Agent for Semantic Knowledge Management of Scientific Literature. PhD thesis, Concordia University.

Text (application/pdf)
Sateli_PhD_S2018.pdf - Accepted Version
Available under License Spectrum Terms of Access.


The unprecedented rate of scientific publications is a major threat to the productivity of knowledge workers, who rely on scrutinizing the latest scientific discoveries for their daily tasks. Online digital libraries, academic publishing databases and open access repositories grant access to a plethora of information that can overwhelm a researcher who is looking to obtain fine-grained knowledge relevant to her task at hand. This overload of information has encouraged researchers from various disciplines to look for new approaches to extracting, organizing, and managing knowledge from the immense amount of available literature in ever-growing repositories.

In this dissertation, we introduce a Personal Research Agent that can help scientists in discovering, reading and learning from scientific documents, primarily in the computer science domain. We demonstrate how a confluence of techniques from the Natural Language Processing and Semantic Web domains can construct a semantically rich knowledge base, based on an interconnected graph of scholarly artifacts, effectively transforming scientific literature from isolated written content into a queryable web of knowledge suitable for machine interpretation.

The challenges of creating an intelligent research agent are manifold. The agent's knowledge base, analogous to its 'brain', must contain accurate information about the knowledge 'stored' in documents. The agent also needs to know about its end users' tasks and background knowledge. In our work, we present a methodology to extract the rhetorical structure (e.g., claims and contributions) of scholarly documents. We enhance our approach with entity linking techniques that allow us to connect the documents with the Linked Open Data (LOD) cloud, in order to enrich them with additional information from the web of open data. Furthermore, we devise a novel approach for automatic profiling of scholarly users, thereby enabling the agent to personalize its services based on a user's background knowledge and interests. We demonstrate how we can automatically create a semantic vector-based representation of the documents and user profiles and utilize them to efficiently detect similar entities in the knowledge base. Finally, as part of our contributions, we present a complete architecture providing an end-to-end workflow for the agent to exploit the opportunities of linking a formal model of scholarly users and scientific publications.
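The idea of comparing vector-based representations of documents and user profiles can be illustrated with a minimal sketch. It is not the implementation from the thesis: it assumes that each document and each user profile has already been reduced to a list of linked entity identifiers, that the vectors are plain entity counts, and that similarity is measured with the cosine. The function names and the `dbpedia:` identifiers are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def entity_vector(entities):
    """Build a sparse count vector from a list of entity identifiers."""
    return Counter(entities)


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors."""
    # Only entities present in both vectors contribute to the dot product.
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_documents(profile, documents):
    """Return (doc_id, score) pairs sorted by similarity to the profile."""
    scored = [
        (doc_id, cosine_similarity(profile, vector))
        for doc_id, vector in documents.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical entity identifiers, standing in for linked LOD resources.
profile = entity_vector([
    "dbpedia:Semantic_Web",
    "dbpedia:Semantic_Web",
    "dbpedia:Natural_language_processing",
])
documents = {
    "doc_a": entity_vector([
        "dbpedia:Semantic_Web",
        "dbpedia:Natural_language_processing",
    ]),
    "doc_b": entity_vector(["dbpedia:Compiler"]),
}

ranking = rank_documents(profile, documents)
```

In this toy data, `doc_a` shares entities with the profile and `doc_b` shares none, so `doc_a` is ranked first. A real system would likely weight entities (for example by how distinctive they are) rather than use raw counts.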

Divisions: Concordia University > Gina Cody School of Engineering and Computer Science > Computer Science and Software Engineering
Item Type: Thesis (PhD)
Authors: Sateli, Bahar
Institution: Concordia University
Degree Name: Ph. D.
Program: Computer Science
Date: 9 February 2018
Thesis Supervisor(s): Witte, René
Keywords: Semantic Publishing, Semantic Web, Natural Language Processing, Knowledge Base, Scientific Literature, Text Mining, Artificial Intelligence, Personal Research Agent
ID Code: 983757
Deposited By: BAHAR SATELI
Deposited On: 05 Jun 2018 14:19
Last Modified: 05 Jun 2018 14:19
All items in Spectrum are protected by copyright, with all rights reserved. The use of items is governed by Spectrum's terms of access.
