
A Tight Coupling Context-Based Framework for Dataset Discovery



Alsaig, Alaa (2023) A Tight Coupling Context-Based Framework for Dataset Discovery. PhD thesis, Concordia University.

Text (application/pdf)
Alsaig_25194601_PhD_S2023.pdf - Updated Version
Available under License Spectrum Terms of Access.


Discovering datasets relevant to research goals is at the core of many analysis tasks that aim to test proposed hypotheses and theories. In particular, researchers in Artificial Intelligence (AI) and Machine Learning (ML), where relevant datasets are essential for precise predictions, have identified how the absence of methods for discovering quality datasets leads to delays and, in many cases, failure of ML projects. Many research reports have highlighted the absence of dataset discovery methods that fill the gap between analysis requirements and available datasets, and have given statistics showing how this hinders the analysis process, with a completion rate of less than 2%. To the best of our knowledge, removing these inadequacies remains “an open problem of great importance”. It is in this context that the thesis contributes a context-based framework that tightly couples dataset providers and data analytics teams. Through this framework, dataset providers publish metadata descriptions of their datasets, and analysts formulate and submit rich queries with goal specifications and quality requirements. The dataset search engine component tightly couples the query specification with the metadata specifications of datasets through formal contextualized semantic matching and quality-based ranking, and discovers all datasets that are relevant to analyst requirements. The thesis gives a proof-of-concept prototype implementation and reports on its performance and efficiency through a case study.
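
The publish, query, match, and rank flow described in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the class names, the key-value representation of context, the quality dimensions, and the mean-based ranking score are all assumptions chosen for illustration, and plain equality on context values stands in for the formal contextualized semantic matching the thesis develops.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetMetadata:
    """Hypothetical metadata record that a dataset provider publishes."""
    name: str
    context: dict   # assumed form: context dimension -> value
    quality: dict   # assumed form: quality dimension -> score in [0, 1]


@dataclass
class Query:
    """Hypothetical analyst query: goal context plus quality requirements."""
    context: dict
    min_quality: dict = field(default_factory=dict)


def matches(meta: DatasetMetadata, query: Query) -> bool:
    # Assumption: a dataset matches when every context dimension in the
    # query has the same value in the metadata, and every required
    # quality dimension meets its threshold.
    context_ok = all(meta.context.get(k) == v for k, v in query.context.items())
    quality_ok = all(
        meta.quality.get(k, 0.0) >= threshold
        for k, threshold in query.min_quality.items()
    )
    return context_ok and quality_ok


def quality_score(meta: DatasetMetadata, query: Query) -> float:
    # Assumption: rank by the mean score over the requested quality dimensions.
    if not query.min_quality:
        return 0.0
    total = sum(meta.quality.get(k, 0.0) for k in query.min_quality)
    return total / len(query.min_quality)


def discover(catalog: list, query: Query) -> list:
    """Return all matching datasets, best quality score first."""
    hits = [m for m in catalog if matches(m, query)]
    return sorted(hits, key=lambda m: quality_score(m, query), reverse=True)


# Made-up catalog entries, for illustration only.
catalog = [
    DatasetMetadata("clinic_a", {"domain": "health", "region": "CA"},
                    {"completeness": 0.9, "accuracy": 0.8}),
    DatasetMetadata("clinic_b", {"domain": "health", "region": "CA"},
                    {"completeness": 0.95, "accuracy": 0.9}),
    DatasetMetadata("retail_x", {"domain": "retail", "region": "CA"},
                    {"completeness": 0.99, "accuracy": 0.99}),
    DatasetMetadata("clinic_c", {"domain": "health", "region": "CA"},
                    {"completeness": 0.5, "accuracy": 0.9}),
]

query = Query(context={"domain": "health", "region": "CA"},
              min_quality={"completeness": 0.8, "accuracy": 0.7})

results = [m.name for m in discover(catalog, query)]
print(results)
```

In this sketch, `retail_x` is excluded because its context differs from the query, and `clinic_c` is excluded because it falls below the completeness threshold; the remaining datasets are ordered by their mean quality score.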

Divisions: Concordia University > Gina Cody School of Engineering and Computer Science > Computer Science and Software Engineering
Item Type: Thesis (PhD)
Authors: Alsaig, Alaa
Institution: Concordia University
Degree Name: Ph. D.
Program: Software Engineering
Date: 15 May 2023
Thesis Supervisor(s): Vangalur, Alagar and Olga, Ormandjieva
ID Code: 992253
Deposited On: 17 Nov 2023 14:57
Last Modified: 17 Nov 2023 14:57
All items in Spectrum are protected by copyright, with all rights reserved. The use of items is governed by Spectrum's terms of access.
