Alsaig, Alaa (2023) A Tight Coupling Context-Based Framework for Dataset Discovery. PhD thesis, Concordia University.
Text (application/pdf): Alsaig_25194601_PhD_S2023.pdf (1MB) - Updated Version. Available under License Spectrum Terms of Access.
Abstract
Discovering datasets relevant to research goals is at the core of many analysis tasks that aim to prove proposed hypotheses and theories. In particular, researchers
in Artificial Intelligence (AI) and Machine Learning (ML), where relevant
datasets are essential for precise predictions, have identified how the absence of methods to
discover quality datasets is leading to delays and, in many cases, failure of ML projects.
Many research reports have highlighted the absence of dataset discovery methods that fill
the gap between analysis requirements and available datasets, and have given statistics to
show how it hinders the process of analysis, with a completion rate of less than 2%. To the
best of our knowledge, removing the above inadequacies remains “an open problem of great
importance”. It is in this context that the thesis contributes a context-based
framework that tightly couples dataset providers and data analytics
teams. Through this framework, dataset providers publish the metadata descriptions of
their datasets, and analysts formulate and submit rich queries with goal specifications and
quality requirements. The dataset search engine component tightly couples the query specification
with the metadata specifications of datasets through formal contextualized semantic
matching and quality-based ranking, and discovers all datasets that are relevant to analyst
requirements. The thesis gives a proof-of-concept prototype implementation and reports on
its performance and efficiency through a case study.
Divisions: | Concordia University > Gina Cody School of Engineering and Computer Science > Computer Science and Software Engineering |
---|---|
Item Type: | Thesis (PhD) |
Authors: | Alsaig, Alaa |
Institution: | Concordia University |
Degree Name: | Ph. D. |
Program: | Software Engineering |
Date: | 15 May 2023 |
Thesis Supervisor(s): | Alagar, Vangalur and Ormandjieva, Olga |
ID Code: | 992253 |
Deposited By: | ALAA ABDULBASIT ALSAIG |
Deposited On: | 17 Nov 2023 14:57 |
Last Modified: | 17 Nov 2023 14:57 |