A Design and Implementation of Learned Index for Processing Multi-Dimensional Queries Over Relational Data

Parmiss, Shahinfard (2025) A Design and Implementation of Learned Index for Processing Multi-Dimensional Queries Over Relational Data. Masters thesis, Concordia University.

Text (application/pdf)
Shahinfard_MSc_F2025.pdf.pdf - Accepted Version
Available under License Spectrum Terms of Access.
1MB

Abstract

We present a performance evaluation and analysis of Flood, a learned index designed to process multi-dimensional queries over relational data. Unlike traditional indexing methods such as KD-Trees and R-Trees, which rely on static partitioning strategies, Flood uses machine learning techniques to dynamically adapt its grid structure and data layout to the data distribution and query workload. We identify and evaluate the core components of the Flood framework (grid-based partitioning, learned layout optimization, and refinement steps) and assess the impact of each component on overall system performance. We compare Flood’s efficiency and scalability against three baselines, namely KD-Trees, Z-order indexing, and brute-force scan, across multiple datasets and query workloads. Our experiments reveal that scanning is the dominant bottleneck and that layout tuning and component configuration introduce significant overhead. To gain a deeper understanding of Flood and explore opportunities for improvement, we developed a modular prototype implementation from scratch. The modular design enabled a systematic, in-depth performance study by isolating components and allowing alternative configurations. We also refined the calibration of the cost model and proposed a new optimization strategy for guiding layout selection. Our work contributes to more effective configuration and tuning of learned indexes by offering insights into the trade-offs, limitations, and opportunities of using them as an alternative or complementary solution for supporting multi-dimensional queries in relational database systems.

Divisions: Concordia University > Gina Cody School of Engineering and Computer Science > Computer Science and Software Engineering
Item Type: Thesis (Masters)
Authors: Parmiss, Shahinfard
Institution: Concordia University
Degree Name: M. Sc.
Program: Computer Science
Date: June 2025
Thesis Supervisor(s): Shiri, Nematollaah
Keywords: Learned Indexes, Big Data Analytics, Machine Learning in Databases, Multidimensional Indexing, Query Cost Modeling, Data-Driven Optimization, Layout Optimization
ID Code: 995759
Deposited By: Parmiss Shahinfard
Deposited On: 04 Nov 2025 15:40
Last Modified: 04 Nov 2025 15:40
All items in Spectrum are protected by copyright, with all rights reserved. The use of items is governed by Spectrum's terms of access.
