A Hilbert space compression framework for parallel relational OLAP


Cueva, David (2007) A Hilbert space compression framework for parallel relational OLAP. Masters thesis, Concordia University.

Text (application/pdf)
MR28950.pdf - Accepted Version


The Data Cube is the central abstraction behind On-Line Analytical Processing (OLAP) systems. It enables knowledge workers to analyze vast amounts of enterprise data in order to make timely and informed decisions. This potential, however, comes with massive storage requirements, which challenge even the most powerful parallel systems to pack and manage the data efficiently while still delivering fast responses to user queries. Many methods address generic data compression, but relatively few have been proposed that meet the requirements of distributed OLAP environments. This thesis addresses that gap by presenting a compression framework specifically tailored for parallel OLAP. New algorithms for data and index compression are proposed. The data encoding method is based on the Hilbert space-filling curve; it eliminates the inherent redundancies in OLAP data and preserves compression granularity at the block level. The index encoding technique is based on the concept of packed R-trees and, when combined with our native query engine, provides fast and I/O-efficient access to the specific blocks satisfying a user request. Additionally, a set of supporting techniques allows the framework to scale to arbitrarily large data sets. We have performed a broad evaluation of our framework, including comparison benchmarks with several influential published methods. Our compression algorithms deliver state-of-the-art ratios, with averages above 80% for data and 95% for indexes. Finally, the number of blocks accessed by the query engine is typically reduced by a factor of 10.
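
The abstract does not spell out the encoding itself, so the following is only a minimal sketch of the general idea behind Hilbert-curve-based compression, not the thesis's algorithm. It assumes a two-dimensional grid whose side is a power of two, uses the widely known iterative coordinate-to-index conversion, and stores the sorted curve positions as small gaps. The function names (hilbert_index, delta_encode, delta_decode) are hypothetical and chosen for illustration.

```python
def hilbert_index(n, x, y):
    """Map cell (x, y) of an n x n grid (n a power of two) to its
    position along the Hilbert curve, using the standard iterative method."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve has the right orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return d


def delta_encode(points, n):
    """Illustrative assumption: replace each point by its Hilbert position,
    sort, and keep the first position followed by the gaps between neighbours."""
    ordered = sorted(hilbert_index(n, x, y) for x, y in points)
    if not ordered:
        return []
    return [ordered[0]] + [b - a for a, b in zip(ordered, ordered[1:])]


def delta_decode(deltas):
    """Rebuild the sorted Hilbert positions from the gap list."""
    positions = []
    total = 0
    for gap in deltas:
        total += gap
        positions.append(total)
    return positions


if __name__ == "__main__":
    cells = [(0, 0), (1, 0), (1, 1), (0, 1)]
    gaps = delta_encode(cells, 4)
    print(gaps)
    print(delta_decode(gaps))
```

Because consecutive positions on the Hilbert curve are always neighbouring cells, points that cluster in the grid tend to produce small gaps, which need fewer bits to store than full coordinates. How the thesis extends this to many dimensions, packs gaps into blocks, and compresses the R-tree index is not covered by this sketch.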

Divisions: Concordia University > Gina Cody School of Engineering and Computer Science > Computer Science and Software Engineering
Item Type: Thesis (Masters)
Authors: Cueva, David
Pagination: xii, 117 leaves : ill. ; 29 cm.
Institution: Concordia University
Degree Name: M. Comp. Sc.
Program: Computer Science and Software Engineering
Thesis Supervisor(s): Eavis, Todd
Identification Number: LE 3 C66C67M 2007 C84
ID Code: 975319
Deposited By: Concordia University Library
Deposited On: 22 Jan 2013 16:06
Last Modified: 13 Jul 2020 20:07
All items in Spectrum are protected by copyright, with all rights reserved. The use of items is governed by Spectrum's terms of access.
