Luo, Jun (2012) ECTree: An Extended Tree Index Structure for Attributed Subgraph Queries. Masters thesis, Concordia University.
Abstract
Graphs are popular data structures for modeling complex data. Graphs with attributes in particular are used to represent gene sequences, protein structures, chemical compounds, protein interaction networks, social networks, etc. There is a need for managing such graph data and providing efficient querying tools. In the graph mining realm, the problem lies in indexing a large number of graphs for fast retrieval. Indexing attributed graphs and querying them with attributed queries can provide faster response times and more refined results.
This thesis focuses on extending an existing index to support attributed graph indexing and on providing subgraph querying access to the extended index. The aim is to index the labels of the graphs and the attributes of the graphs at the same time. A flexible query format is provided for querying the extended index on attributes, and it allows intervals to be used. In addition, regular expressions and label groups can be used as query labels, so that multiple queries with similar structures can be combined into a single query. A further benefit is that a query graph does not have to use fixed labels. We also introduce a vector based on vertex degrees and attributes to capture the features of both a data graph and a query graph. A novel pruning method is proposed and implemented so that pruning based on the degree-attribute vectors can still be applied even when it is not clear how to define histogram pruning for query graphs that use non-fixed labels. All the techniques presented in our work are validated through experiments on both real and synthetic datasets.
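To make the idea of a query vertex with a non-fixed label and attribute intervals concrete, the following is a minimal Python sketch. It is not taken from the thesis: the class names, the `vertex_matches` function, and the attribute name `charge` are all hypothetical, and it assumes numeric attributes with inclusive intervals. It only illustrates vertex-level matching and says nothing about how the ECTree index itself is organized.

```python
import re
from dataclasses import dataclass, field


@dataclass
class DataVertex:
    """A vertex of a data graph (hypothetical representation)."""
    label: str
    attrs: dict  # attribute name -> numeric value


@dataclass
class QueryVertex:
    """A query vertex: a regular expression over labels plus attribute intervals."""
    label_pattern: str
    # attribute name -> (low, high), both bounds inclusive (an assumption)
    intervals: dict = field(default_factory=dict)


def vertex_matches(q: QueryVertex, v: DataVertex) -> bool:
    """Return True if the data vertex satisfies the label pattern and every interval."""
    if re.fullmatch(q.label_pattern, v.label) is None:
        return False
    for name, (low, high) in q.intervals.items():
        value = v.attrs.get(name)
        # A missing attribute is treated as a mismatch in this sketch.
        if value is None or not (low <= value <= high):
            return False
    return True


# One query vertex covers what would otherwise be two fixed-label queries (C and N).
q = QueryVertex(label_pattern="C|N", intervals={"charge": (-1, 1)})
results = [
    vertex_matches(q, DataVertex("N", {"charge": 0})),
    vertex_matches(q, DataVertex("O", {"charge": 0})),
    vertex_matches(q, DataVertex("C", {"charge": 2})),
]
```

Here the pattern `C|N` plays the role of a label group, so a single query stands in for several fixed-label queries with the same structure.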
Divisions: | Concordia University > Gina Cody School of Engineering and Computer Science > Computer Science and Software Engineering |
---|---|
Item Type: | Thesis (Masters) |
Authors: | Luo, Jun |
Institution: | Concordia University |
Degree Name: | M.A. Sc. |
Program: | Software Engineering |
Date: | 29 March 2012 |
Thesis Supervisor(s): | Butler, Gregory |
ID Code: | 974001 |
Deposited By: | JUN LUO |
Deposited On: | 19 Jun 2012 17:57 |
Last Modified: | 18 Jan 2018 17:37 |