Kim, Soyoung (2016) On the Semantics of Queries over Graphs with Uncertainty. Masters thesis, Concordia University.
Preview |
Text (application/pdf)
3MBKim_MSc_F2016.pdf - Accepted Version Available under License Spectrum Terms of Access. |
Abstract
We study the semantics of queries over uncertain graphs, which are directed graphs in which each edge is associated with a value in [0,1] representing its cer- tainty. In this work, we consider the certainty values as probabilities and show the challenges involved in evaluating the reachability and transitive closure queries over uncertain/probabilistic graphs. As the evaluation method, we adopted graph re- duction from automata theory used for finding regular expressions for input finite state machines. However, we show that different order of eliminating nodes may yield different certainty associated with the results. We then formulate the notion of "correct" results for queries over uncertain graphs, justified based on the notion of common sub-expressions, and identify common paths and avoid their redundant multiple contributions during the reduction. We identify a set of possible patterns to facilitate the reduction process. We have implemented the proposed ideas for answering reachability and transitive closure queries. We evaluated the effectiveness of the proposed solutions using a library of many uncertain graphs with different sizes and structures. We believe the proposed ideas and solution techniques can yield query processing tools for uncertain data management systems.
Divisions: | Concordia University > Gina Cody School of Engineering and Computer Science > Computer Science and Software Engineering |
---|---|
Item Type: | Thesis (Masters) |
Authors: | Kim, Soyoung |
Institution: | Concordia University |
Degree Name: | M. Comp. Sc. |
Program: | Computer Science and Software Engineering |
Date: | 2 September 2016 |
Thesis Supervisor(s): | Shiri, Nematollaah |
Keywords: | Uncertain Graphs, transitive closure, recursive queries, reachability query problem, probabilistic graphs |
ID Code: | 981891 |
Deposited By: | SOYOUNG KIM |
Deposited On: | 08 Nov 2016 16:16 |
Last Modified: | 18 Jan 2018 17:54 |
References:
[1] Y. Amsterdamer, D. Deutch, T. Milo, and V. Tannen. On provenance minimiza- tion. ACM Transactions on Database Systems (TODS), 37(4):30, 2012.[2] K. Arijit and C. Lei. On uncertain graphs modeling and queries. Proceedings of the VLDB Endowment, 8(12):2042–2043, 2015.
[3] C. Baier and M. Grosser. Recognizing ω-regular languages with probabilistic automata. In 20th Annual IEEE Symposium on Logic in Computer Science (LICS’05), pages 137–146. IEEE, 2005.
[4] F. Bonchi, F. Gullo, A. Kaltenbrunner, and Y. Volkovich. Core decomposition of uncertain graphs. In Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 1316–1325. ACM, 2014.
[5] K. Chatterjee, L. Doyen, and T. A. Henzinger. Probabilistic weighted automata. In International Conference on Concurrency Theory, pages 244–258. Springer, 2009.
[6] E. Cohen, E. Halperin, H. Kaplan, and U. Zwick. Reachability and distance queries via 2-hop labels. SIAM Journal on Computing, 32(5):1338–1355, 2003.
80
[7] G. S. Fishman. A comparison of four monte carlo methods for estimating the probability of st connectedness. IEEE Transactions on reliability, 35(2):145–155, 1986.
[8] R. W. Floyd. Algorithm 97: Shortest path. Communications of the ACM, 5(6):345, 1962.
[9] M. Hua and J. Pei. Probabilistic path queries in road networks: traffic uncer- tainty aware path selection. In Proceedings of the 13th International Conference on Extending Database Technology, pages 347–358. ACM, 2010.
[10] J. Jarvis and D. R. Shier. Graph-theoretic analysis of finite markov chains. Applied mathematical modeling: a multidisciplinary approach, 1999.
[11] R. Jin, L. Liu, and C. C. Aggarwal. Discovering highly reliable subgraphs in uncertain graphs. In Proceedings of the 17th ACM SIGKDD international con- ference on Knowledge discovery and data mining, pages 992–1000. ACM, 2011.
[12] R. Jin, L. Liu, B. Ding, and H. Wang. Distance-constraint reachability compu- tation in uncertain graphs. Proceedings of the VLDB Endowment, 4(9):551–562, 2011.
[13] R. E. Korf. Depth-first iterative-deepening: An optimal admissible tree search. Artificial Intelligence, 27:97–109, 1985.
[14] P. Linz. An Introduction to Formal Languages and Automata 5th edtion. John Wiley and Sons Ltd.,Chichester, fifth edition, 2012.
81
[15] L. Lü and T. Zhou. Link prediction in complex networks: A survey. Physica A: Statistical Mechanics and its Applications, 390(6):1150–1170, 2011.
[16] S. Maniu, R. Cheng, and P. Senellart. Probtree: A query-efficient representation of probabilistic graphs. In 1st International Workshop on Big Uncertain Data, BUDA 2014, 2014.
[17] M. Mohri. Semiring frameworks and algorithms for shortest-distance problems. Journal of Automata, Languages and Combinatorics, 7(3):321–350, 2002.
[18] M. Mohri. Weighted finite-state transducer algorithms. an overview. In Formal Languages and Applications, pages 551–563. Springer, 2004.
[19] E. Nuutila and E. Soisalon-Soininen. On finding the strongly connected compo- nents in a directed graph. Information Processing Letters, 49(1):9–14, 1994.
[20] P. Parchas. Uncertain graph processing through representative instances and sparsification. In VLDB PhD Workshop, Hawaii, USA, 2015.
[21] M. Potamias, F. Bonchi, A. Gionis, and G. Kollios. K-nearest neighbors in uncertain graphs. Proceedings of the VLDB Endowment, 3(1-2):997–1008, 2010.
[22] W. H. Press. Numerical recipes 3rd edition: The art of scientific computing. Cambridge university press, 2007.
[23] P. Przymus, A. Boniewicz, M. Burzańska, and K. Stencel. Recursive query facilities in relational databases: a survey. In Database Theory and Application, Bio-Science and Bio-Technology, pages 89–99. Springer, 2010.
82
[24] S. Roy, V. Perduca, and V. Tannen. Faster query answering in probabilistic databases using read-once functions. In Proceedings of the 14th International Conference on Database Theory, pages 232–243. ACM, 2011.
[25] A. D. Sarma, O. Benjelloun, A. Halevy, and J. Widom. Working models for un- certain data. In 22nd International Conference on Data Engineering (ICDE’06), pages 7–7. IEEE, 2006.
[26] P. Sen, A. Deshpande, and L. Getoor. Read-once functions and query evaluation in probabilistic databases. Proceedings of the VLDB Endowment, 3(1-2):1068– 1079, 2010.
[27] I. Sevo. Probabilitic graphs. Ser. Math. Inform., 28(1):27–42, 2013.
[28] P. Sevon, L. Eronen, P. Hintsanen, K. Kulovesi, and H. Toivonen. Link discovery in graphs derived from biological databases. In International Workshop on Data Integration in the Life Sciences, pages 35–49. Springer, 2006.
[29] R. E. Tarjan. Depth-first search and linear graph algorithms. SIAM Journal on Computing, 1(2):146–160, 1972.
Repository Staff Only: item control page