Keivanloo, Iman (2013) Source Code Similarity and Clone Search. PhD thesis, Concordia University.
Preview |
Text (application/pdf)
5MBKeivanloo_PhD_F2013.pdf - Accepted Version |
Abstract
Historically, clone detection as a research discipline has focused on devising source code similarity measurement and search solutions to cancel out effects of code reuse in software maintenance. However, it has also been observed that identifying duplications and similar programming patterns can be exploited for pragmatic reuse. Identifying such patterns requires a source code similarity model for detection of Type-1, 2, and 3 clones. Due to the lack of such a model, ad-hoc pattern detection models have been devised as part of state of the art solutions that support pragmatic reuse via code search.
In this dissertation, we propose a clone search model which is based on the clone detection principles and satisfies the fundamental requirements for supporting pragmatic reuse. Our research presents a clone search model that not only supports scalability, short response times, and Type-1, 2 and 3 detection, but also emphasizes the need for supporting ranking as a key functionality. Our model takes advantage of a multi-level (non-positional) indexing approach to achieve a scalable and fast retrieval with high recall. Result sets are ranked using two ranking approaches: Jaccard similarity coefficient and the cosine similarity (vector space model) which exploits the code patterns’ local and global frequencies. We also extend the model by adapting a form of semantic search to cover bytecode code. Finally, we demonstrate how the proposed clone search model can be applied for spotting working code examples in the context of pragmatic reuse. Further evidence of the applicability of the clone search model is provided through performance evaluation.
Divisions: | Concordia University > Gina Cody School of Engineering and Computer Science > Computer Science and Software Engineering |
---|---|
Item Type: | Thesis (PhD) |
Authors: | Keivanloo, Iman |
Institution: | Concordia University |
Degree Name: | Ph. D. |
Program: | Computer Science |
Date: | 20 June 2013 |
Thesis Supervisor(s): | Rilling, Juergen |
ID Code: | 977472 |
Deposited By: | IMAN KEIVANLOO |
Deposited On: | 13 Jan 2014 14:41 |
Last Modified: | 18 Jan 2018 17:44 |
Repository Staff Only: item control page