Alrabaee S, Saleem N, Preda S, Wang L, Debbabi M. OBA2: an onion approach to binary code authorship attribution. Digit Investig 2014:S94-103 [Elsevier].
Alrabaee S, Shirani P, Wang L, Debbabi M. SIGMA: a semantic integrated graph matching approach for identifying reused functions in binary code. Digit Investig 2015;12:S61-71.
Balakrishnan G, Reps T. WYSINWYX: what you see is not what you execute. ACM Trans Program Lang Syst (TOPLAS) 2010;32(6) [ACM].
Edler K, Franke T, Bhandarkar P, Dasgupta A. Exploiting function similarity for code size reduction. In: Proceedings of the 2014 SIGPLAN/SIGBED conference on Languages, compilers and tools for embedded systems; 2014. p. 85-94 [ACM].
Farhadi M, Fung B, Charland P, Debbabi M. BinClone: detecting code clones in malware. In: Software Security and Reliability, 2014 Eighth International Conference on. IEEE; 2014. p. 78-87.
Farnstrom F, Lewis J, Elkan C. Scalability for clustering algorithms revisited. ACM SIGKDD Explor Newsl 2000;2(1):51-7.
Gascon H, Yamaguchi F, Arp D, Rieck K. Structural detection of android malware using embedded call graphs. In: Proceedings of the 2013 ACM workshop on Artificial intelligence and security; 2013. p. 45-54 [ACM].
Hamerly G, Elkan C. Learning the k in k-means. In: Advances in neural information processing systems 16; 2004. p. 281.
IDA Pro multi-processor disassembler and debugger. Available from: https://www.hex-rays.com/products/ida/ [accessed 09.06.14].
Jacobson E, Rosenblum N, Miller B. Labeling library functions in stripped binaries. In: The 10th ACM SIGPLAN-SIGSOFT workshop on Program analysis for software tools (SIGSOFT '11); 2011. p. 1-8 [ACM].
Lindorfer M, Di Federico A, Maggi F, Comparetti PM, Zanero S. Lines of malicious code: insights into the malicious software industry. In: Proceedings of the 28th Annual Computer Security Applications Conference; 2012, December. p. 349-58 [ACM].
Rahimian A, Charland P, Preda S, Debbabi M. RESource: a framework for online matching of assembly with open source code. In: Foundations and Practice of Security (FPS 2013). Springer Berlin Heidelberg; 2013. p. 211-26.
Rosenblum N, Miller B, Zhu X. Extracting compiler provenance from program binaries. In: The 9th ACM SIGPLAN-SIGSOFT workshop on Program analysis for software tools and engineering (SIGSOFT '10); 2010. p. 21-8 [ACM].
Rosenblum N, Miller B, Zhu X. Recovering the toolchain provenance of binary code. In: The 2011 International Symposium on Software Testing and Analysis; 2011. p. 100-10 [ACM].
Rosenblum N, Zhu X, Miller B. Who wrote this code? Identifying the authors of program binaries. In: Computer security-ESORICS. Springer Berlin Heidelberg; 2011. p. 172-89.
Ruttenberg B, Miles C, Kellogg L, Notani V, Howard M, LeDoux C, et al. Identifying shared software components to support malware forensics. In: Detection of Intrusions and Malware, and Vulnerability Assessment. Springer International Publishing; 2014. p. 21-40.
Stojanovic S, Radivojevic Z, Cvetanovic M. Approach for estimating similarity between procedures in differently compiled binaries. Information and Software Technology. Elsevier; 2014.
The data set. Available from: https://github.com/BinSigma/BinComp/tree/master/Dataset [accessed 30.04.15].
The Google Code Jam. Available from: https://code.google.com/codejam [accessed 27.10.14].
The PEiD tool. Available from: http://www.woodmann.com/collaborative/tools/index.php/PEiD [accessed 14.08.14].
The RDG Packer Detector. Available from: http://www.woodmann.com/collaborative/tools/index.php/RDG_Packer_Detector [accessed 14.08.14].
Toderici A, Stamp M. Chi-squared distance and metamorphic virus detection. J Comput Virol Hacking Tech 2013;9(0):1-14 [Springer].