Login | Register

a framework for automated similarity analysis of malware


a framework for automated similarity analysis of malware

Song, Weilong (2014) a framework for automated similarity analysis of malware. Masters thesis, Concordia University.

[thumbnail of Song_MASc_F2014.pdf]
Text (application/pdf)
Song_MASc_F2014.pdf - Accepted Version
Available under License Spectrum Terms of Access.


Malware, a category of software including viruses, worms, and other malicious programs, is developed by hackers to damage, disrupt, or perform other harmful actions on data, computer systems and networks. Malware analysis, as an indispensable part of the work of IT security specialists, aims to gain an in-depth understanding of malware code. Manual analysis of malware is a very costly and time-consuming process. As more malware variants are evolved by hackers who occasionally use a copy-paste-modify programming style to accelerate the generation of large number of malware, the effort spent in analyzing similar pieces of malicious code has dramatically grown. One approach to remedy this situation is to automatically perform similarity analysis on malware samples and identify the functions they share in order to minimize duplicated effort in analyzing similar codes of malware variants.

In this thesis, we present a framework to match cloned functions in a large chunk of malware samples. Firstly, the instructions of the functions to be analyzed are extracted from the disassembled malware binary code and then normalized. We propose a new similarity metric and use it to determine the pair-wise similarity among malware samples based on the calculated similarity of their functions. The developed tool also includes an API class recognizer designed to determine probable malicious operations that can be performed by malware functions. Furthermore, it allows us to visualize the relationship among functions inside malware codes and locate similar functions importing the same API class. We evaluate this framework on three malware datasets including metamorphic viruses created by malware generation tools, real-life malware variants in the wild, and two well-known botnet trojans. The obtained experimental results confirm that the proposed framework is effective in detecting similar malware code.

Divisions:Concordia University > Gina Cody School of Engineering and Computer Science > Concordia Institute for Information Systems Engineering
Item Type:Thesis (Masters)
Authors:Song, Weilong
Institution:Concordia University
Degree Name:M.A. Sc.
Program:Information Systems Security
Date:September 2014
ID Code:978935
Deposited By: WEI LONG SONG
Deposited On:04 Nov 2014 17:09
Last Modified:18 Jan 2018 17:48
All items in Spectrum are protected by copyright, with all rights reserved. The use of items is governed by Spectrum's terms of access.

Repository Staff Only: item control page

Downloads per month over past year

Research related to the current document (at the CORE website)
- Research related to the current document (at the CORE website)
Back to top Back to top