Toward Practical Machine Learning Solutions for IoT Malware Analysis

Zhang, Yefei (2025) Toward Practical Machine Learning Solutions for IoT Malware Analysis. PhD thesis, Concordia University.

Text (application/pdf)
Zhang_PhD_F2025.pdf - Accepted Version
Available under License Spectrum Terms of Access.

27MB

Abstract

The Internet of Things (IoT) is rapidly expanding across personal, commercial, and critical infrastructure sectors. However, the low-cost requirements, limited resources, and heterogeneous ecosystems of IoT devices leave them highly vulnerable to malware attacks. Once compromised, these devices can be exploited by attackers for distributed denial-of-service (DDoS) attacks, data theft, or monetization. The evolving malware landscape, driven by obfuscation, packing, and code reuse, further complicates defenses, as adversaries continuously release new variants and families.

Machine learning (ML) has become a cornerstone of malware analysis (MA), providing the ability to process large volumes of data and outperform traditional detection methods. Despite this promise, translating ML-based defenses into practical, real-world solutions remains a critical challenge. The difficulty stems from the conflict between ML's assumption of independent and identically distributed (i.i.d.) data and the intrinsic characteristics of the malware domain, where distribution shifts (drift) and adversarial machine learning (AML) threats are the norm. Without adaptation, drift leads to inevitable model aging, while AML attacks exploit model vulnerabilities to manipulate predictions.

The practicality of solutions is shaped not only by these challenges but also by the entire research process. If domain-specific characteristics are overlooked during this process, the resulting solutions risk limited applicability. Malware research can be categorized into three types: observational studies that analyze malware behavior and landscape features, offensive research that exposes security gaps, and engineering solutions that propose defensive mechanisms. Although observational and offensive works might be expected to dominate, evidence shows that engineering solutions have been disproportionately prevalent, often at the expense of a deeper understanding of domain-specific realities.

To bridge the translation gap in current ML-based malware analysis research, this thesis emphasizes the practicality of ML-based IoT malware defense solutions by aligning them with domain-specific characteristics and requirements. Its contributions include: (1) analyzing the vulnerabilities of existing ML-based malware detection models to AML attacks, (2) examining the impact of fundamental malware landscape attributes, including temporal, spatial, and architectural inconsistencies, on solution robustness, and (3) designing new IoT malware classification frameworks that integrate drift-handling mechanisms. By combining observational studies, offensive analysis, and engineering solutions, the thesis advances both the theoretical understanding and the practical deployment of ML-based malware defense.

Divisions:	Concordia University > Gina Cody School of Engineering and Computer Science > Computer Science and Software Engineering
Item Type:	Thesis (PhD)
Authors:	Zhang, Yefei
Institution:	Concordia University
Degree Name:	Ph. D.
Program:	Computer Science
Date:	15 October 2025
Thesis Supervisor(s):	Assi, Chadi and Yan, Jun
ID Code:	996543
Deposited By:	Yefei Zhang
Deposited On:	29 Jun 2026 15:34
Last Modified:	29 Jun 2026 15:34
