Zhang, Yefei (2025) Toward Practical Machine Learning Solutions for IoT Malware Analysis. PhD thesis, Concordia University.
Text (application/pdf), 27MB: Zhang_PhD_F2025.pdf - Accepted Version. Available under License Spectrum Terms of Access.
Abstract
The Internet of Things (IoT) is rapidly expanding across personal, commercial, and critical infrastructure sectors. However, the low-cost requirements, limited resources, and heterogeneous ecosystems of IoT devices leave them highly vulnerable to malware attacks. Once compromised, these devices can be exploited by attackers for distributed denial-of-service (DDoS) attacks, data theft, or monetization. The evolving malware landscape, driven by obfuscation, packing, and code reuse, further complicates defenses, as adversaries continuously release new variants and families.
Machine Learning (ML) has become a cornerstone of malware analysis (MA), providing the ability to process large volumes of data and outperform traditional detection methods. Despite this promise, translating ML-based defenses into practical, real-world solutions remains a critical challenge. The difficulty stems from the conflict between ML’s assumption of independent and identically distributed (i.i.d.) data and the intrinsic characteristics of the malware domain, where distribution shifts and adversarial threats are the norm. Distribution drift causes models to age inevitably unless they are adapted, while adversarial machine learning (AML) attacks exploit model vulnerabilities to manipulate predictions.
The practicality of solutions is shaped not only by these challenges but also by the entire research process. If domain-specific characteristics are overlooked during this process, the resulting solutions risk limited applicability. Malware research can be categorized into three types: observational studies that analyze malware behavior and landscape features, offensive research that exposes security gaps, and engineering solutions that propose defensive mechanisms. Although observational and offensive work might be expected to dominate, evidence shows that engineering solutions have been disproportionately prevalent, often at the expense of a deeper understanding of domain-specific realities.
To bridge the translation gap in current ML-based malware analysis research, this thesis emphasizes the practicality of ML-based IoT malware defense solutions by aligning them with domain-specific characteristics and requirements. The contributions include: (1) analyzing the vulnerabilities of existing ML-based malware detection models against AML attacks, (2) examining the impact of fundamental malware landscape attributes, including temporal, spatial, and architectural inconsistencies, on solution robustness, and (3) designing new IoT malware classification frameworks that integrate drift-handling mechanisms. By combining observational studies, offensive analysis, and engineering solutions, the research advances both theoretical understanding and practical deployment of ML-based malware defense.
| Divisions: | Concordia University > Gina Cody School of Engineering and Computer Science > Computer Science and Software Engineering |
|---|---|
| Item Type: | Thesis (PhD) |
| Authors: | Zhang, Yefei |
| Institution: | Concordia University |
| Degree Name: | Ph. D. |
| Program: | Computer Science |
| Date: | 15 October 2025 |
| Thesis Supervisor(s): | Assi, Chadi and Yan, Jun |
| ID Code: | 996543 |
| Deposited By: | Yefei Zhang |
| Deposited On: | 29 Jun 2026 15:34 |
| Last Modified: | 29 Jun 2026 15:34 |