Bagora, Prateek (2023) Data Labeling for Fault Detection in Cloud: A Test Suite-Based Active Learning Approach. Masters thesis, Concordia University.
Text (application/pdf), 2MB: Bagora_MCS_F2023.pdf - Accepted Version. Available under License Spectrum Terms of Access.
Abstract
Cloud computing enables ubiquitous on-demand network access to a shared pool of configurable computing resources with minimal management effort from the user. It has evolved into a key computing paradigm that enables a wide variety of applications such as e-commerce, social networks, high-performance computing, mission-critical applications, and the Internet of Things (IoT). Ensuring the quality of service of applications deployed in inherently complex and fault-prone cloud environments is of utmost concern to service providers and end users. Machine learning-based fault management solutions enable proactive identification and mitigation of faults in cloud environments to attain the desired reliability, but they require labeled cloud metrics data for training and evaluation. Moreover, the highly dynamic nature of cloud environments gives rise to emerging data distributions, so cloud metrics data drawn from the evolving distribution must be labeled frequently for model adaptation. In this thesis, we study the problem of data labeling for fault detection in cloud environments, paying close attention to the phenomenon of evolving cloud metric data distributions. More specifically, we propose a test suite-based active learning framework for automated labeling of cloud metrics data with the corresponding cloud system state while accounting for emerging fault patterns and data or concept drift. We implemented our solution on a cloud testbed and introduced various emerging data distribution scenarios to evaluate the proposed framework's labeling efficacy over known and emerging data distributions. According to our evaluation results, the proposed framework achieves about 41% higher weighted F1-score and 34% higher average area under the One-vs-Rest Receiver Operating Characteristic curves (OvR ROC AUC score) than a system without any adaptation for emerging data distributions.
Divisions: | Concordia University > Gina Cody School of Engineering and Computer Science > Computer Science and Software Engineering |
---|---|
Item Type: | Thesis (Masters) |
Authors: | Bagora, Prateek |
Institution: | Concordia University |
Degree Name: | M. Comp. Sc. |
Program: | Computer Science |
Date: | 24 May 2023 |
Thesis Supervisor(s): | Glitho, Roch |
Keywords: | Automated Data Labeling, Fault Detection, Cloud Computing, Active Learning, Deep Learning |
ID Code: | 992319 |
Deposited By: | Prateek Bagora |
Deposited On: | 14 Nov 2023 19:52 |
Last Modified: | 14 Nov 2023 19:52 |