Shayesteh, Behshid (2024) Machine Learning for Fault Prediction in Clouds. PhD thesis, Concordia University.
Text (application/pdf), 4MB
Shayesteh_PhD_F2024.pdf - Accepted Version. Available under License Spectrum Terms of Access.
Abstract
The widespread adoption of cloud computing has increased the size and complexity of data centers, increasing the likelihood of faults. Faults can negatively impact the performance, availability, and reliability of cloud services, leading to significant maintenance costs and revenue loss for cloud service providers. Therefore, fault prediction in clouds is a critical task. Machine Learning (ML) is increasingly used for this purpose due to its pattern recognition capabilities. While predicting faults in clouds using ML enables a proactive approach to preventing faults, building accurate prediction models that can maintain their performance in dynamic clouds is challenging. One problem is concept drift, where changes in data distribution can degrade model performance. Similarly, feature drift, a change in feature relevance, can also degrade model performance. Additionally, model accuracy is influenced by data-related parameters, so these parameters must be selected carefully to achieve high model performance. Existing ML-based fault prediction solutions do not focus on adaptability to dynamic conditions such as concept or feature drift. Moreover, selecting data-related parameters to balance model performance and resource consumption is not addressed in the current literature.
This thesis focuses on addressing the challenges of employing ML models for predicting faults and predicting application performance degradation caused by faults in cloud environments. We first propose a concept drift adaptation algorithm for fault prediction in clouds using Reinforcement Learning (RL). This algorithm considers the cloud operator's requirements and uses RL to select the drift adaptation method and the data size for adaptation that best fulfill those requirements. Second, we propose a feature drift adaptation solution for adapting the model to feature drifts while predicting application performance degradation in clouds. This solution consists of a feature drift detector that monitors the performance of the prediction model as well as the feature importance, and a feature drift adaptor that measures the drift severity to adapt the prediction model. Finally, we propose a multi-objective optimization algorithm to select the training data size, data sampling interval, input window, and prediction horizon for training an ML model that predicts application performance degradation in clouds.
Divisions: | Concordia University > Gina Cody School of Engineering and Computer Science > Concordia Institute for Information Systems Engineering |
---|---|
Item Type: | Thesis (PhD) |
Authors: | Shayesteh, Behshid |
Institution: | Concordia University |
Degree Name: | Ph. D. |
Program: | Information and Systems Engineering |
Date: | 30 May 2024 |
Thesis Supervisor(s): | Glitho, Roch |
ID Code: | 994133 |
Deposited By: | Behshid Shayesteh |
Deposited On: | 24 Oct 2024 18:00 |
Last Modified: | 24 Oct 2024 18:00 |