Sun, Jincheng ORCID: https://orcid.org/0000-0002-1180-7542 (2022) Deep Reinforcement Learning and Graph Learning to Plan Resource Provision for Large Scale Cloud-based Game Servers. Masters thesis, Concordia University.
Text (application/pdf), 2MB: Sun_MA_S2022.pdf - Accepted Version. Restricted to Repository staff only. Available under License Spectrum Terms of Access.
Abstract
To meet their service-level objectives (SLOs), video game companies maintain a pool of virtual machines on the cloud to support millions of online game players. In the case study of this thesis, a rule-based planning algorithm is applied in the ecosystem to automatically scale the number of active virtual machines in and out on demand. The rule-based system maintains a buffer of idle virtual machines to guarantee that no under-provisioning occurs. As a result, on average, 30% of the virtual machines requested from the cloud providers are not utilized. Furthermore, game companies often serve players from different geographical regions. Because the rule-based system is applied to each region individually, it causes even more waste from a global perspective.
This thesis aims to reduce idle virtual machines while meeting the SLO of provisioning. First, a reinforcement learning-based planning framework with the Soft Actor-Critic (SAC) algorithm is proposed to make scaling decisions for a single region. Two reward functions are designed to meet the objectives: (1) a threshold-based reward function that limits over-provisioned virtual machines to an acceptable range; (2) a cost-based reward function that minimizes the cost of two types of virtual machines.
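The two reward shapes described above can be sketched as follows. This is a minimal illustration only: the threshold, penalty values, and per-type VM prices are assumptions for the example, not values taken from the thesis.

```python
def threshold_reward(provisioned, demand, max_over=0.3):
    """Threshold-based reward sketch: penalize under-provisioning,
    and tolerate over-provisioning only while the idle fraction
    stays within an acceptable range (here, 30% of demand)."""
    if provisioned < demand:  # under-provisioned: SLO violation
        return -1.0
    over_ratio = (provisioned - demand) / max(demand, 1)
    return 1.0 if over_ratio <= max_over else -over_ratio

def cost_reward(n_type_a, n_type_b, price_a=0.10, price_b=0.25):
    """Cost-based reward sketch: the negative total cost of the two
    VM types, so maximizing reward minimizes spend."""
    return -(n_type_a * price_a + n_type_b * price_b)
```

In a SAC setting, rewards like these would be computed per planning step from the agent's scaling action and the observed player demand; the threshold term guards the SLO while the cost term pushes the plan toward cheaper VM mixes.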
At the global level, when a region is under-provisioned for game servers, game companies tend to place players in a neighboring over-provisioned region with tolerable delay. To perform multiple-fleet virtual machine planning, a graph-based method is proposed in this thesis: the Heterogeneous Graph Transformer (HGT) algorithm is applied within the SAC framework to minimize idle virtual machines globally. A reward function combining a threshold term with a squared percentage error term is designed to reduce over-provisioning at the multiple-fleet level and minimize the planning error in each region.
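The neighboring-region placement idea above can be illustrated with a toy graph of regions. The region names, VM counts, and greedy shortfall routing below are illustrative assumptions for the sketch; they stand in for, and do not reproduce, the thesis's learned HGT-based planner.

```python
# Regions as graph nodes; an edge connects regions whose inter-region
# latency is assumed tolerable for player placement.
REGIONS = {"us-east": 80, "us-west": 130, "eu-west": 100}   # provisioned VMs
DEMAND = {"us-east": 100, "us-west": 90, "eu-west": 95}     # required VMs
NEIGHBORS = {"us-east": ["us-west"], "us-west": ["us-east"], "eu-west": []}

def overflow_plan(regions, demand, neighbors):
    """Greedily route each region's shortfall to the spare capacity
    of its over-provisioned neighbors."""
    spare = {r: regions[r] - demand[r] for r in regions}
    plan = {}
    for region, s in spare.items():
        if s >= 0:
            continue  # region already covers its own demand
        shortfall = -s
        for nb in neighbors[region]:
            take = min(shortfall, max(spare[nb], 0))
            if take > 0:
                plan[(region, nb)] = take  # players placed on neighbor
                spare[nb] -= take
                shortfall -= take
    return plan
```

Here `us-east` is short 20 VMs and borrows them from `us-west`'s surplus; the thesis's contribution is to make such multi-region decisions jointly with a learned policy rather than a fixed rule like this one.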
The notable benefits of the approaches in this thesis are in two aspects. In the single-fleet virtual machine planning scenario, the SAC-FCNN model (1) reduces the virtual machine waste from misprediction to 22.4%, which is 5.78% lower than the rule-based system; and (2) satisfies the SLO of over-provisioning virtual machines for at least 99.0% of the testing time. In the multiple-fleet virtual machine planning scenario, the SAC-HGT model (3) reduces the virtual machine waste from misprediction by more than 9.61% compared with the SAC-FCNN model and by 28.90% compared with the rule-based system; and (4) meets the SLO of over-provisioning for at least 99.0% of the testing time at the multiple-fleet level.
| Divisions: | Concordia University > Gina Cody School of Engineering and Computer Science > Electrical and Computer Engineering |
|---|---|
| Item Type: | Thesis (Masters) |
| Authors: | Sun, Jincheng |
| Institution: | Concordia University |
| Degree Name: | M.A.Sc. |
| Program: | Electrical and Computer Engineering |
| Date: | 6 January 2022 |
| Thesis Supervisor(s): | Liu, Yan |
| Keywords: | Virtual machine planning, Cloud service, Reinforcement learning, Graph neural network |
| ID Code: | 990383 |
| Deposited By: | Jincheng Sun |
| Deposited On: | 27 Oct 2022 14:46 |
| Last Modified: | 27 Oct 2022 14:46 |