Sun, Jincheng ORCID: https://orcid.org/0000-0002-1180-7542 (2022) Deep Reinforcement Learning and Graph Learning to Plan Resource Provision for Large Scale Cloud-based Game Servers. Masters thesis, Concordia University.
Text (application/pdf), 2MB: Sun_MA_S2022.pdf - Accepted Version. Restricted to Repository staff only. Available under License Spectrum Terms of Access.
Abstract
To meet their service-level objectives (SLOs), video game companies maintain a pool of virtual machines on the cloud to support millions of online game players. In the case study of this thesis, a rule-based planning algorithm is applied in the ecosystem to automatically scale the number of active virtual machines in and out on demand. The rule-based system maintains a buffer of idle virtual machines to guarantee that no under-provisioning occurs. As a result, on average, 30% of the virtual machines requested from the cloud providers are not utilized. Furthermore, game companies often serve players from different geographical regions. Because the rule-based system is applied to each region individually, it causes even more waste from a global perspective.
This thesis aims to reduce idle virtual machines while meeting the SLO of provisioning. First, a reinforcement learning-based planning framework with the Soft Actor-Critic (SAC) algorithm is proposed to make scaling decisions for a single region. Two reward functions are designed to meet the objectives: (1) a threshold-based reward function that limits over-provisioned virtual machines to an acceptable range; (2) a cost-based reward function that minimizes the cost of two types of virtual machines.
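The two reward shapes described above can be sketched as follows. This is a minimal illustration only: the threshold, penalty values, and per-type VM prices are assumptions for the example, not values taken from the thesis.

```python
def threshold_reward(provisioned, demand, max_over=0.3):
    """Threshold-based reward sketch: penalize under-provisioning,
    and tolerate over-provisioning only while the idle fraction
    stays within an acceptable range (here, 30% of demand)."""
    if provisioned < demand:  # under-provisioned: SLO violation
        return -1.0
    over_ratio = (provisioned - demand) / max(demand, 1)
    return 1.0 if over_ratio <= max_over else -over_ratio

def cost_reward(n_type_a, n_type_b, price_a=0.10, price_b=0.25):
    """Cost-based reward sketch: the negative total cost of the two
    VM types, so maximizing reward minimizes spend."""
    return -(n_type_a * price_a + n_type_b * price_b)
```

In a SAC setting, rewards like these would be computed per planning step from the agent's scaling action and the observed player demand; the threshold term guards the SLO while the cost term pushes the plan toward cheaper VM mixes.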
At the global level, when a region is under-provisioned for game servers, game companies tend to place players in a neighboring over-provisioned region with tolerable delay. To perform multiple-fleet virtual machine planning, a graph-based method is proposed in this thesis: the Heterogeneous Graph Transformer (HGT) algorithm is applied within the SAC framework to minimize idle virtual machines globally. A reward function combining a threshold term with a squared percentage error term is designed to reduce over-provisioning at the multiple-fleet level and minimize the planning error in each region.
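The neighboring-region placement idea above can be illustrated with a toy graph of regions. The region names, VM counts, and greedy shortfall routing below are illustrative assumptions for the sketch; they stand in for, and do not reproduce, the thesis's learned HGT-based planner.

```python
# Regions as graph nodes; an edge connects regions whose inter-region
# latency is assumed tolerable for player placement.
REGIONS = {"us-east": 80, "us-west": 130, "eu-west": 100}   # provisioned VMs
DEMAND = {"us-east": 100, "us-west": 90, "eu-west": 95}     # required VMs
NEIGHBORS = {"us-east": ["us-west"], "us-west": ["us-east"], "eu-west": []}

def overflow_plan(regions, demand, neighbors):
    """Greedily route each region's shortfall to the spare capacity
    of its over-provisioned neighbors."""
    spare = {r: regions[r] - demand[r] for r in regions}
    plan = {}
    for region, s in spare.items():
        if s >= 0:
            continue  # region already covers its own demand
        shortfall = -s
        for nb in neighbors[region]:
            take = min(shortfall, max(spare[nb], 0))
            if take > 0:
                plan[(region, nb)] = take  # players placed on neighbor
                spare[nb] -= take
                shortfall -= take
    return plan
```

Here `us-east` is short 20 VMs and borrows them from `us-west`'s surplus; the thesis's contribution is to make such multi-region decisions jointly with a learned policy rather than a fixed rule like this one.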
The notable benefits of the approaches in this thesis are in two aspects. In the single-fleet virtual machine planning scenario, the SAC-FCNN model (1) reduces the virtual machine waste from misprediction to 22.4%, which is 5.78% lower than the rule-based system; and (2) satisfies the SLO of over-provisioning virtual machines for at least 99.0% of the testing time. In the multiple-fleet virtual machine planning scenario, the SAC-HGT model (3) reduces the virtual machine waste from misprediction by more than 9.61% compared with the SAC-FCNN model and by 28.90% compared with the rule-based system; and (4) meets the SLO of over-provisioning for at least 99.0% of the testing time at the multiple-fleet level.
| Divisions: | Concordia University > Gina Cody School of Engineering and Computer Science > Electrical and Computer Engineering |
|---|---|
| Item Type: | Thesis (Masters) |
| Authors: | Sun, Jincheng |
| Institution: | Concordia University |
| Degree Name: | M.A.Sc. |
| Program: | Electrical and Computer Engineering |
| Date: | 6 January 2022 |
| Thesis Supervisor(s): | Liu, Yan |
| Keywords: | Virtual machine planning, Cloud service, Reinforcement learning, Graph neural network |
| ID Code: | 990383 |
| Deposited By: | Jincheng Sun |
| Deposited On: | 27 Oct 2022 14:46 |
| Last Modified: | 27 Oct 2022 14:46 |