Live Testing of Cloud Services

Title:

Live Testing of Cloud Services

Jebbar, Oussama (2023) Live Testing of Cloud Services. PhD thesis, Concordia University.

Preview

Text (application/pdf)
Jebbar_PhD_S2023.pdf - Accepted Version
Available under License Spectrum Terms of Access.

3MB

Abstract

Service providers use the cloud due to the dynamic infrastructure it offers at a low cost. However, sharing the infrastructure with other service providers as well as relying on remote services that may be inaccessible from the development environment create major limitations for development time testing. Modern service providers have an increasing need to test their services in the production environment. Such testing helps increase the reliability of the test results and detect problems that could not be detected in the development environment such as the noisy neighbor problem. Furthermore, testing in production enables other software engineering activities such as fault prediction and fault localization and makes them more efficient.
Test interferences are a major problem for testing in production as they can have damaging effects ranging from unreliable and degraded performance to a malfunctioning or inaccessible system. The countermeasures that are taken to alleviate the risk of test interferences are called test isolation. Existing approaches for test isolation have limited applicability in the cloud context because the assumptions under which they operate are seldom satisfied in the cloud context. Moreover, when running tests in production, failures can happen and whether they are due to the testing activity or not the damage they cause cannot be ignored. To deal with such issues and manage to quickly get the system back to a healthy state in the case of a failure, human intervention should be reduced in the orchestration and execution of testing activities in production. Thus, the need for a solution that automates the orchestration of tests in production while taking into consideration the particularity of a cloud system such as the existence of multiple fault tolerance mechanisms.
In this thesis, we define live testing as testing a system in its production environment, while it is serving, without causing any intolerable disruption to its usage. We propose an architecture that can help cope with the major challenges of live testing, namely reducing human intervention and providing test isolation. Our proposed architecture is composed of two building blocks, the Test Planner and the Test Execution Framework. To make the solution we are proposing independent from the technologies used in a cloud system, we propose the use of UML Testing Profile (UTP) to model the artifacts involved in this architecture. To reduce human intervention in testing activities, we start by automating test execution and orchestration in production. To achieve this goal, we propose an execution semantics that we associate with UTP concepts that are relevant for test execution. Such an execution semantics represent the behavior that the Test Execution Framework exhibits while executing tests. We propose a test case selection method and test plan generation method to automate the activities that are performed by the Test Planner. To alleviate the risk of test interferences, we also propose a set of test methods that can be used for test isolation. As opposed to existing test isolation techniques, our test methods do not make any assumptions about the parts of the system for which test isolation can be provided, nor about the feature to be tested. These test methods are used in the design of test plans. In fact, the applicability of each test method varies according to several factors including the risk of test interferences that parts of the system present, the availability of resources, and the impact of the test method on the provisioning of the service. To be able to select the right test method for each situation, information about the risk of test interference and the cost of test isolation need to be provided. We propose a method, configured instance evaluation method, that automates the process of obtaining such information. Our method evaluates the software involved in the realization of the system in terms of the risk of test interference it presents, and the cost to provide test isolation for that software.
In this thesis, we also discuss the feasibility of our proposed methods and evaluate the provided solutions. We implemented a prototype for the test plan generation and showcased it in a case study. We also implemented a part of the configured instance evaluation method, and we show that it can help confirm the presence of a risk of test interference. We showcase one of our test methods on a case study using an application deployed in a Kubernetes managed cluster. We also provide proof of the soundness of our execution semantics. Furthermore, we evaluate, in terms of the resulting test plan’s execution time, the algorithms involved in the test plan generation method. We show that for two of the activities in our solution our proposed algorithms provide optimal solutions; and, for one activity we identify in which situations our algorithm does not manage to give the optimal solution. Finally, we prove that our test case selection method reduces the test suite without compromising the configuration fault detection power.

Divisions:	Concordia University > Gina Cody School of Engineering and Computer Science > Computer Science and Software Engineering
Item Type:	Thesis (PhD)
Authors:	Jebbar, Oussama
Institution:	Concordia University
Degree Name:	Ph. D.
Program:	Software Engineering
Date:	January 2023
Thesis Supervisor(s):	Khendek, Ferhat and Toeroe, Maria
ID Code:	991761
Deposited By:	Oussama Jebbar
Deposited On:	21 Jun 2023 14:44
Last Modified:	21 Jun 2023 14:44

Repository Staff Only: item control page

Download Statistics

Downloads per month over past year

Research related to the current document (at the CORE website)

Spectrum Research Repository

Live Testing of Cloud Services

Live Testing of Cloud Services

Abstract