Elhami Fard, Neshat (2022) Control of Multi-agent Reinforcement Learning Systems Under Adversarial Attacks. PhD thesis, Concordia University.
Text (application/pdf), 5MB: Elhami Fard_PhD_S2023.pdf - Accepted Version. Available under License Spectrum Terms of Access.
Abstract
This Ph.D. dissertation studies the control of multi-agent reinforcement learning (MARL) and multi-agent deep reinforcement learning (MADRL) systems under adversarial attacks.
Various attacks are investigated, and several defence algorithms (mitigation approaches) are proposed to support consensus control and reliable data transmission.
We studied the consensus problem of a leaderless, homogeneous MARL system using actor-critic algorithms, with and without malicious agents.
We considered various distance-based immediate reward functions to improve the system's performance.
In addition to proposing four different immediate reward functions based on Euclidean, n-norm, and Chebyshev distances, we rigorously demonstrated which reward function performs better, measured by the cumulative reward of each agent and of the entire team of agents.
The claims were proven theoretically, and simulations confirmed the theoretical findings.
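The three distance measures named above can be illustrated with a minimal Python sketch. The reward form used here (the negative sum of distances to neighbouring agents) and the names `minkowski_distance`, `chebyshev_distance`, and `distance_reward` are assumptions for illustration only; the abstract does not state the exact form of the four reward functions proposed in the thesis.

```python
from typing import Callable, Sequence

Vector = Sequence[float]


def minkowski_distance(a: Vector, b: Vector, n: float) -> float:
    """n-norm (Minkowski) distance between two state vectors."""
    return sum(abs(x - y) ** n for x, y in zip(a, b)) ** (1.0 / n)


def euclidean_distance(a: Vector, b: Vector) -> float:
    """Euclidean distance, i.e. the n-norm distance with n = 2."""
    return minkowski_distance(a, b, 2)


def chebyshev_distance(a: Vector, b: Vector) -> float:
    """Chebyshev distance: the largest coordinate-wise difference."""
    return max(abs(x - y) for x, y in zip(a, b))


def distance_reward(
    state: Vector,
    neighbour_states: Sequence[Vector],
    distance: Callable[[Vector, Vector], float],
) -> float:
    """Hypothetical immediate reward: negative summed distance to neighbours.

    The reward grows (towards zero) as the agent approaches its neighbours,
    which is one simple way to encourage consensus. This form is an
    assumption, not the thesis's definition.
    """
    return -sum(distance(state, s) for s in neighbour_states)


if __name__ == "__main__":
    agent = [0.0, 0.0]
    neighbours = [[3.0, 4.0], [1.0, 1.0]]
    for name, metric in [
        ("euclidean", euclidean_distance),
        ("chebyshev", chebyshev_distance),
    ]:
        print(name, distance_reward(agent, neighbours, metric))
```

Passing the metric as a function makes it straightforward to compare the cumulative reward obtained under each distance measure with the same agent dynamics.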
We examined whether modifying the malicious agent's neural network (NN) structure, together with a compatible combination of the mean squared error (MSE) loss function and the sigmoid activation function, can mitigate the destructive effects of the malicious agent on the performance of the leaderless, homogeneous MARL system.
Simulations confirmed the theoretical findings.
We studied the gradient-based adversarial attacks on cluster-based, heterogeneous MADRL systems with time-delayed data transmission using deep Q-network (DQN) algorithms.
We introduced two novel observations, termed on-time and time-delay observations, which apply when the data transmission channel is idle and the data is transmitted on time or with a delay, respectively.
By considering the distance between neighbouring agents, we presented a novel immediate reward function that appends a distance-based reward to the previously utilized reward to improve the MADRL system performance. We considered three types of gradient-based attacks to investigate the robustness of data transmission in the proposed system. Two defence methods were proposed to reduce the effects of these attacks.
The theoretical results are illustrated and verified with simulation examples.
We also investigated the robustness of data transmission between agents of a cluster-based, heterogeneous MADRL system under a gradient-based adversarial attack. We proposed an algorithm that uses a DQN approach and a proportional feedback controller to defend against the fast gradient sign method (FGSM) attack and improve the DQN agent's performance.
Simulation results are included to verify the proposed algorithm.
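For readers unfamiliar with FGSM, the attack perturbs an input by a small step `epsilon` in the direction of the sign of the loss gradient with respect to that input. The sketch below shows only this standard perturbation step on a toy linear Q-value; the linear model, the squared-error loss, and the names `fgsm_perturb` and `linear_q_loss_gradient` are illustrative assumptions and do not reproduce the thesis's DQN setup or its defence algorithm.

```python
from typing import List, Sequence


def sign(value: float) -> int:
    """Return -1, 0, or 1 according to the sign of value."""
    return (value > 0) - (value < 0)


def fgsm_perturb(
    observation: Sequence[float],
    loss_gradient: Sequence[float],
    epsilon: float,
) -> List[float]:
    """Standard FGSM step: x_adv = x + epsilon * sign(dLoss/dx)."""
    return [o + epsilon * sign(g) for o, g in zip(observation, loss_gradient)]


def linear_q_loss_gradient(
    weights: Sequence[float],
    observation: Sequence[float],
    target: float,
) -> List[float]:
    """Gradient of 0.5 * (w . x - target)^2 with respect to x.

    A toy stand-in (assumption) for the gradient an attacker would obtain
    by backpropagating through a DQN.
    """
    q_value = sum(w * x for w, x in zip(weights, observation))
    return [(q_value - target) * w for w in weights]


if __name__ == "__main__":
    weights = [1.0, -2.0, 0.0]
    observation = [0.5, 0.5, 0.5]
    gradient = linear_q_loss_gradient(weights, observation, target=1.0)
    print(fgsm_perturb(observation, gradient, epsilon=0.1))
```

Each component of the observation moves by at most `epsilon`, and components with zero gradient are left unchanged, which is why FGSM perturbations are small in the max-norm sense.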
Divisions: | Concordia University > Gina Cody School of Engineering and Computer Science > Electrical and Computer Engineering |
---|---|
Item Type: | Thesis (PhD) |
Authors: | Elhami Fard, Neshat |
Institution: | Concordia University |
Degree Name: | Ph. D. |
Program: | Electrical and Computer Engineering |
Date: | 16 October 2022 |
Thesis Supervisor(s): | Selmic, Rastko |
ID Code: | 991525 |
Deposited By: | Neshat Elhami Fard |
Deposited On: | 21 Jun 2023 14:43 |
Last Modified: | 21 Jun 2023 14:43 |