
Control of Multi-agent Reinforcement Learning Systems Under Adversarial Attacks



Elhami Fard, Neshat (2022) Control of Multi-agent Reinforcement Learning Systems Under Adversarial Attacks. PhD thesis, Concordia University.

Elhami Fard_PhD_S2023.pdf - Accepted Version
Available under License Spectrum Terms of Access.


This Ph.D. dissertation studies the control of multi-agent reinforcement learning (MARL) and multi-agent deep reinforcement learning (MADRL) systems under adversarial attacks.
Various attacks are investigated, and several defence algorithms (mitigation approaches) are proposed to support consensus control and reliable data transmission.

We studied the consensus problem of a leaderless, homogeneous MARL system using actor-critic algorithms, with and without malicious agents.
We considered various distance-based immediate reward functions to improve the system's performance.
In addition to proposing four different immediate reward functions based on Euclidean, n-norm, and Chebyshev distances, we rigorously demonstrated which reward function performs best in terms of the cumulative reward of each agent and of the entire team of agents.
The claims were proven theoretically, and simulations confirmed the theoretical findings.
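The distance-based reward families named above can be sketched as follows. This is a minimal illustration only: the function names and the negative-distance form (reward increasing as agents approach consensus) are assumptions for illustration, not the thesis's exact definitions.

```python
import numpy as np

def euclidean_reward(pos_i, pos_j):
    # Negative Euclidean (2-norm) distance between two agents' states:
    # the reward grows toward zero as the agents approach consensus.
    return -np.linalg.norm(np.asarray(pos_i, dtype=float) - np.asarray(pos_j, dtype=float), ord=2)

def p_norm_reward(pos_i, pos_j, p=3):
    # General p-norm variant (the "n-norm" family in the abstract).
    return -np.linalg.norm(np.asarray(pos_i, dtype=float) - np.asarray(pos_j, dtype=float), ord=p)

def chebyshev_reward(pos_i, pos_j):
    # Chebyshev (infinity-norm) distance: penalizes the largest
    # per-coordinate gap between the two agents.
    return -np.linalg.norm(np.asarray(pos_i, dtype=float) - np.asarray(pos_j, dtype=float), ord=np.inf)
```

For agents at (0, 0) and (3, 4), the Euclidean reward is -5, the 1-norm reward is -7, and the Chebyshev reward is -4, showing how the choice of norm shapes the penalty.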

We examined whether modifying the malicious agent's neural network (NN) structure, together with a compatible combination of the mean squared error (MSE) loss function and the sigmoid activation function, can mitigate the destructive effects of the malicious agent on the performance of the leaderless, homogeneous MARL system.
In addition to the theoretical support, simulations confirmed the theoretical findings.
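The two ingredients of that loss/activation pairing can be written down directly. A minimal sketch, assuming standard definitions; how the thesis wires them into the malicious agent's NN is not reproduced here.

```python
import numpy as np

def sigmoid(x):
    # Sigmoid activation: output bounded in (0, 1), so saturation
    # limits how far a corrupted output can drift.
    return 1.0 / (1.0 + np.exp(-np.asarray(x, dtype=float)))

def mse_loss(prediction, target):
    # Mean squared error between the NN output and the target signal.
    prediction = np.asarray(prediction, dtype=float)
    target = np.asarray(target, dtype=float)
    return float(np.mean((prediction - target) ** 2))
```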

We studied the gradient-based adversarial attacks on cluster-based, heterogeneous MADRL systems with time-delayed data transmission using deep Q-network (DQN) algorithms.
We introduced two novel observation types, termed on-time and time-delay observations, corresponding to the cases where the data transmission channel is idle and the data is transmitted on time or with a delay.
By considering the distance between neighbouring agents, we presented a novel immediate reward function that appends a distance-based term to the previously utilized reward to improve the MADRL system's performance. We considered three types of gradient-based attacks to investigate the robustness of the proposed system's data transmission. Two defence methods were proposed to reduce the effects of these malicious attacks.
The theoretical results are illustrated and verified with simulation examples.
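Gradient-based attacks of the kind discussed perturb an input in the direction of the loss gradient. The following sketch shows the canonical sign-based variant; the function name and the assumption that the gradient is supplied externally are illustrative, not taken from the thesis.

```python
import numpy as np

def fgsm_perturb(observation, input_gradient, epsilon=0.01):
    # Sign-based gradient attack: shift the observation by epsilon in the
    # sign direction of the loss gradient w.r.t. the input, producing a
    # perturbation bounded by epsilon in each coordinate.
    observation = np.asarray(observation, dtype=float)
    gradient = np.asarray(input_gradient, dtype=float)
    return observation + epsilon * np.sign(gradient)
```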

We also investigated the robustness of data transmission between agents of a cluster-based, heterogeneous MADRL system under a gradient-based adversarial attack. We proposed an algorithm that combines a DQN approach with a proportional feedback controller to defend against the fast gradient sign method (FGSM) attack and to improve the DQN agent's performance.
Simulation results are included to verify the presented results.
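The proportional-feedback idea behind that defence can be sketched in a few lines. This is an assumed simplification: the function name, the gain value, and the use of a trusted reference estimate are illustrative; the thesis's controller design and how the reference is obtained are not reproduced here.

```python
import numpy as np

def proportional_defense(received_obs, reference_obs, k_p=0.5):
    # Proportional (P) feedback: pull the possibly-attacked received
    # observation toward a trusted reference estimate with gain k_p,
    # attenuating an additive adversarial perturbation.
    received_obs = np.asarray(received_obs, dtype=float)
    reference_obs = np.asarray(reference_obs, dtype=float)
    return received_obs + k_p * (reference_obs - received_obs)
```

With k_p = 0.5, a received value perturbed to 2.0 against a reference of 0.0 is corrected to 1.0, halving the attack's effect in one step.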

Divisions: Concordia University > Gina Cody School of Engineering and Computer Science > Electrical and Computer Engineering
Item Type: Thesis (PhD)
Authors: Elhami Fard, Neshat
Institution: Concordia University
Degree Name: Ph.D.
Program: Electrical and Computer Engineering
Date: 16 October 2022
Thesis Supervisor(s): Selmic, Rastko
ID Code: 991525
Deposited By: Neshat Elhami Fard
Deposited On: 21 Jun 2023 14:43
Last Modified: 21 Jun 2023 14:43
All items in Spectrum are protected by copyright, with all rights reserved. The use of items is governed by Spectrum's terms of access.

