Sadhukhan, Priyam (2021) Proximal Policy Optimization for Formation Control and Obstacle Avoidance in Multi-Agent Systems. Masters thesis, Concordia University.
Text (application/pdf): Sadhukhan_MASc_S2022.pdf (3MB), Accepted Version
Abstract
In this thesis, deep reinforcement learning (DRL) is used for intelligent formation control and obstacle avoidance in multi-agent systems through reward shaping. The objective of this work is to study the application of the proximal policy optimization (PPO) algorithm to maneuvering a formation of agents around obstacles. Each agent in the multi-agent system is modeled as a holonomic second-order integrator, and the formation is allowed to shrink while maintaining its shape in order to navigate around obstacles and bring the geometric centroid of the formation to the goal. Both angle-based and bearing-based rewards were investigated. Experiments were carried out in a two-dimensional simulation environment with different numbers of agents and multiple obstacles between the formation and the goal. Curriculum learning was used to train the agents in environments with different initializations for the agents, goal, and obstacles. Simulation results show the effectiveness of the different reward schemes.
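The abstract does not give the thesis's equations or reward definitions, so the following is only an illustrative sketch of two ideas it mentions: a holonomic second-order (double) integrator agent model, and a formation that shrinks while keeping its shape. It assumes an explicit Euler step and a uniform scaling about the centroid. All function names (`step`, `centroid`, `bearings`, `scale_about_centroid`) are hypothetical and not taken from the thesis. The sketch shows that inter-agent bearings (unit direction vectors) and the centroid are unchanged by such a scaling, which is one plausible reason bearing-based quantities suit a formation that is allowed to shrink.

```python
import math

def step(pos, vel, acc, dt=0.1):
    """One explicit Euler step of 2-D double-integrator agents (assumed discretization)."""
    new_pos = [(x + dt * vx, y + dt * vy) for (x, y), (vx, vy) in zip(pos, vel)]
    new_vel = [(vx + dt * ax, vy + dt * ay) for (vx, vy), (ax, ay) in zip(vel, acc)]
    return new_pos, new_vel

def centroid(pos):
    """Geometric centroid of the agent positions."""
    n = len(pos)
    return (sum(x for x, _ in pos) / n, sum(y for _, y in pos) / n)

def bearings(pos):
    """Unit vectors from each agent i to every other agent j."""
    out = []
    for i, (xi, yi) in enumerate(pos):
        for j, (xj, yj) in enumerate(pos):
            if i != j:
                dx, dy = xj - xi, yj - yi
                d = math.hypot(dx, dy)
                out.append((dx / d, dy / d))
    return out

def scale_about_centroid(pos, s):
    """Shrink (s < 1) or grow (s > 1) the formation uniformly about its centroid."""
    cx, cy = centroid(pos)
    return [(cx + s * (x - cx), cy + s * (y - cy)) for x, y in pos]

# Hypothetical three-agent triangular formation, shrunk to half size.
triangle = [(0.0, 0.0), (2.0, 0.0), (1.0, 2.0)]
shrunk = scale_about_centroid(triangle, 0.5)
```

Comparing `bearings(triangle)` with `bearings(shrunk)` gives the same unit vectors up to floating-point error, while the inter-agent distances are halved. A reward built on bearings would therefore not penalize the shrinking itself, whereas a distance-based one would.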
Divisions: | Concordia University > Gina Cody School of Engineering and Computer Science > Electrical and Computer Engineering |
---|---|
Item Type: | Thesis (Masters) |
Authors: | Sadhukhan, Priyam |
Institution: | Concordia University |
Degree Name: | M.A. Sc. |
Program: | Electrical and Computer Engineering |
Date: | 14 October 2021 |
Thesis Supervisor(s): | Selmic, Rastko |
ID Code: | 990080 |
Deposited By: | Priyam Sadhukhan |
Deposited On: | 16 Jun 2022 15:08 |
Last Modified: | 16 Jun 2022 15:08 |