Deep Reinforcement Learning-based Automated Penetration Testing for Active Distribution Networks

Li, Yuanliang (2024) Deep Reinforcement Learning-based Automated Penetration Testing for Active Distribution Networks. PhD thesis, Concordia University.

Text (application/pdf): Li_PhD_S2025.pdf - Accepted Version (9MB)
Available under License Spectrum Terms of Access.

Abstract

The smart grid is a highly complex cyber-physical system of heterogeneous components that perform sensing, control, computation, and communication. Due to its complexity, dimensionality, uncertainty, and strong cyber-physical coupling, manually identifying critical vulnerabilities to cyberattacks at the infrastructure level has proven challenging. In the information and communication technology (ICT) industry, penetration testing (PT) has demonstrated its efficacy in pinpointing vulnerabilities within information systems through authorized cyberattacks. Building upon the principles of PT, this study explores effective and efficient PT approaches, based on deep reinforcement learning (DRL) methods, for discovering vulnerabilities in the active distribution networks (ADNs) of smart grids.

Conventional PT identifies vulnerabilities in an ADN inefficiently and incompletely because of the ADN's complex structure and strong cyber-physical coupling. To overcome this, we first propose a DRL-based PT framework and formulate PT as a Markov decision process (MDP) specifically for the industrial control networks of the ADN. This framework comprehensively considers cyber-physical coupling, realistic cyberattacks, and the physical impacts on the ADN. As a case study, the framework is applied to model a replay attack scheme on an ADN, with the aim of identifying critical attack paths that lead to system voltage violations. Additionally, a co-simulation platform named GridBattleSim was developed specifically for DRL-based PT on ADNs, integrating dedicated simulators for different parts of the ADN. The simulation results show the efficacy of DRL-based PT in learning optimal attack paths under varying system conditions and different levels of attack difficulty.
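
To make the idea of "PT as an MDP" concrete, the following is a minimal sketch and not the formulation used in the thesis. It assumes a tiny, hypothetical attack graph whose node names, action names, step costs, and terminal reward are all invented for illustration, and it swaps in plain tabular value iteration for DRL so that the example stays deterministic and short.

```python
# Hypothetical attack graph: state -> {action: (next_state, reward)}.
# All names and numbers here are illustrative assumptions, not thesis data.
TRANSITIONS = {
    "foothold": {"scan": ("hmi", -1.0), "phish": ("engineering_ws", -2.0)},
    "hmi": {"pivot": ("rtu", -1.0)},
    "engineering_ws": {"pivot": ("rtu", -3.0)},
    "rtu": {"replay": ("voltage_violation", 10.0)},
    "voltage_violation": {},  # terminal state: no further actions
}


def value_iteration(transitions, gamma=0.9, sweeps=50):
    """Estimate the optimal value of each state by repeated Bellman backups."""
    values = {state: 0.0 for state in transitions}
    for _ in range(sweeps):
        for state, actions in transitions.items():
            if actions:  # terminal states keep a value of 0
                values[state] = max(
                    reward + gamma * values[nxt]
                    for nxt, reward in actions.values()
                )
    return values


def greedy_path(transitions, values, start, gamma=0.9):
    """Follow the highest-valued action from `start` until a terminal state."""
    path, state = [], start
    while transitions[state]:
        action, (nxt, _) = max(
            transitions[state].items(),
            key=lambda item: item[1][1] + gamma * values[item[1][0]],
        )
        path.append((state, action))
        state = nxt
    return path


values = value_iteration(TRANSITIONS)
attack_path = greedy_path(TRANSITIONS, values, "foothold")
```

In this toy graph the greedy policy reaches the voltage violation through the cheaper of the two routes. A DRL agent would replace the lookup table with a neural network and learn from simulated interaction rather than from a known transition table.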

To overcome the limited observability in practical PT scenarios, a partially observable Markov decision process (POMDP) formulation is proposed, which allows the PT agent to learn PT policies under partially observable conditions. To solve the POMDP and obtain the optimal PT policy, we apply the physical model of the ADN to estimate its full state from the locally observable data captured by the PT agent, and then transform the POMDP into an MDP that can be solved by DRL.

Furthermore, to address the sparse reward issue, improve the generalization of reward functions, and improve the interpretability of DRL-based PT, a knowledge-informed AutoPT framework (RM-PT) is proposed, which incorporates cybersecurity domain knowledge by means of reward machines (RMs). We use the lateral movement of PT on ADNs as a case study, in which two RMs are designed from the MITRE ATT&CK knowledge base to serve as two PT guidelines. The deep Q-learning with RM (DQRM) algorithm is then applied to train the PT policies. The proposed RM-PT is evaluated on the CyberBattleSim platform. The experimental results show that knowledge-informed PT trains more efficiently than PT without knowledge embedding. Furthermore, RMs that incorporate more detailed domain knowledge exhibit superior PT performance compared to RMs with simpler knowledge.
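
As a rough illustration of how a reward machine can encode a PT guideline, here is a minimal sketch. It is not one of the two RMs designed in the thesis: the state names (`u0` to `u3`), the event labels (borrowed loosely from ATT&CK tactic names), and the reward values are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class RewardMachine:
    """A finite-state machine that emits rewards when high-level events occur."""
    transitions: dict  # (rm_state, event) -> (next_rm_state, reward)
    initial: str
    terminal: set
    state: str = field(init=False)

    def __post_init__(self):
        self.state = self.initial

    def step(self, event):
        """Advance on `event`; unknown events leave the state unchanged."""
        next_state, reward = self.transitions.get(
            (self.state, event), (self.state, 0.0)
        )
        self.state = next_state
        return reward

    def done(self):
        return self.state in self.terminal


# Hypothetical guideline: discover hosts, obtain credentials, then move laterally.
guideline = RewardMachine(
    transitions={
        ("u0", "discovery"): ("u1", 1.0),
        ("u1", "credential_access"): ("u2", 1.0),
        ("u2", "lateral_movement"): ("u3", 10.0),
    },
    initial="u0",
    terminal={"u3"},
)

# Events an agent might trigger, in order; the first one is out of sequence.
events = ["lateral_movement", "discovery", "credential_access", "lateral_movement"]
rewards = [guideline.step(event) for event in events]
```

The intermediate rewards give the agent feedback before the final goal is reached, which is the intuition behind using RMs against sparse rewards. In a Q-learning setup with an RM, the current RM state is typically supplied to the agent together with the environment observation.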

Finally, we also discuss the future directions of this study in terms of domain knowledge integration for AI-powered PT. We anticipate that the methodologies and findings presented in this study can inspire efforts in securing critical infrastructure and closing research gaps for the cybersecurity of smart grids.

Divisions:	Concordia University > Gina Cody School of Engineering and Computer Science > Concordia Institute for Information Systems Engineering
Item Type:	Thesis (PhD)
Authors:	Li, Yuanliang
Institution:	Concordia University
Degree Name:	Ph. D.
Program:	Information and Systems Engineering
Date:	11 September 2024
Thesis Supervisor(s):	Yan, Jun
Keywords:	penetration testing; smart grid; reinforcement learning; human knowledge integration
ID Code:	994747
Deposited By:	Yuanliang Li
Deposited On:	17 Jun 2025 14:21
Last Modified:	17 Jun 2025 14:21
