An XAI-based Framework for Software Vulnerability Contributing Factors Assessment

Li, Ding (2023) An XAI-based Framework for Software Vulnerability Contributing Factors Assessment. Masters thesis, Concordia University.

Text (application/pdf)
Li_MA_S2024.pdf - Accepted Version
Available under License Spectrum Terms of Access.
11MB

Abstract

Software vulnerability detection plays a proactive role in reducing risks to software security and reliability. Despite advancements in deep learning-based detection, a semantic gap persists between model-learned features and human-interpretable vulnerability semantics. The challenge lies in the absence of a systematic approach to assessing feature importance that can explain the relationship between the two. Explainable Artificial Intelligence (XAI) techniques are therefore indispensable for explaining the features learned by AI models, which makes them directly applicable to software vulnerability detection.
This research introduces an XAI-based framework to systematically evaluate XAI techniques and apply them for assessing the contributing factors of feature representations in classifying software code into Common Weakness Enumeration (CWE) types. The focus is on applying XAI methods to examine the importance of features underlying vulnerability detection. An additional challenge arises from the lack of a systematic evaluation to ensure consistent explanation results during the selection of state-of-the-art XAI methods.
To address this, this thesis defines three evaluation metrics for XAI: consistency, stability, and efficiency. A novel XAI method, named Mean-Centroid PredDiff, is introduced to strike a balance among these three metrics, significantly enhancing the framework’s efficacy. This method and SHAP are integrated into the framework based on their strong performance in the evaluation across three domain case studies.
Findings from this work reveal that the proposed framework enables the summarization of the importance of 40 syntactic constructs and the similarities among 20 CWEs based on graph-embedded semantic features. The study results align closely with expert knowledge from the CWE community, achieving approximately 77.8% Top1, 89% Top5 similarity hit rates and mean average precision of 0.70 in CWE classification. The study validates the significance of attention values of transformer-based models in representing the importance of code tokens.
Overall, this thesis contributes a new XAI method to the open-source community, achieving a trade-off of efficiency with consistency and stability. In addition, the XAI-based framework successfully assesses the importance of nine meta syntactic constructs across 20 CWE types and evaluates their similarity. The dataset and the framework code have been made publicly available on GitHub.
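
The abstract reports Top1 and Top5 similarity hit rates and a mean average precision without stating how they are computed. The sketch below shows the conventional definitions of these ranking metrics, assuming each query CWE yields a ranked list of candidate CWEs that is compared against an expert-provided set of related CWEs. The function names, the toy rankings, and the choice of CWE identifiers are illustrative assumptions and are not taken from the thesis; the thesis may define the metrics differently.

```python
from typing import Sequence, Set


def hit_rate_at_k(ranked: Sequence[Sequence[str]],
                  relevant: Sequence[Set[str]], k: int) -> float:
    """Fraction of queries with at least one relevant item in the top k."""
    hits = sum(
        1 for ranking, rel in zip(ranked, relevant)
        if any(item in rel for item in ranking[:k])
    )
    return hits / len(ranked)


def average_precision(ranking: Sequence[str], rel: Set[str]) -> float:
    """Average of precision@i over the positions i holding a relevant item.

    Normalized by the number of relevant items, so relevant items that
    never appear in the ranking contribute zero.
    """
    if not rel:
        return 0.0
    found = 0
    total = 0.0
    for i, item in enumerate(ranking, start=1):
        if item in rel:
            found += 1
            total += found / i
    return total / len(rel)


def mean_average_precision(ranked: Sequence[Sequence[str]],
                           relevant: Sequence[Set[str]]) -> float:
    """Mean of the per-query average precision."""
    scores = [average_precision(r, rel) for r, rel in zip(ranked, relevant)]
    return sum(scores) / len(scores)


# Toy data (hypothetical, for illustration only): two queries, each with a
# ranked list of candidate CWEs and the set treated as ground truth.
RANKED = [
    ["CWE-121", "CWE-122", "CWE-78"],
    ["CWE-78", "CWE-190", "CWE-191"],
]
RELEVANT = [{"CWE-121"}, {"CWE-191"}]

if __name__ == "__main__":
    print("Hit@1:", hit_rate_at_k(RANKED, RELEVANT, 1))
    print("Hit@3:", hit_rate_at_k(RANKED, RELEVANT, 3))
    print("MAP:  ", round(mean_average_precision(RANKED, RELEVANT), 3))
```

In this toy case the first query places its relevant CWE at rank 1 and the second at rank 3, so the Top1 hit rate is 0.5, the Top3 hit rate is 1.0, and the mean average precision is (1 + 1/3) / 2.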

Divisions: Concordia University > Gina Cody School of Engineering and Computer Science > Electrical and Computer Engineering
Item Type: Thesis (Masters)
Authors: Li, Ding
Institution: Concordia University
Degree Name: M.A. Sc.
Program: Electrical and Computer Engineering
Date: 27 September 2023
Thesis Supervisor(s): Liu, Yan
ID Code: 993035
Deposited By: Ding Li
Deposited On: 05 Jun 2024 15:19
Last Modified: 05 Jun 2024 15:19
All items in Spectrum are protected by copyright, with all rights reserved. The use of items is governed by Spectrum's terms of access.
