Heba, Aburish (2023) An Industrial Study on Predicting Crash Report Log Types Using Large Language Models. Masters thesis, Concordia University.
Text (application/pdf): Aburish_MA_F2023.pdf (946kB) - Accepted Version. Available under License Spectrum Terms of Access.
Abstract
Software crashes and failures take a fair amount of effort and time to resolve. Software developers
use information submitted in crash reports (CRs) to conduct root cause analysis of faults. The
problem is that CRs often do not contain all the required information. Automatic prediction of CR fields can
therefore reduce the crash resolution process time. In this thesis, we use CR headings and
descriptions to predict the types of log files that should be attached to a CR. Our approach trains
multilabel learning models on a dataset from Ericsson’s CR database. We use three
different pre-trained language models, BERT, Telecom BERT, and Word2Vec, to extract feature
vectors from CR headings and descriptions and then feed these vectors to three different multilabel
learning algorithms, namely Binary Relevance (BR), Classifier Chain (CC), and Neural Network
(NN). Then, we compare the performance of different feature sets. We found that the use of
headings alone with the pre-trained language models BERT and Telecom BERT results in the best average
AUC (0.70). Using descriptions alone, or headings and descriptions together, as features resulted in
an average AUC ranging from 0.65 to 0.70. In general, the algorithms showed no significant
difference in performance, but the choice of features did affect it. Also, the
performance of predicting each type of log is influenced by the use of keywords in headings and
descriptions that describe these files. We found that log types with a clear definition such as Key
Performance Indicators (KPI) Logs, Post-mortem Dumps (PMD), and execution traces can be
predicted with higher accuracy.
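
The comparison described in the abstract (feature vectors fed to Binary Relevance, Classifier Chain, and Neural Network learners, scored by average AUC) can be illustrated with a minimal scikit-learn sketch. This is not the thesis implementation. The synthetic features stand in for the embedding vectors, and the logistic-regression base learner, the network size, the train/test split, and the function name `compare_multilabel_models` are all assumptions made for illustration.

```python
from sklearn.datasets import make_multilabel_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsRestClassifier
from sklearn.multioutput import ClassifierChain
from sklearn.neural_network import MLPClassifier


def compare_multilabel_models(X, Y, seed=0):
    """Fit BR, CC, and NN on (X, Y) and return the macro-averaged AUC of each.

    X: (n_samples, n_features) feature vectors (hypothetical stand-in for
       embeddings of CR headings/descriptions).
    Y: (n_samples, n_labels) binary indicator matrix, one column per log type.
    """
    X_train, X_test, Y_train, Y_test = train_test_split(
        X, Y, test_size=0.3, random_state=seed
    )
    models = {
        # Binary Relevance: one independent binary classifier per label.
        "BR": OneVsRestClassifier(LogisticRegression(max_iter=1000)),
        # Classifier Chain: each classifier also sees the earlier labels.
        "CC": ClassifierChain(LogisticRegression(max_iter=1000)),
        # Neural Network: a single model with one output per label.
        "NN": MLPClassifier(hidden_layer_sizes=(64,), max_iter=500,
                            random_state=seed),
    }
    scores = {}
    for name, model in models.items():
        model.fit(X_train, Y_train)
        proba = model.predict_proba(X_test)  # shape: (n_samples, n_labels)
        # Average the per-label AUC over all labels.
        scores[name] = roc_auc_score(Y_test, proba, average="macro")
    return scores


# Synthetic data only; real inputs would be embedding vectors and log-type labels.
X, Y = make_multilabel_classification(
    n_samples=600, n_features=32, n_classes=4, n_labels=2, random_state=0
)
scores = compare_multilabel_models(X, Y)
for name, auc in scores.items():
    print(f"{name}: average AUC = {auc:.2f}")
```

Running the same function once per feature set (headings only, descriptions only, both) would give the kind of feature-set comparison the abstract reports, with per-label AUC values showing which log types are easier to predict.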
Divisions: | Concordia University > Gina Cody School of Engineering and Computer Science > Electrical and Computer Engineering |
---|---|
Item Type: | Thesis (Masters) |
Authors: | Heba, Aburish |
Institution: | Concordia University |
Degree Name: | M.A.Sc. |
Program: | Electrical and Computer Engineering |
Date: | 25 August 2023 |
Thesis Supervisor(s): | Abdelwahab, Hamou-Lhadj |
ID Code: | 992691 |
Deposited By: | Heba Abu-Rish |
Deposited On: | 15 Nov 2023 15:20 |
Last Modified: | 15 Nov 2023 15:20 |