Thomas, Jerry George (2019) Anomaly Detection from Textual Content in Financial Records. Masters thesis, Concordia University.
Text (application/pdf): Thomas__MCompSc_F2019.pdf (9MB), Accepted Version. Restricted to registered users only. Available under License Spectrum Terms of Access.
Abstract
Most financial institutions rely mainly on numerical statistics to detect anomalous (malpractice) activities. The textual content in financial records, however, contains valuable information that to date has not been used effectively to detect anomalous user behavior. One reason could be that the text elements in these records are often unintelligible and cluttered with abbreviations, numbers, and symbols, which makes it difficult to build a system that can coherently understand them and draw conclusions. Rule-based techniques have been proposed, but such systems are easy to elude because they are difficult to generalize and do not scale up. In this thesis, we address the problem of detecting anomalous activity using the textual content in financial records. Given the low intelligibility and clutter in such data, we treat this as a classification problem and explore various deep learning based solutions. Specifically, we propose four solutions. In the first technique, we treat all financial records of a user as a single document represented as a set of words, and use a deep learning classification network to distinguish between normal and anomalous behavior. In the second technique, we treat these financial records as a time series and use a sequence-based deep learning network for classification. In the third technique, we propose a simple convolutional neural network that learns the behavior from the sequential textual data. In the fourth technique, we use the transfer learning method Universal Language Model Fine-Tuning (ULMFiT), which uses language modelling for unsupervised pre-training followed by a supervised fine-tuning step. The results of our experiments using real data convincingly indicate that use of the textual content in financial records yields greater accuracy in anomalous behavior detection. They also suggest that deep learning is a viable and effective approach for real-time anomaly detection by financial institutions.
To the best of our knowledge, this is the first attempt to use deep learning solutions for addressing this problem.
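As a rough illustration of the input representation described for the first technique (all records of a user pooled into one document and represented as a set of words), the following minimal Python sketch may help. It is not taken from the thesis: the function names, the tokenization rule, and the sample records are all hypothetical, and the deep learning classifier that would consume these vectors is not shown.

```python
import re

def tokenize(record):
    """Lowercase a record and split it into alphanumeric tokens (assumed rule)."""
    return re.findall(r"[a-z0-9]+", record.lower())

def user_document(records):
    """Pool all of one user's records into a single set of words."""
    words = set()
    for record in records:
        words.update(tokenize(record))
    return words

def build_vocabulary(documents):
    """Map every word seen in any document to a fixed column index."""
    vocab = sorted(set().union(*documents))
    return {word: i for i, word in enumerate(vocab)}

def to_vector(document, vocabulary):
    """Encode a set of words as a binary presence vector."""
    vector = [0] * len(vocabulary)
    for word in document:
        if word in vocabulary:
            vector[vocabulary[word]] = 1
    return vector

# Made-up records for two hypothetical users (not from the thesis data).
records_by_user = {
    "user_a": ["PMT REF#4471 INV-203", "TRF ACCT 9921 PMT"],
    "user_b": ["CASH WDL ATM 0032", "TRF ACCT 9921"],
}
documents = {u: user_document(r) for u, r in records_by_user.items()}
vocabulary = build_vocabulary(documents.values())
vectors = {u: to_vector(d, vocabulary) for u, d in documents.items()}
```

Each user ends up with one fixed-length binary vector, which is the kind of input a classification network could be trained on to separate normal from anomalous users. The sketch deliberately ignores word order, which is what the sequence-based and convolutional techniques in the abstract would additionally exploit.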
Divisions: | Concordia University > Gina Cody School of Engineering and Computer Science |
---|---|
Item Type: | Thesis (Masters) |
Authors: | Thomas, Jerry George |
Institution: | Concordia University |
Degree Name: | M. Comp. Sc. |
Program: | Computer Science |
Date: | 9 September 2019 |
Thesis Supervisor(s): | Shiri, Nematollaah and Mudur, Sudhir |
ID Code: | 985898 |
Deposited By: | Jerry George Thomas |
Deposited On: | 06 Feb 2020 02:48 |
Last Modified: | 06 Feb 2020 02:48 |