Anomaly Detection from Textual Content in Financial Records

Thomas, Jerry George (2019) Anomaly Detection from Textual Content in Financial Records. Masters thesis, Concordia University.

Text (application/pdf)
Thomas__MCompSc_F2019.pdf - Accepted Version
Restricted to Registered users only
Available under License Spectrum Terms of Access.

9MB

Abstract

Most financial institutions rely mainly on numerical statistics to detect anomalous (malpractice) activities. The textual content in financial records, however, contains valuable information that to date has not been used effectively to detect anomalous user behavior. One reason could be that the text elements in these records are often unintelligible and cluttered with abbreviations, numbers, and symbols, which makes it difficult to build a system that can coherently interpret them and draw conclusions. Rule-based techniques have been proposed, but such systems are easy to elude, as they are difficult to generalize and do not scale up. In this thesis, we address the problem of detecting anomalous activity using the textual content in financial records. Given the low intelligibility and clutter in such data, we treat this as a classification problem and explore various deep learning based solutions. Specifically, we propose four techniques. In the first, we treat all financial records of a user as a single document represented as a set of words, and use a deep learning classification network to distinguish between normal and anomalous behavior. In the second, we treat these financial records as a time series and use a sequence-based deep learning network for classification. In the third, we propose a simple convolutional neural network that learns the behavior from the sequential textual data. In the fourth, we use the transfer learning method Universal Language Model Fine-Tuning (ULMFiT), which performs unsupervised pre-training through language modelling, followed by a supervised fine-tuning step. The results of our experiments using real data convincingly indicate that using the textual content in financial records yields greater accuracy in detecting anomalous behavior. They also suggest that deep learning is a viable and effective approach to real-time anomaly detection by financial institutions. To the best of our knowledge, this is the first attempt to use deep learning solutions to address this problem.
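
As a rough illustration of the first technique's input representation, the sketch below merges all records of one user into a single document of word counts. It is not taken from the thesis: the sample records, the tokenizer, and the helper names (tokenize, user_document, to_vector) are hypothetical, and the use of counts rather than a plain set of words is an assumption. The classification network itself is omitted.

```python
import re
from collections import Counter

# Hypothetical free-text fields from one user's records (not thesis data).
records = [
    "PMT REF#4471 ACME CORP INV 2231",
    "TRF TO ACCT **9921 ACME CORP",
    "ATM WDL 0412 MAIN ST",
]

def tokenize(text):
    """Lowercase the text and split on anything that is not a letter or digit."""
    return [tok for tok in re.split(r"[^a-z0-9]+", text.lower()) if tok]

def user_document(records):
    """Merge all records of one user into a single bag of words (token -> count)."""
    bag = Counter()
    for record in records:
        bag.update(tokenize(record))
    return bag

def to_vector(bag, vocabulary):
    """Lay the counts out in a fixed vocabulary order; unknown tokens are dropped."""
    return [bag.get(token, 0) for token in vocabulary]

bag = user_document(records)
vocabulary = sorted(bag)             # assumed: vocabulary built from this user only
vector = to_vector(bag, vocabulary)  # fixed-length input for a classifier
```

In practice the vocabulary would be built over all users so that every user maps to a vector of the same length, which a classifier could then label as normal or anomalous.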

Divisions:	Concordia University > Gina Cody School of Engineering and Computer Science
Item Type:	Thesis (Masters)
Authors:	Thomas, Jerry George
Institution:	Concordia University
Degree Name:	M. Comp. Sc.
Program:	Computer Science
Date:	9 September 2019
Thesis Supervisor(s):	Shiri, Nematollaah and Mudur, Sudhir
ID Code:	985898
Deposited By:	Jerry George Thomas
Deposited On:	06 Feb 2020 02:48
Last Modified:	06 Feb 2020 02:48

Spectrum Research Repository