Ebrahimi, Mohammadreza (2016) Automatic Identification of Online Predators in Chat Logs by Anomaly Detection and Deep Learning. Masters thesis, Concordia University.
Text (application/pdf): Ebrahimi_MSc_S2016.pdf (4MB) - Accepted Version. Available under License Spectrum Terms of Access.
Abstract
Providing a safe environment for juveniles and children in online social networks is considered a major factor in improving public safety. Due to the prevalence of online conversations, mitigating the undesirable effects of juvenile abuse in cyberspace has become imperative. Addressing this kind of crime automatically is challenging and demands efficient and scalable data mining techniques. The problem can be cast as a combination of textual preprocessing in data/text mining and binary classification in machine learning. This thesis proposes two machine learning approaches to deal with the following two issues in the domain of online predator identification: 1) The first issue is that gathering a comprehensive set of negative training samples is unrealistic due to the nature of the problem. This issue is addressed by applying an existing method for semi-supervised anomaly detection that allows training on samples from only one class. The method was tested on two datasets; 2) The second issue is improving the performance of current binary classification methods in terms of classification accuracy and F1-score. In this regard, we have customized a deep learning approach, the Convolutional Neural Network, for use in this domain. Using this approach, we show that the classification performance (F1-score) is improved by almost 1.7% compared to the Support Vector Machine classification method. Two different datasets were used in the empirical experiments: PAN-2012 and SQ (Sûreté du Québec). The former is a large public dataset that has been used extensively in the literature, and the latter is a small dataset collected from the Sûreté du Québec.
Divisions: | Concordia University > Gina Cody School of Engineering and Computer Science > Computer Science and Software Engineering |
---|---|
Item Type: | Thesis (Masters) |
Authors: | Ebrahimi, Mohammadreza |
Institution: | Concordia University |
Degree Name: | M. Sc. |
Program: | Computer Science |
Date: | 14 April 2016 |
Thesis Supervisor(s): | Suen, Ching Y. and Ormandjieva, Olga |
ID Code: | 981404 |
Deposited By: | MOHAMMADREZA EBRAHIMI |
Deposited On: | 26 Aug 2016 12:48 |
Last Modified: | 18 Jan 2018 17:53 |