Mohd Abul Basher, Abdur Rahman (2011) Mining Chat Logs to Extract Information about Authors and Topics for Crime Investigation. Masters thesis, Concordia University.
- Accepted Version
Cybercriminals have been using the Internet to carry out illegitimate activities and to execute catastrophic attacks. Computer-mediated communication, such as online chat, provides an anonymous channel for predators to exploit victims. To prosecute criminals in a court of law, an investigator often needs to extract evidence from a large volume of chat messages. Most existing search tools are keyword-based and rely on search terms provided by the investigator, so the quality of the retrieved results depends on those terms. Because of the large volume of chat messages and the large number of participants in public chat rooms, the process is usually time-consuming and error-prone. This thesis presents a topic search model that analyzes archives of chat logs to separate crime-relevant logs from the rest. Specifically, we propose an extension of the Latent Dirichlet Allocation (LDA) model to extract topics, compute the contribution of authors to these topics, and study the transitions of these topics over time. In addition, we present a second model for characterizing author-topic relationships over time. This is crucial for investigation because it provides a view of the authors' involvement in particular topics. Experiments on two real-life datasets suggest that the proposed approach can discover hidden criminal topics and the distribution of authors over these topics.
|Divisions:||Concordia University > Faculty of Engineering and Computer Science > Concordia Institute for Information Systems Engineering|
|Item Type:||Thesis (Masters)|
|Authors:||Mohd Abul Basher, Abdur Rahman|
|Degree Name:||M.A.Sc.|
|Program:||Information Systems Security|
|Date:||22 August 2011|
|Thesis Supervisor(s):||Fung, Benjamin|
|Deposited By:||ABDURRAHMAN ABDUL JALIL|
|Deposited On:||21 Nov 2011 16:57|
|Last Modified:||14 Jan 2016 19:10|