Iqbal, Farkhund, Binsalleeh, Hamad, Fung, Benjamin C.M. and Debbabi, Mourad (2010) Mining writeprints from anonymous e-mails for forensic investigation. Digital Investigation, 7 (1-2). pp. 56-64. ISSN 1742-2876
Text (application/pdf): 2010_Mining_Writeprints_from_Anonymous_E-mails.pdf (629kB)
Official URL: http://dx.doi.org/10.1016/j.diin.2010.03.003
Abstract
Many criminals exploit the convenience of anonymity in the cyber world to conduct illegal activities. E-mail is the most commonly used medium for such activities. Extracting knowledge and information from e-mail text has become an important step in cybercrime investigation and evidence collection. Yet it is one of the most challenging and time-consuming tasks, owing to the special characteristics of e-mail datasets. In this paper, we focus on the problem of mining writing styles from a collection of e-mails written by multiple anonymous authors. The general idea is to first cluster the anonymous e-mails by their stylometric features and then extract the writeprint, i.e., the unique writing style, from each cluster. We emphasize that the presented problem, together with our proposed solution, is different from the traditional problem of authorship identification, which assumes that training data is available for building a classifier. Our proposed method is particularly useful in the initial stage of an investigation, in which the investigator usually has very little information about the case or the true authors of the suspicious e-mail collection. Experiments on a real-life dataset suggest that clustering by writing style is a promising approach for grouping e-mails written by the same author.
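The two-step idea in the abstract (cluster by stylometric features, then summarize each cluster) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the small feature set, the `FUNCTION_WORDS` list, the plain k-means routine with a deterministic start, and the use of a cluster's mean feature vector as a crude stand-in for its writeprint. The abstract does not say which features or clustering algorithms the authors used, nor how they extract a writeprint.

```python
import math
import re

# Hypothetical function-word list; the abstract does not name the features used.
FUNCTION_WORDS = ("the", "of", "and", "to", "i")


def stylometric_features(text):
    """Map one e-mail body to a small, illustrative stylometric vector."""
    words = re.findall(r"[A-Za-z']+", text.lower())
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    n_words = max(len(words), 1)
    features = [
        sum(len(w) for w in words) / n_words,                  # mean word length
        len(words) / max(len(sentences), 1),                   # mean words per sentence
        sum(c in ",;:!?" for c in text) / max(len(text), 1),   # punctuation ratio
    ]
    # Relative frequency of each function word.
    features += [words.count(fw) / n_words for fw in FUNCTION_WORDS]
    return features


def kmeans(vectors, k, iterations=20):
    """Plain k-means; the first k vectors seed the centroids (an assumption
    made so the sketch is deterministic). Features are not rescaled here."""
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: nearest centroid by Euclidean distance.
        labels = [
            min(range(k), key=lambda c: math.dist(v, centroids[c]))
            for v in vectors
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def cluster_emails(emails, k):
    """Step 1: cluster e-mails by style. Step 2: summarize each cluster.
    The centroid is used as a stand-in for a writeprint (assumption)."""
    vectors = [stylometric_features(e) for e in emails]
    labels, centroids = kmeans(vectors, k)
    writeprints = {c: centroids[c] for c in set(labels)}
    return labels, writeprints
```

Calling `cluster_emails(emails, k)` on a list of message bodies returns one cluster label per message plus a dictionary of per-cluster summary vectors. In practice the number of clusters `k` is unknown in an investigation, and features on different scales would normally be standardized before clustering; both points are left out to keep the sketch short.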
Divisions: | Concordia University > Gina Cody School of Engineering and Computer Science > Concordia Institute for Information Systems Engineering |
---|---|
Item Type: | Article |
Refereed: | Yes |
Authors: | Iqbal, Farkhund and Binsalleeh, Hamad and Fung, Benjamin C.M. and Debbabi, Mourad |
Journal or Publication: | Digital Investigation |
Date: | 2010 |
Digital Object Identifier (DOI): | 10.1016/j.diin.2010.03.003 |
Keywords: | e-mail, writing styles, writeprint, forensic investigation, clustering, classification, stylometric features, authorship analysis |
ID Code: | 36253 |
Deposited By: | DAVID MACAULAY |
Deposited On: | 22 Dec 2011 19:04 |
Last Modified: | 18 Jan 2018 17:36 |