A unified data mining solution for authorship analysis in anonymous textual communications


Iqbal, Farkhund and Binsalleeh, Hamad and Fung, Benjamin C.M. and Debbabi, Mourad (2013) A unified data mining solution for authorship analysis in anonymous textual communications. Information Sciences, 231. pp. 98-112. ISSN 0020-0255

Official URL: http://dx.doi.org/10.1016/j.ins.2011.03.006


The cyber world provides an anonymous environment for criminals to conduct malicious activities such as spamming, sending ransom e-mails, and spreading botnet malware. Often, these activities involve textual communication between a criminal and a victim, or between criminals themselves. The forensic analysis of online textual documents to address the anonymity problem, called authorship analysis, is the focus of most cybercrime investigations. Authorship analysis is the statistical study of the linguistic and computational characteristics of documents written by individuals. This paper is the first work to present a unified data mining solution that addresses authorship analysis problems based on the concept of a frequent pattern-based writeprint. Extensive experiments on real-life data suggest that our proposed solution can precisely capture the writing styles of individuals. Furthermore, the writeprint is effective in identifying the author of an anonymous text from a group of suspects and in inferring sociolinguistic characteristics of the author.

Divisions: Concordia University > Faculty of Engineering and Computer Science > Concordia Institute for Information Systems Engineering
Item Type: Article
Authors: Iqbal, Farkhund and Binsalleeh, Hamad and Fung, Benjamin C.M. and Debbabi, Mourad
Journal or Publication: Information Sciences
Digital Object Identifier (DOI): 10.1016/j.ins.2011.03.006
Keywords: Authorship identification; Authorship characterization; Stylometric features; Writeprint; Frequent patterns; Cyber forensics
ID Code: 976945
Deposited On: 08 Mar 2013 16:11
Last Modified: 05 Nov 2016 04:47


