
A novel approach of mining write-prints for authorship attribution in e-mail forensics



Iqbal, F., Hadjidj, R., Fung, Benjamin C.M. and Debbabi, Mourad (2008) A novel approach of mining write-prints for authorship attribution in e-mail forensics. Digital Investigation, 5. S42-S51. ISSN 1742-2876


Official URL: http://dx.doi.org/10.1016/j.diin.2008.05.001


There is an alarming increase in the number of cyber-crime incidents committed through anonymous e-mails. The problem of e-mail authorship attribution is to identify the most plausible author of an anonymous e-mail from a group of potential suspects. Most previous contributions employed traditional classification approaches, such as decision trees and Support Vector Machines (SVM), to identify the author and studied the effects of different writing style features on classification accuracy. However, little attention has been given to ensuring the quality of the evidence. In this paper, we introduce an innovative data mining method that captures the write-print of every suspect and models it as combinations of features that occur frequently in the suspect's e-mails. This notion is called a frequent pattern; it has proven effective in many data mining applications, but this is the first time it has been applied to the problem of authorship attribution. Unlike the traditional approach, the write-print extracted by our method is unique among the suspects and therefore provides convincing and credible evidence for presentation in a court of law. Experiments on real-life e-mails suggest that the proposed method can effectively identify the author and that the results are supported by strong evidence.
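The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes each e-mail has already been reduced to a set of discrete stylistic features, enumerates feature combinations by brute force rather than with an efficient frequent-itemset miner, and uses hypothetical feature names, function names (`frequent_patterns`, `write_prints`, `attribute`), and a hypothetical support threshold.

```python
from collections import Counter
from itertools import combinations


def frequent_patterns(emails, min_support):
    """Return feature combinations found in at least min_support of the e-mails."""
    counts = Counter()
    for features in emails:
        items = sorted(set(features))
        # Brute-force enumeration; only practical for small feature sets.
        for size in range(1, len(items) + 1):
            for combo in combinations(items, size):
                counts[frozenset(combo)] += 1
    threshold = min_support * len(emails)
    return {pattern for pattern, n in counts.items() if n >= threshold}


def write_prints(emails_by_suspect, min_support=0.6):
    """Keep, per suspect, only the frequent patterns no other suspect shares."""
    frequent = {
        suspect: frequent_patterns(emails, min_support)
        for suspect, emails in emails_by_suspect.items()
    }
    prints = {}
    for suspect, patterns in frequent.items():
        others = set().union(
            *(p for other, p in frequent.items() if other != suspect)
        )
        prints[suspect] = patterns - others
    return prints


def attribute(email_features, prints):
    """Pick the suspect whose write-print has the most patterns in the e-mail."""
    features = set(email_features)
    scores = {
        suspect: sum(1 for pattern in patterns if pattern <= features)
        for suspect, patterns in prints.items()
    }
    return max(scores, key=scores.get)


# Hypothetical toy data: each e-mail is a set of discretized style features.
emails_by_suspect = {
    "A": [
        {"short_sentences", "many_commas"},
        {"short_sentences", "many_commas", "few_digits"},
        {"short_sentences", "few_digits"},
    ],
    "B": [
        {"long_sentences", "few_digits"},
        {"long_sentences", "few_digits", "many_commas"},
        {"long_sentences"},
    ],
}

prints = write_prints(emails_by_suspect)
print(attribute({"short_sentences", "many_commas"}, prints))
```

In this toy data the single feature `few_digits` is frequent for both suspects, so it is dropped from both write-prints. That filtering step is what makes each remaining pattern specific to one suspect.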

Divisions: Concordia University > Gina Cody School of Engineering and Computer Science > Concordia Institute for Information Systems Engineering
Item Type: Article
Authors: Iqbal, F. and Hadjidj, R. and Fung, Benjamin C.M. and Debbabi, Mourad
Journal or Publication: Digital Investigation
Digital Object Identifier (DOI): 10.1016/j.diin.2008.05.001
Keywords: E-mail forensic analysis, authorship identification, data mining, write-print, frequent itemsets
ID Code: 36250
Deposited On: 22 Dec 2011 17:28
Last Modified: 18 Jan 2018 17:36
