Szporer, Adam (2012) E-mail Analysis for Investigators: Techniques and Implementation. Masters thesis, Concordia University.
Text (application/pdf), 2MB: Szporer_MASc_S2012.pdf - Accepted Version
Abstract
E-mail is a common form of communication in regular use today. As such, it is a normal part of investigating a person or a crime. At present, there are many tools to perform bulk analysis and basic searching, but our research advances the state of the art by applying text mining and unsupervised learning techniques to automate the e-mail analysis process. Our key goals are to group similar e-mails together and to identify the concepts (subjects of discussion) of those e-mail groups. We present several new methods to increase the grouping accuracy: e-mail domain analysis and word pair analysis. We also present a technique for concept analysis. These goals are achieved by integrating our research with the capabilities of Weka, an open-source machine learning suite, and WordNet, a lexical database of the English language. We apply this research to the publicly available Enron e-mail dataset. We verify the results by examining the comparative advantage of each new technique.
Divisions: | Concordia University > Gina Cody School of Engineering and Computer Science > Concordia Institute for Information Systems Engineering |
---|---|
Item Type: | Thesis (Masters) |
Authors: | Szporer, Adam |
Institution: | Concordia University |
Degree Name: | M.A.Sc. |
Program: | Information Systems Security |
Date: | February 2012 |
Thesis Supervisor(s): | Debbabi, Mourad |
ID Code: | 973595 |
Deposited By: | ADAM SZPORER |
Deposited On: | 18 Jun 2012 19:57 |
Last Modified: | 18 Jan 2018 17:36 |