Szporer, Adam (2012) E-mail Analysis for Investigators: Techniques and Implementation. Masters thesis, Concordia University.
- Accepted Version
E-mail is a common form of communication in regular use today. As such, examining e-mail is a normal part of investigating a person or a crime. At present, there are many tools to perform bulk analysis and basic searching, but our research advances the state of the art by applying text mining and unsupervised learning techniques to automate the e-mail analysis process. Our key goals are to group similar e-mails together and to identify the concepts (subjects of discussion) of those e-mail groups. We present two new methods to increase the grouping accuracy: e-mail domain analysis and word pair analysis. We also present a technique for concept analysis. These goals are achieved by integrating our research with the capabilities of Weka, an open-source machine learning suite, and WordNet, a lexical database of the English language. We apply this research to the publicly available Enron e-mail dataset. We verify the results by examining the comparative advantage of each new technique.
|Divisions:||Concordia University > Faculty of Engineering and Computer Science > Concordia Institute for Information Systems Engineering|
|Item Type:||Thesis (Masters)|
|Degree Name:||M.A.Sc.|
|Program:||Information Systems Security|
|Thesis Supervisor(s):||Debbabi, Mourad|
|Deposited By:||ADAM SZPORER|
|Deposited On:||18 Jun 2012 19:57|
|Last Modified:||18 Jun 2012 19:57|