[1] Network E-mail Examiner. Web site: http://www.paraben-enterprise.com/, Retrieved on August 15, 2010. Paraben Corporation. [2] Forensic ToolKit. Web site: http://www.accessdata.com/forensictoolkit.html, Retrieved on March 2, 2009. AccessData. [3] Encase. Web site: http://www.guidancesoftware.com/, Retrieved on May 10, 2010. Guidance Software. [4] A. Abbasi and H. Chen. Writeprints: A stylometric approach to identity-level identification and similarity detection in cyberspace. ACM Transactions on Information Systems, 26(2):1–29, 2008. [5] A. Abbasi, H. Chen, and J. Nunamaker. Stylometric identification in electronic markets: Scalability and robustness. Journal of Management Information Systems, 5(1):49–78, 2008. [6] R. Agrawal, J. Gehrke, D. Gunopulos, and P. Raghavan. Automatic subspace clustering of high dimensional data for data mining applications. In Proc. of ACM SIGMOD Conference, Seattle, WA, 1998. [7] R. Agrawal, T. Imieli´nski, and A. Swami. Mining association rules between sets of items in large databases. In Proc. of the 1993 ACM SIGMOD international conference on Management of data, pages 207–216, Washington, D.C., United States, 1993. ACM. [8] E. Alfonseca and S. Manandhar. An unsupervised method for general named entity recognition and automated concept discovery. In Proc. of International Conference on General WordNet, 2002. [9] J. Allan, J. Carbonell, G. Doddington, J. Yamron, and Y. Yang. Topic detection and tracking pilot study final report. In Proc. of the DARPA Broadcast News Transcription and Understanding Workshop, pages 194–218, 1998. [10] M.-H. Antoni-Lay, G. Francopoulo, and L. Zaysser. A generic model for reusable lexicons: The genelex project. Literary and Linguistic Computing, 9(1), 1994. [11] S. Argamon, M. Koppel, and G. Avneri. Routing documents according to style. In Proc. of the First International Workshop on Innovative Information Systems, 1998. [12] S. Argamon and M. Saric. Style mining of electronic messages for multiple authorship discrimination: first results. In Proc. of the 9th ACM International Conference on Knowledge Discovery and Data Mining (SIGKDD), pages 475–480, Washington, D.C., 2003. ACM. [13] R. H. Baayen, H. van Halteren, and F. J. Tweedie. Outside the cave of shadows: using syntactic annotation to enhance authorship attribution. Literary and Linguistic Computing, 2:110–120, 1996. [14] R. Barzilay, N. Elhadad, and K. R. Mckeown. Inferring strategies for sentence ordering in multidocument news summarization. Journal of Artificial Intelligence Research, 17:35–55, 2002. [15] R. Barzilay and K. R. Mckeown. Sentence fusion for multidocument news summarization. Computational Linguistics, 31:297–328, 2005. [16] J. Bengel, S. Gauch, E. Mittur, and R. Vijayaraghavan. ChatTrack: Chat Room Topic Detection Using Classification. In Proc. of the 2nd Symposium on Intelligence and Security Informatics (in review, pages 266–277, 2004. [17] M. Bhattacharyya, S. Hershkop, E. Eskin, and S. J. Stolfo. MET: An experimental system for malicious email tracking. In Proc. of the 2002 New Security Paradigms Workshop (NSPW-2002), Virginia Beach, VA, 2002. [18] M. D. Buhmann. Radial Basis Functions: Theory and Implementations. Cambridge University Press, Second edition, 2003. [19] C. J. C. Burges. A Tutorial on Support Vector Machines for Pattern Recognition. Data Mining and Knowledge Discovery, 2:121–167, 1998. [20] J. F. Burrows. Word patterns and story shapes: the statistical analysis of narrative style. Literary and Linguistic Computing, 2:61–67, 1987. [28] C. E. H. Chua and J.Wareham. Fighting internet auction fraud: An assessment and proposal. Computer, 37:31–37, 2004. [29] M. Corney, O. de Vel, A. Anderson, and G. Mohay. Gender-preferential text mining of e-mail discourse. In ACSAC’02: Proc. of the 18th Annual Computer Security Applications Conference, pages 21–27, Washington, DC, USA, 2002. IEEE Computer Society. [30] N. Cristianini and J. Shawe-Taylor. An introduction to Support Vector Machines. Cambridge University Press, UK, 2000. [31] D. Cutting, D. Karger, J. Pedersen, and J. Tukey. Scatter/gather: A cluster-based approach to browsing large document collections. In Proc. of the 15th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 318–329, 1992. [32] D. Das and A. F. T. Martins. A survey on automatic text summarization. Web site: http://www.cs.cmu.edu/ nasmith/LS2/das-martins.07.pdf, 2007. Language Technologies Institute, Carnegie Mellon University. [33] O. de Vel. Mining e-mail authorship. In Proc. of ACM International Conference on Knowledge Discovery and Data Mining (KDD), Boston, 2000. [34] O. de Vel, A. Anderson, M. Corney, and G. Mohay. Mining e-mail content for author identification forensics. SIGMOD Record, 30(4):55–64, 2001. [35] O. de Vel, A. Anderson, M. Corney, and G. Mohay. Multi-topic e-mail authorship attribution forensics. In Proc. of ACM Conference on Computer Security - Workshop on Data Mining for Security Applications, 2001. [36] O. de Vel, M. Corney, A. Anderson, and G. Mohay. Language and gender author cohort analysis of e-mail for computer forensics. In Proc. of Digital Forensic Research Workshop, 2002. [37] A. P. Dempster, N. M. Laird, and D. B. Rubin. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, 39(1):1–38, 1977. [38] J. Diesner and K. M. Carley. Exploration of communication networks from the enron email corpus. In Proc. of Workshop on Link Analysis, Counterterrorism and Security, SIAM International Conference on Data Mining, pages 21–23. SIAM, 2005. [39] H. Dong, S. C. Hui, and Y. He. Structural analysis of chat messages for topic detection. Online Information Review, 30(5):496–516, 2006. [40] E. Elnahrawy. Log-based chat room monitoring using text categorization: A comparative study. In Proc. of the International Association of Science and Technology for Development Conference on Information and Knowledge Sharing (IKS 2002), pages 381–388. St. Thomas, US Virgin Islands, USA, 2002. [41] J. R. Finkel, T. Grenager, and C. Manning. Incorporating Non-local Information into Information Extraction Systems by Gibbs Sampling. In Proc. of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL 2005), pages 363–370, 2005. [42] J. Foertsch. The impact of electronic networks on scholarly communication: Avenues for research. Discourse Processes, 19(2):301–328, 1995. [43] R. S. Forsyth and D. I. Holmes. Feature finding for text classification. Literary and Linguistic Computing, 11(4):163–174, 1996. [44] E. Frank and S. Kramer. Ensembles of nested dichotomies for multi-class problems. In Proc. of the 21st International conference of Machine Learning (ICML-2004, pages 305–312. ACM Press, 2004. [45] N. Friedman, D. Geiger, and M. Goldszmidt. Bayesian Network Classifiers. Machine Learning, 29:131–163, 1977. [46] B. C. M. Fung, K. Wang, and M. Ester. Hierarchical document clustering using frequent itemsets. In Proc. of the 3rd SIAM International Conference on Data Mining (SDM), pages 59–70, San Francisco, CA, May 2003. [47] M. Gamon. Linguistic correlates of style: authorship classification with deep linguistic analysis features. In Proc. of the 20th International Conference on Computational Linguistics, pages 611–617, Geneva, Switzerland, 2004. [48] A. M. George. WordNet: A Lexical Database for English. Communications of the ACM, 38(11):39–41, 1995. [49] A. Gray, P. Sallis, and S. Macdonell. Software forensics: Extending authorship analysis techniques to computer programs. In Proc. of the 3rd Biannual Conf. Int. Assoc. of Forensic Linguists (IAFL’97, pages 1–8, 1997. [50] R. Hadjidj, M. Debbabi, H. Lounis, F. Iqbal, A. Szporer, and D. Benredjem. Towards an integrated email forensics analysis framework. Digital Investigation, 5(3- 4):124–137, 2009. [51] J. Han and J. Pei. Mining frequent patterns by pattern-growth: methodology and implications. SIGKDD Explor. Newsl., 2(2):14–20, 2000. [52] C. Hansen. To Catch a Predator: Protecting Your Kids from Online Enemies Already in Your Home. Tantor Media, 2007. [53] A. Hartigan and M.A. Wong. A k-means clustering algorithm. Applied Statistics, 28(1):100–108, 1979. [54] J. Heer, S. K. Card, and J. A. Landay. prefuse: a toolkit for interactive information visualization. In Proc. of the SIGCHI conference on Human factors in computing systems, pages 421–430, Portland, Oregon, USA, 2005. ACM. [55] M. Hegland. The apriori algorithm - a tutorial. WSPC/Lecture Notes Series, 9(7), March 2005. http://www2.ims.nus.edu.sg/preprints/2005-29.pdf. [56] D. I. Holmes. The evolution of stylometry in humanities. Literary and Linguistic Computing, 13(3):111–117, 1998. [57] J. D. Holt and S. M. Chung. Efficient mining of association rules in text databases. In Proc. of the 8th ACM International Conference on Information and Knowledge Management (CIKM), pages 234–242, Kansas City, Missouri, United States, 1999. ACM. [58] F. Iqbal, H. Binsalleeh, B. C. M. Fung, and M. Debbabi. Mining writeprints from anonymous e-mails for forensic investigation. Digital Investigation, pages 1–9, 2010. [59] F. Iqbal, H. Binsalleeh, B. C. M. Fung, and M. Debbabi. Mining writeprints from anonymous e-mails for forensic investigation. Digital Investigation, in press. [60] F. Iqbal, R. Hadjidj, B. C. M. Fung, and M. Debbabi. A novel approach of mining write-prints for authorship attribution in e-mail forensics. Digital Investigation, 5(1):42–51, 2008. [61] T. Joachims. Text categorization with support vector machines: Learning with many relevant features. In Proc. of European Conf. Machine Learning (ECML’98), pages 137–142. Springer Verlag, 1998. [62] T. Kolenda, L. K. Hansen, and J. Larsen. Signal detection using ICA: Application to chat room topic spotting. In Proc. of the Third International Conference on Independent Component Analysis and Blind Source Separation, pages 540–545, 2001.