
Methodologies for the Management, Normalization and Identification of Sexual Predation of Minors in Cyber Chat Logs


Sekeres, John (2022) Methodologies for the Management, Normalization and Identification of Sexual Predation of Minors in Cyber Chat Logs. Masters thesis, Concordia University.

Text (application/pdf)
Sekeres_MCompSc_F2022.pdf - Accepted Version
Available under License Spectrum Terms of Access.
6MB

Abstract

Neural networks based on the Transformer architecture have shown strong results in tasks such as machine translation and text generation. Our contribution provides a methodology for an AI agent capable of Sexual Predator Identification (SPI), based on the classification capabilities of models built on the Transformer architecture. Results are comparable to existing state-of-the-art methods, with an F0.5 score of 92.5% for predator identification on the PAN2012 test dataset, which consists of 2,004,235 lines of text. Practical considerations require an AI agent that can evaluate large numbers of chats quickly. In that regard, the Transformer-based AI agent is able to evaluate over 2 million lines of text in under 6 minutes on a modestly configured workstation.
An AI agent by itself does not provide a complete solution to sexual predator identification. To give practical value to an AI agent, we address the vitally important but often overlooked issues of chat management and normalization. Our contribution provides a methodology for efficiently transforming raw chats from a native format into a consistent 'normalized' format suitable for analysis. We define a methodology for managing large numbers of chats, converting and normalizing 10,000 documents in a dataset in under 3 minutes on a modestly configured workstation. We present a software-based solution that, among other things, brings together chat management, normalization, and AI-based analysis into a cohesive, productive environment that law enforcement can use to identify and build a case against suspected predators.
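
For readers unfamiliar with the metrics quoted in the abstract, the following is a minimal sketch of the standard F-beta formula and of the throughput lower bounds implied by the stated figures. It is not taken from the thesis; the function names f_beta and min_throughput, and the precision and recall values in the usage lines, are hypothetical and chosen only for illustration.

```python
def f_beta(precision: float, recall: float, beta: float = 0.5) -> float:
    """Standard F-beta score; beta < 1 weights precision more heavily than recall."""
    if precision == 0.0 and recall == 0.0:
        return 0.0  # avoid division by zero when both inputs are zero
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def min_throughput(items: int, minutes: float) -> float:
    """Lower bound on items per second when `items` are processed in under `minutes`."""
    return items / (minutes * 60)


if __name__ == "__main__":
    # Hypothetical precision/recall pair, NOT results reported in the thesis.
    print(f"F0.5 = {f_beta(0.9, 0.6):.3f}")
    print(f"F1   = {f_beta(0.9, 0.6, beta=1.0):.3f}")

    # Figures from the abstract: 2,004,235 lines in under 6 minutes,
    # and 10,000 documents in under 3 minutes.
    print(f"lines/s > {min_throughput(2_004_235, 6):.0f}")
    print(f"docs/s  > {min_throughput(10_000, 3):.1f}")
```

With beta set to 0.5, a classifier with high precision and lower recall scores better than it would under F1, which fits a setting where false accusations are costly.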

Divisions: Concordia University > Gina Cody School of Engineering and Computer Science > Computer Science and Software Engineering
Item Type: Thesis (Masters)
Authors: Sekeres, John
Institution: Concordia University
Degree Name: M. Comp. Sc.
Program: Computer Science
Date: 28 August 2022
Thesis Supervisor(s): Suen, Ching Y. and Ormandjieva, Olga
ID Code: 991466
Deposited By: JOHN SEKERES
Deposited On: 21 Jun 2023 14:43
Last Modified: 21 Jun 2023 14:43
All items in Spectrum are protected by copyright, with all rights reserved. The use of items is governed by Spectrum's terms of access.
