Novel Feature Extraction Methods to Automatically Detect Gender and Age From Handwritten Documents

Alqawasmeh, Najla (2022) Novel Feature Extraction Methods to Automatically Detect Gender and Age From Handwritten Documents. PhD thesis, Concordia University.

Text (application/pdf)
AL-Qawasmeh_PhD_F2022.pdf - Accepted Version
Available under License Spectrum Terms of Access.
10MB

Abstract

Handwriting acts as a mirror, reflecting the writer’s personality characteristics and demographic properties. As a result, handwriting analysis has become significant in a variety of disciplines, including medicine, socialization, and security. Identifying a person’s biometric information from handwriting has recently become a crucial research topic. Therefore, we focused our research on extracting gender and age from handwriting: the former is important in psychology, document analysis, paleography, graphology, and forensic investigation, while the latter is essential in medical diagnosis and forensic analysis.
Handwriting analysis is commonly done by extracting relevant features such as slant, pen pressure, word spacing, and others. The traditional method of studying handwritten texts involved a graphologist visually inspecting a set of related features. However, analyzing many handwriting samples this way is tiring and time-consuming, and the analyst’s level of expertise also influences the analysis. As a result, we have studied and developed an automatic technique for analyzing handwritten documents to extract the writer’s gender and age without additional human intervention.
The proposed analysis systems include the five essential aspects of handwriting analysis systems. First, an Arabic dataset was gathered to test and evaluate the proposed systems; we named it Free Style Handwritten Samples (FSHS). Our dataset stands out from others because it has 2200 writers, more than any other Arabic dataset currently available. This gives the dataset a wide variety of handwriting styles across its 2700 handwritten documents. The dataset is mostly text-independent, except for 500 text-dependent samples. The second phase is image acquisition and preprocessing. Third, image processing techniques and the transfer learning method were used to extract the related features. Fourth, in the classification phase, Support Vector Machine (SVM) and Neural Network (NN) algorithms were employed. Finally, multiple assessment metrics were used to evaluate the proposed systems.
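As a rough illustration of the final evaluation phase, the sketch below computes three common metrics for a two-class problem such as gender detection. The abstract does not name the assessment metrics that were used, so the choice of accuracy, precision, and recall is an assumption, as are the function names and the `"F"`/`"M"` labels.

```python
def confusion_counts(y_true, y_pred, positive):
    """Count true/false positives and negatives for one class of interest."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def evaluate(y_true, y_pred, positive):
    """Return accuracy, precision, and recall (assumed metrics, see lead-in)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    accuracy = (tp + tn) / len(y_true)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall}


# Hypothetical labels for five documents, not data from the thesis.
y_true = ["F", "F", "M", "M", "F"]
y_pred = ["F", "M", "M", "M", "F"]
scores = evaluate(y_true, y_pred, positive="F")
```

Reporting more than one metric matters when classes are unbalanced, since accuracy alone can hide poor performance on the smaller class.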
Automatically identifying a writer’s gender and age is a challenging task because features of handwriting from different writers overlap. Automatic gender and age detection can be implemented using a set of extracted features or by applying transfer learning. This research presents two different methodologies to detect the gender and age of a writer from handwritten documents. First, a set of related features was extracted for the gender detection system, such as pen pressure, word spacing, text line irregularity, and left and right margins. Similarly, the percentage of black and white pixels and irregularities in pen pressure, slant, and text lines were extracted to determine the age of the writer. Then, the extracted features were analysed separately, followed by the generation and testing of combinations of two and three features. Finally, the sum of all features was evaluated. Subsequently, an automatic feature extraction methodology leveraging the knowledge from two Convolutional Neural Networks (CNN), GoogleNet and ResNet, was applied.
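The feature-based procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis implementation: the image is assumed to be already binarized with 0 for black (ink) and 1 for white (background), and the function names and feature labels are hypothetical.

```python
from itertools import combinations


def black_white_percentages(binary_image):
    """Return (black %, white %) for a binarized image given as rows of 0/1.

    Assumption: 0 encodes a black (ink) pixel and 1 a white (background) pixel.
    """
    total = sum(len(row) for row in binary_image)
    if total == 0:
        raise ValueError("image has no pixels")
    white = sum(sum(row) for row in binary_image)
    black = total - white
    return 100.0 * black / total, 100.0 * white / total


def feature_subsets(names):
    """Yield single features, then pairs and triples, then the full set."""
    names = list(names)
    for size in (1, 2, 3):
        # Skip sizes that would duplicate the full set yielded at the end.
        if size < len(names):
            for combo in combinations(names, size):
                yield combo
    yield tuple(names)


# Hypothetical labels for the age-related features listed in the abstract.
AGE_FEATURES = [
    "pixel_percentage",
    "pen_pressure_irregularity",
    "slant_irregularity",
    "text_line_irregularity",
]

tiny_image = [[0, 1, 1, 1], [0, 0, 1, 1]]  # toy 2x4 image, not real data
black_pct, white_pct = black_white_percentages(tiny_image)
subsets = list(feature_subsets(AGE_FEATURES))
```

With four features, this enumeration produces 4 single features, 6 pairs, 4 triples, and 1 full set, so a classifier would be trained and tested 15 times under this scheme.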
The two proposed systems performed well when applied to the FSHS dataset. The gender detection system achieved accuracy rates of 94.7% with SVM and 97.1% with NN. On the same dataset, the age detection system achieved accuracy rates of 71% with SVM and 63.5% with NN.
To compare our results to other available works, the proposed systems were applied to the public datasets ICDAR2013 and Khatt. Finally, a number of assessments and tests were carried out to evaluate the effectiveness of the proposed systems, and the results reveal that the proposed gender and age detection systems outperform the current state of the art, with accuracy rates of 67% and 96.2%, respectively.

Divisions: Concordia University > Gina Cody School of Engineering and Computer Science > Computer Science and Software Engineering
Item Type: Thesis (PhD)
Authors: Alqawasmeh, Najla
Institution: Concordia University
Degree Name: Ph. D.
Program: Computer Science
Date: 13 July 2022
Thesis Supervisor(s): Suen, Ching Y.
ID Code: 991167
Deposited By: Najla Alqawasmeh
Deposited On: 27 Oct 2022 14:27
Last Modified: 05 May 2026 15:18
All items in Spectrum are protected by copyright, with all rights reserved. The use of items is governed by Spectrum's terms of access.
