Alqawasmeh, Najla (2022) Novel Feature Extraction Methods to Automatically Detect Gender and Age From Handwritten Documents. PhD thesis, Concordia University.
Text (application/pdf), 10MB: AL-Qawasmeh_PhD_F2022.pdf - Accepted Version. Available under License Spectrum Terms of Access.
Abstract
Handwriting acts as a mirror, reflecting the writer’s personality characteristics and demographic
properties. As a result, handwriting analysis has become significant in a variety of disciplines,
including medicine, socialisation, and security. Identifying a person’s biometric information from
handwriting has recently become a crucial research topic. Therefore, we decided to focus our
research on extracting gender and age from handwriting, as the former is important in psychology,
document analysis, paleography, graphology, and forensic investigation. The latter, on the other
hand, is essential in medical diagnosis and forensic analysis.
Handwriting analysis is commonly done by extracting relevant features such as slant, pen pressure,
word spacing, and others. The traditional method of studying handwritten texts involved a
graphologist visually inspecting a set of related features. However, analysing many handwriting
samples this way is tiring and time-consuming. In addition, the analyst’s level of expertise
influences the analysis. As a result, we have studied and developed an automatic technique for
analysing handwritten documents to extract the writer’s gender and age without additional human
intervention.
The proposed analysis systems include the five essential aspects of handwriting analysis systems.
First, an Arabic dataset was gathered to test and evaluate the proposed systems. We named
it Free Style Handwritten Samples (FSHS). Our dataset stands out from others because it has 2200
writers, more than any other Arabic dataset currently available. This gives the dataset a wide
variety of handwriting styles; the total number of handwritten documents in the dataset is 2700.
The dataset is mostly text-independent, except for 500 samples that are text-dependent. The second
phase is image acquisition and preprocessing. Next, image processing techniques and transfer
learning were used to extract the related features. In the classification phase, Support Vector
Machine (SVM) and Neural Network (NN) algorithms were employed. Finally, multiple assessment
metrics were used to evaluate the proposed systems.
Automatically identifying a writer’s gender and age is a challenging task because handwriting
features from different writers overlap. Automatic gender and age detection can be implemented
using a set of extracted features or by applying transfer learning. This research presents two
different methodologies to detect the gender and age of a writer from handwritten documents. First,
a set of related features was extracted for the gender detection system, such as pen pressure, word
spacing, text line irregularity, and left and right margins. Likewise, the percentage of black
and white pixels and irregularities in pen pressure, slant, and text lines were extracted to determine
the age of the writer. The extracted features were then analysed separately, followed by the generation
and testing of combinations of two and three features. Finally, the sum of all features was
evaluated. Second, an automatic feature extraction methodology leveraging the knowledge from
two Convolutional Neural Networks (CNNs), GoogleNet and ResNet, was applied.
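The hand-crafted feature stage described above can be illustrated with a minimal Python sketch. It is not the thesis implementation: the pixel encoding (0 for ink, 1 for paper), the function names, and the feature labels in AGE_FEATURES are assumptions made for illustration only. It shows one way to compute the percentage of black and white pixels in a binarized image and to enumerate the single features, the combinations of two and three, and the full set for evaluation.

```python
from itertools import combinations

# Hypothetical labels for the age-related features named in the abstract.
AGE_FEATURES = [
    "black_white_pixel_percentage",
    "pen_pressure_irregularity",
    "slant_irregularity",
    "text_line_irregularity",
]


def pixel_percentages(binary_image):
    """Return (black_pct, white_pct) for a binarized image.

    binary_image is a list of rows. Assumption for this sketch:
    0 marks a black (ink) pixel and 1 marks a white (paper) pixel.
    """
    total = sum(len(row) for row in binary_image)
    if total == 0:
        raise ValueError("image has no pixels")
    black = sum(1 for row in binary_image for px in row if px == 0)
    black_pct = 100.0 * black / total
    return black_pct, 100.0 - black_pct


def feature_subsets(names):
    """Yield single features, pairs, triples, and finally the full set."""
    for size in (1, 2, 3):
        # Skip sizes that would duplicate (or exceed) the full set.
        if size < len(names):
            yield from combinations(names, size)
    yield tuple(names)


if __name__ == "__main__":
    toy_image = [[0, 1, 1, 1], [0, 0, 1, 1]]
    print(pixel_percentages(toy_image))
    for subset in feature_subsets(AGE_FEATURES):
        print(subset)
```

Each yielded subset would then be passed to a classifier such as an SVM or a neural network, so that the contribution of each feature and each combination can be compared.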
The two proposed systems performed well when applied to the FSHS dataset. The gender detection
system achieved accuracy rates of 94.7% with SVM and 97.1% with NN. Applied to the same dataset,
the age detection system achieved accuracy rates of 71% with SVM and 63.5% with NN.
To compare our results to other available works, the proposed systems were applied to the public
ICDAR2013 and Khatt datasets. Finally, a number of assessments and tests were carried
out to assess the effectiveness of the proposed systems, and the results reveal that the proposed
gender and age detection systems outperform the current state-of-the-art with accuracy rates of
67% and 96.2%, respectively.
Divisions: | Concordia University > Gina Cody School of Engineering and Computer Science > Computer Science and Software Engineering |
---|---|
Item Type: | Thesis (PhD) |
Authors: | Alqawasmeh, Najla |
Institution: | Concordia University |
Degree Name: | Ph. D. |
Program: | Computer Science |
Date: | 13 July 2022 |
Thesis Supervisor(s): | Suen, Ching Y. |
ID Code: | 991167 |
Deposited By: | Najla Alqawasmeh |
Deposited On: | 27 Oct 2022 14:27 |
Last Modified: | 27 Oct 2022 14:27 |