Spotting Keywords in Offline Handwritten Documents Using Hausdorff Edit Distance

Ameri, Mohammad Reza (2018) Spotting Keywords in Offline Handwritten Documents Using Hausdorff Edit Distance. PhD thesis, Concordia University.

Text (application/pdf)
Ameri_PhD_F2018.pdf - Accepted Version
Available under License Spectrum Terms of Access.
20MB

Abstract

Keyword spotting has become a crucial topic in handwritten document recognition because it enables content-based retrieval of scanned documents using search terms. With a query keyword, one can search and index digitized handwriting, which in turn facilitates the understanding of manuscripts. Common automated techniques address the keyword spotting problem through statistical representations.
Structural representations such as graphs capture the complex structure of handwriting. However, they are rarely used, particularly for keyword spotting, because of their high computational cost. The graph edit distance, a powerful and versatile method for matching any type of labeled graph, has exponential time complexity when computing the similarity of graphs. Hence, its use is restricted to small graphs.
The recently developed Hausdorff edit distance algorithm approximates the graph edit distance in quadratic time by efficiently matching local substructures. This dissertation hypothesizes that the Hausdorff edit distance could be a promising alternative to other template-based keyword spotting approaches in terms of computational time and accuracy. Accordingly, the core contribution of this thesis is the investigation and development of a graph-based keyword spotting technique based on the Hausdorff edit distance algorithm. The high representational power of graphs, combined with the efficiency of the Hausdorff edit distance for graph matching, delivers both a remarkable speedup and accuracy. In a comprehensive experimental evaluation, we demonstrate the solid performance of the proposed graph-based method compared with the state of the art, with respect to both precision and speed.
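To illustrate why matching local substructures takes quadratic time, here is a minimal Python sketch of a Hausdorff-style edit distance restricted to nodes. It is a simplification for illustration only, not the implementation evaluated in the thesis: nodes are assumed to be 2D keypoints, the substitution cost is assumed to be the Euclidean distance, the deletion/insertion cost `tau` is a hypothetical parameter, and edge costs are ignored entirely.

```python
import math


def node_cost_sum(source, target, tau):
    """Sum, over all source nodes, of the cheapest option:
    delete the node (cost tau) or substitute it with some target node."""
    total = 0.0
    for u in source:
        best = tau  # assumed cost of deleting (or inserting) a node
        for v in target:
            # A substitution is counted once from each graph, so halve it.
            best = min(best, math.dist(u, v) / 2.0)
        total += best
    return total


def hausdorff_edit_distance_nodes(nodes1, nodes2, tau=1.0):
    """Node-only Hausdorff-style edit distance between two keypoint sets.

    Every node is matched independently to its cheapest counterpart in the
    other graph, so the running time is O(len(nodes1) * len(nodes2)).
    """
    return node_cost_sum(nodes1, nodes2, tau) + node_cost_sum(nodes2, nodes1, tau)


if __name__ == "__main__":
    query = [(0.0, 0.0), (1.0, 0.0), (2.0, 1.0)]
    candidate = [(0.0, 0.1), (1.1, 0.0), (2.0, 1.2)]
    print(hausdorff_edit_distance_nodes(query, candidate, tau=1.0))
```

Because each node picks its best match independently, several nodes may select the same counterpart. This is what removes the need to search over one-to-one assignments, and it is also why the result only approximates the graph edit distance.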
The second contribution of this thesis is a keyword spotting technique that combines the dynamic time warping and Hausdorff edit distance approaches. The structural representation of the graph-based approach and the statistical representation based on geometric features complement each other, yielding a more accurate system. The proposed system has been extensively evaluated with four types of handwriting graphs and geometric feature vectors on benchmark datasets. The experiments demonstrate a performance boost: the combined system outperforms the individual systems.
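For readers unfamiliar with dynamic time warping, the following Python sketch shows the textbook recurrence applied to two sequences of feature vectors. It is a generic illustration under stated assumptions, not the system described in the thesis: the local cost is assumed to be the Euclidean distance, the function name `dtw_distance` is hypothetical, and the abstract does not specify how the warping cost is combined with the Hausdorff edit distance, so no fusion step is shown.

```python
import math


def dtw_distance(seq_a, seq_b):
    """Classic dynamic time warping cost between two sequences of
    equal-dimension feature vectors (for example, one vector per column
    of a word image). Runs in O(len(seq_a) * len(seq_b)) time."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    # acc[i][j] = cheapest cost of aligning seq_a[:i] with seq_b[:j]
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(seq_a[i - 1], seq_b[j - 1])
            acc[i][j] = local + min(
                acc[i - 1][j],      # seq_b[j-1] also absorbs seq_a[i-1]
                acc[i][j - 1],      # seq_a[i-1] also absorbs seq_b[j-1]
                acc[i - 1][j - 1],  # advance in both sequences
            )
    return acc[n][m]


if __name__ == "__main__":
    query = [(0.0,), (1.0,), (2.0,)]
    stretched = [(0.0,), (1.0,), (1.0,), (2.0,)]
    print(dtw_distance(query, stretched))
```

The warping lets one sequence stretch or compress relative to the other, which is useful when the same word is written at different widths.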

Divisions: Concordia University > Gina Cody School of Engineering and Computer Science > Computer Science and Software Engineering
Item Type: Thesis (PhD)
Authors: Ameri, Mohammad Reza
Institution: Concordia University
Degree Name: Ph. D.
Program: Computer Science
Date: 6 June 2018
Thesis Supervisor(s): Bui, Tien D. and Fischer, Andreas
ID Code: 984355
Deposited By: MOHAMAD REZA AMERI
Deposited On: 31 Oct 2018 17:42
Last Modified: 31 Oct 2018 17:42
All items in Spectrum are protected by copyright, with all rights reserved. The use of items is governed by Spectrum's terms of access.
