De Souza, Andrea Barretto (2003) Automatic filter selection using image quality assessment. Masters thesis, Concordia University.
We present a method for automatically selecting the best filter to treat poorly printed documents using image quality assessment. To estimate the quality of an image, we introduce five quality measures: stroke thickness factor, broken character factor, touching character factor, small speckle factor, and white speckle factor. Based on the information provided by these measures, a set of rules uses a two-stage decision process to choose the best of four morphological filters to apply to the image. Other preprocessing tasks implemented are skew correction, connected components analysis, and detection of reference lines. Our database contains 736 document images, divided into three sets: training, validation, and testing. Most images have one or more of the following degradations: broken characters, touching characters, and salt-and-pepper noise. A training set of 370 images was used to develop the system. Experimental results on the test set of 183 images show a significant improvement in the recognition rate, from 73.24% with no filter at all to 93.09% after applying an automatically selected filter. The recognition rate refers to the percentage of characters in the image that were correctly recognized by a commercial OCR system. Three commercial OCR systems were used to demonstrate the improvement in recognition rates on the training set.
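The pipeline summarized in the abstract (measure image quality, then pick a filter by rule) can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration: the abstract does not define the quality measures, the thresholds, or which four morphological filters were used. Here `small_speckle_factor` is a hypothetical definition (the share of connected components no larger than a few pixels), the filter names `opening`, `closing`, `dilation`, and `erosion` are placeholders, and black foreground pixels are encoded as 1.

```python
from collections import deque


def connected_components(image):
    """Return the pixel count of each 8-connected foreground (1) component."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    sizes = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill from an unvisited foreground pixel.
            seen[r][c] = True
            queue = deque([(r, c)])
            size = 0
            while queue:
                y, x = queue.popleft()
                size += 1
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and image[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            sizes.append(size)
    return sizes


def small_speckle_factor(image, max_speckle_area=2):
    """Hypothetical measure: fraction of components with area <= max_speckle_area."""
    sizes = connected_components(image)
    if not sizes:
        return 0.0
    return sum(1 for s in sizes if s <= max_speckle_area) / len(sizes)


def select_filter(measures, noise_threshold=0.5, stroke_threshold=0.3):
    """Toy two-stage rule set; thresholds and filter names are illustrative only."""
    # Stage 1: deal with dominant speckle noise first.
    if measures["small_speckle"] > noise_threshold:
        return "opening"   # removes small black specks
    if measures["white_speckle"] > noise_threshold:
        return "closing"   # fills small white holes in strokes
    # Stage 2: otherwise look at stroke-level degradation.
    if (measures["broken"] > stroke_threshold
            and measures["broken"] >= measures["touching"]):
        return "dilation"  # thickens strokes to rejoin broken characters
    if measures["touching"] > stroke_threshold:
        return "erosion"   # thins strokes to separate touching characters
    return None            # assumption: leave a clean image unfiltered


if __name__ == "__main__":
    # One 5-pixel blob plus two isolated single-pixel specks.
    page = [
        [1, 1, 1, 0, 0, 0],
        [1, 1, 0, 0, 0, 1],
        [0, 0, 0, 0, 0, 0],
        [0, 0, 0, 1, 0, 0],
    ]
    measures = {
        "small_speckle": small_speckle_factor(page),
        "white_speckle": 0.0,
        "broken": 0.0,
        "touching": 0.0,
    }
    print(select_filter(measures))
```

In this sketch the first stage handles speckle noise and the second handles stroke-level defects, which is only one plausible reading of "two-stage decision process"; the thesis itself should be consulted for the actual rules.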
|Divisions:||Concordia University > Faculty of Engineering and Computer Science > Computer Science and Software Engineering|
|Item Type:||Thesis (Masters)|
|Authors:||De Souza, Andrea Barretto|
|Pagination:||x, 85 leaves : ill., tables ; 29 cm.|
|Degree Name:||Theses (M.Comp.Sc.)|
|Program:||Computer Science and Software Engineering|
|Thesis Supervisor(s):||Suen, Ching Y|
|Deposited By:||Concordia University Libraries|
|Deposited On:||27 Aug 2009 17:27|
|Last Modified:||08 Dec 2010 15:26|