Khoury, Nabil (2011) Human Identification of Problematic Handwritten Digits for Pattern Recognition. Masters thesis, Concordia University.
|PDF (Final Draft) - Accepted Version|
After decades of work in pattern recognition, humans are still considered the best recognizers of images and symbols, especially in unconstrained everyday applications. This has made the human visual model a major topic of interest in pattern recognition research. A number of studies have presented promising recognition models that incorporate different aspects of the human model, such as selective attention, biologically plausible saliency detection, and top-down recognition. On the other hand, the last hundred years of research in human eye movement behaviour have revived the ancient philosophical idea that we see in our mind’s eye. Several computational models of eye movement control have been proposed that successfully predict eye movement behaviour, demonstrating a close coupling between eye movements and the underlying oculomotor and cognitive processes. In the present study, the author evaluates a combined approach to identifying features of interest for pattern recognition applications. In the data collection stage, sixty participants are asked to verbally identify fifty-four problematic and twenty prototypical handwritten digits. Both verbal responses and visual fixations are recorded for further analysis. In the analysis stage, a smaller set of ambiguous digit images is identified based on how often participants change their minds about the numeral each image represents. For each digit, visual fixations are grouped by the numeral that participants called out. Each fixation group is then combined into a single fixation heat map. Results show that, by comparing and contrasting the heat maps for a given digit, the features deemed most disambiguating by the human model can be identified.
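The grouping and heat-map step described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the Gaussian kernel, the duration weighting, the normalisation, the function names (`fixation_heat_map`, `heat_maps_by_response`, `contrast`, `peak`), and the sample fixations are all assumptions made for illustration. Each fixation is taken to be an `(x, y, duration_ms)` triple in image pixel coordinates.

```python
import math
from collections import defaultdict


def fixation_heat_map(fixations, width, height, sigma=2.0):
    """Sum one duration-weighted Gaussian per fixation, then normalise to 1."""
    grid = [[0.0] * width for _ in range(height)]
    for x, y, duration in fixations:
        for row in range(height):
            for col in range(width):
                d2 = (col - x) ** 2 + (row - y) ** 2
                grid[row][col] += duration * math.exp(-d2 / (2 * sigma ** 2))
    total = sum(map(sum, grid))
    if total > 0:
        # Normalising makes groups of different sizes comparable.
        grid = [[value / total for value in row] for row in grid]
    return grid


def heat_maps_by_response(trials, width, height, sigma=2.0):
    """Pool fixations by the numeral called out; build one map per group."""
    groups = defaultdict(list)
    for response, fixations in trials:
        groups[response].extend(fixations)
    return {
        response: fixation_heat_map(pooled, width, height, sigma)
        for response, pooled in groups.items()
    }


def contrast(map_a, map_b):
    """Cell-wise difference: positive where map_a attracts more attention."""
    return [[a - b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(map_a, map_b)]


def peak(grid):
    """Return the (x, y) cell holding the largest value."""
    _, location = max(
        (value, (col, row))
        for row, values in enumerate(grid)
        for col, value in enumerate(values)
    )
    return location


# Hypothetical trials for one ambiguous digit image on a 20x20 pixel grid.
# These numbers are invented for the example, not data from the study.
trials = [
    ("4", [(10, 4, 300), (11, 5, 250)]),
    ("9", [(10, 14, 280)]),
    ("4", [(10, 5, 200)]),
]
maps = heat_maps_by_response(trials, width=20, height=20)
diff = contrast(maps["4"], maps["9"])
print("Region favoured by '4' responses peaks at", peak(diff))
```

In this sketch, the sign of each cell in `diff` indicates which response group looked there more, which is one simple way to compare and contrast two heat maps for the same digit image.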
|Divisions:||Concordia University > Faculty of Engineering and Computer Science > Computer Science and Software Engineering|
|Item Type:||Thesis (Masters)|
|Degree Name:||M. Comp. Sc.|
|Date:||14 September 2011|
|Thesis Supervisor(s):||Krzyzak, Adam and Suen, Ching Y.|
|Keywords:||eye movement, pattern recognition, vision science, handwriting recognition, handwritten digit recognition, identifying features|
|Deposited By:||NABIL KHOURY|
|Deposited On:||21 Nov 2011 11:47|
|Last Modified:||10 Jan 2012 08:17|
|Additional Information:||This thesis presents an interdisciplinary study to identify disambiguating features for handwritten digit recognition using verbal and visual responses from human participants.|