Chu, Joseph Lin (2014) Using Support Vector Machines, Convolutional Neural Networks and Deep Belief Networks for Partially Occluded Object Recognition. Masters thesis, Concordia University.
Text (PDF, 2MB): Chu_MCompSc_S2014.pdf - Accepted Version. Available under License Spectrum Terms of Access.
Abstract
Artificial neural networks have been widely used for machine learning tasks such as object recognition. Recent developments have made use of biologically inspired architectures such as the Convolutional Neural Network and the Deep Belief Network. A theoretical method is proposed for estimating the optimal number of feature maps in a Convolutional Neural Network from the dimensions of the receptive field, or convolutional kernel. Empirical experiments show that the method works to an extent for extremely small receptive fields, but does not generalize clearly to all receptive field sizes. We then test the hypothesis that generative models such as the Deep Belief Network should outperform purely discriminative models such as the Convolutional Neural Network on occluded object recognition tasks. We find that the data does not support this hypothesis when the generative models are run in a partially discriminative manner. We also find that the use of Gaussian visible units in a Deep Belief Network trained on occluded image data allows it to learn to classify non-occluded images as well.
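To make the last finding concrete: a Deep Belief Network is built by greedily stacking Restricted Boltzmann Machines, and using Gaussian rather than binary visible units lets the first layer model real-valued pixel intensities directly. The sketch below is a minimal single-layer illustration of such a Gaussian-Bernoulli RBM trained with one-step contrastive divergence (CD-1). It is not the thesis's code; the class name, toy data, and crude occlusion mask are illustrative assumptions only, and inputs are assumed standardized to zero mean and unit variance so the visible variance can be fixed at 1.

import numpy as np

rng = np.random.default_rng(0)

class GaussianBernoulliRBM:
    """RBM with Gaussian visible units and binary hidden units,
    trained with one-step contrastive divergence (CD-1).
    Assumes visible data standardized to zero mean, unit variance."""

    def __init__(self, n_visible, n_hidden, lr=0.001):
        self.W = 0.01 * rng.standard_normal((n_visible, n_hidden))
        self.b_v = np.zeros(n_visible)   # visible biases
        self.b_h = np.zeros(n_hidden)    # hidden biases
        self.lr = lr

    def hidden_probs(self, v):
        # p(h = 1 | v) for unit-variance Gaussian visibles: a sigmoid
        return 1.0 / (1.0 + np.exp(-(v @ self.W + self.b_h)))

    def visible_mean(self, h):
        # E[v | h] is linear, not sigmoid: this is where the
        # Gaussian visible units differ from binary ones
        return h @ self.W.T + self.b_v

    def cd1_update(self, v0):
        # positive phase: hidden activations driven by the data
        p_h0 = self.hidden_probs(v0)
        h0 = (rng.random(p_h0.shape) < p_h0).astype(float)
        # negative phase: one Gibbs step, using the visible mean
        v1 = self.visible_mean(h0)
        p_h1 = self.hidden_probs(v1)
        # approximate gradient: <v h>_data - <v h>_model
        n = v0.shape[0]
        self.W += self.lr * (v0.T @ p_h0 - v1.T @ p_h1) / n
        self.b_v += self.lr * (v0 - v1).mean(axis=0)
        self.b_h += self.lr * (p_h0 - p_h1).mean(axis=0)

# toy usage: 8x8 "image" patches with a block of pixels zeroed out
# as a crude stand-in for occlusion (hypothetical data, not the
# thesis's experimental setup)
X = rng.standard_normal((256, 64))
X[:, 20:36] = 0.0
rbm = GaussianBernoulliRBM(n_visible=64, n_hidden=100)
for epoch in range(10):
    rbm.cd1_update(X)

In the thesis's setting, several such layers would be stacked into a DBN and then fine-tuned discriminatively; the point of the sketch is only to show where the Gaussian visible units enter, namely in the linear (rather than sigmoid) reconstruction of the visible layer.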
Divisions: Concordia University > Gina Cody School of Engineering and Computer Science > Computer Science and Software Engineering
Item Type: Thesis (Masters)
Authors: Chu, Joseph Lin
Institution: Concordia University
Degree Name: M. Comp. Sc.
Program: Computer Science
Date: 18 March 2014
Thesis Supervisor(s): Krzyzak, Adam
Keywords: neural networks, convolutional neural networks, deep belief networks, support vector machines, feature maps, occlusions, object recognition
ID Code: 978484
Deposited By: Joseph Chu
Deposited On: 03 Jul 2014 18:02
Last Modified: 18 Jan 2018 17:46