[1] Ackley, D. H., Hinton, G. E. and Sejnowski, T. J. [1985], ‘A learning algorithm for Boltzmann machines’, Cognitive Science 9, 147–169.
[2] Anderson, J. R. [2000], Cognitive Psychology and Its Implications, 5th edn, Worth Publishers, New York.
[3] Bengio, Y. [2009], ‘Learning deep architectures for AI’, Foundations and Trends in Machine Learning 2(1), 1–127.
[4] Bergstra, J., Breuleux, O., Bastien, F., Lamblin, P., Pascanu, R., Desjardins, G., Turian, J., Warde-Farley, D. and Bengio, Y. [2010], Theano: a CPU and GPU math expression compiler, in ‘Proceedings of the Python for Scientific Computing Conference (SciPy)’.
[5] Boureau, Y.-L., Ponce, J. and LeCun, Y. [2010], A theoretical analysis of feature pooling in visual recognition, in ‘Proceedings of the 27th International Conference on Machine Learning (ICML-10)’, pp. 111–118.
[6] Bryson, A. E. and Ho, Y. C. [1969], Applied Optimal Control, Blaisdell, New York.
[7] Chang, C.-C. and Lin, C.-J. [2011], ‘LIBSVM: A library for support vector machines’, ACM Transactions on Intelligent Systems and Technology 2, 27:1–27:27. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm.
[8] Chu, J. L. and Krzyżak, A. [2014a], Analysis of feature maps selection in supervised learning using convolutional neural networks, in M. Sokolova and P. van Beek, eds, ‘Canadian Conference on Artificial Intelligence 2014, Lecture Notes in Artificial Intelligence (LNAI)’, Vol. 8436, Springer International Publishing Switzerland, pp. 59–70.
[9] Chu, J. L. and Krzyżak, A. [2014b], Application of support vector machines, convolutional neural networks and deep belief networks to recognition of partially occluded objects, in L. Rutkowski, ed., ‘The 13th International Conference on Artificial Intelligence and Soft Computing ICAISC 2014, Lecture Notes in Artificial Intelligence (LNAI)’, Vol. 8467, Springer International Publishing Switzerland, pp. 34–46.
[10] Ciresan, D., Meier, U. and Schmidhuber, J. [2012], Multi-column deep neural networks for image classification, in ‘Proceedings of 2012 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)’, IEEE, pp. 3642–3649.
[11] Coates, A., Ng, A. Y. and Lee, H. [2011], An analysis of single-layer networks in unsupervised feature learning, in ‘International Conference on Artificial Intelligence and Statistics (AISTATS)’, pp. 215–223.
[12] Collobert, R. and Bengio, S. [2004], ‘Links between perceptrons, MLPs and SVMs’, Proceedings of the 21st International Conference on Machine Learning p. 23.
[13] Cortes, C. and Vapnik, V. N. [1995], ‘Support-vector networks’, Machine Learning 20, 273–297.
[14] Dreyfus, S. [1962], ‘The numerical solution of variational problems’, Journal of Mathematical Analysis and Applications 5(1), 30–45.
[15] Duda, R. O., Hart, P. E. and Stork, D. G. [2001], Pattern Classification, 2nd edn, John Wiley & Sons, Inc.
[16] Eigen, D., Rolfe, J., Fergus, R. and LeCun, Y. [2013], ‘Understanding deep architectures using a recursive convolutional network’, ArXiv e-prints.
[17] Fei-Fei, L., Fergus, R. and Perona, P. [2004], ‘Learning generative visual models from few training examples: an incremental Bayesian approach tested on 101 object categories’, Workshop on Generative-Model Based Vision, IEEE Computer Vision and Pattern Recognition 2004.
[18] Fukushima, K. [2003], ‘Neocognitron for handwritten digit recognition’, Neurocomputing 51, 161–180.
[19] Fukushima, K. and Miyake, S. [1982], ‘Neocognitron: A new algorithm for pattern recognition tolerant of deformations and shifts in position’, Pattern Recognition 15(6), 455–469.
[20] Hebb, D. [1949], The Organization of Behaviour, John Wiley, New York.
[21] Hinton, G. E. [2002], ‘Training products of experts by minimizing contrastive divergence’, Neural Computation 14(8), 1771–1800.
[22] Hinton, G. E. [2010], ‘A practical guide to training restricted Boltzmann machines’, Momentum 9(1), 599–619.
[23] Hinton, G. E., Osindero, S. and Teh, Y. W. [2006], ‘A fast learning algorithm for deep belief nets’, Neural Computation 18, 1527–1554.
[24] Hinton, G. E. and Salakhutdinov, R. R. [2006], ‘Reducing the dimensionality of data with neural networks’, Science 313, 504–507.
[25] Hopfield, J. J. [1982], ‘Neural networks and physical systems with emergent collective computational abilities’, Proceedings of the National Academy of Sciences of the USA 79(8), 2554–2558.
[26] Huang, F. J. and LeCun, Y. [2006], ‘Large-scale learning with SVM and convolutional nets for generic object categorization’, Proceedings of the 2006 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 1, 284–291.
[27] Hubel, D. H. and Wiesel, T. N. [1962], ‘Receptive fields, binocular interaction and functional architecture in the cat’s visual cortex’, Journal of Physiology (London) 160, 106–154.
[28] Krizhevsky, A., Sutskever, I. and Hinton, G. [2012], ImageNet classification with deep convolutional neural networks, in ‘Advances in Neural Information Processing Systems 25’, pp. 1106–1114.
[29] LeCun, Y., Bottou, L., Bengio, Y. and Haffner, P. [1998], ‘Gradient-based learning applied to document recognition’, Proceedings of the IEEE 86(11), 2278–2324.
[30] LeCun, Y., Huang, F. and Bottou, L. [2004], ‘Learning methods for generic object recognition with invariance to pose and lighting’, Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2, 97–104.
[31] Lee, H., Grosse, R., Ranganath, R. and Ng, A. Y. [2009], ‘Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations’, Proceedings of the 26th International Conference on Machine Learning pp. 609–616.
[32] McClelland, J. and Rumelhart, D. [1988], Explorations in Parallel Distributed Processing, MIT Press, Cambridge.
[33] McCulloch, W. S. and Pitts, W. [1943], ‘A logical calculus of ideas immanent in nervous activity’, Bulletin of Mathematical Biophysics 5, 115–133.
[34] Mehrotra, K., Mohan, C. K. and Ranka, S. [1997], Elements of Artificial Neural Networks, The MIT Press, Cambridge, MA.
[35] Minsky, M. L. and Papert, S. A. [1969], Perceptrons, MIT Press, Cambridge.
[36] Mohamed, A. R., Yu, D. and Deng, L. [2010], ‘Investigation of full-sequence training of deep belief networks for speech recognition’, Conference of the International Speech Communication Association (INTERSPEECH) pp. 2846–2849.
[37] Nair, V. and Hinton, G. E. [2009], ‘3D object recognition with deep belief nets’, Advances in Neural Information Processing Systems (NIPS) pp. 1339–1347.
[38] Ngiam, J., Chen, Z., Chia, D., Koh, P. W., Le, Q. V. and Ng, A. [2010], ‘Tiled convolutional neural networks’, Advances in Neural Information Processing Systems (NIPS) pp. 1279–1287.
[39] Nguyen, G. H., Phung, S. L. and Bouzerdoum, A. [2009], Reduced training of convolutional neural networks for pedestrian detection, in ‘International Conference on Information Technology and Applications’.
[40] Osindero, S. and Hinton, G. [2008], ‘Modeling image patches with a directed hierarchy of Markov random fields’, Advances in Neural Information Processing Systems (NIPS) 20.
[41] Pylyshyn, Z. W. [1998], ‘What is cognitive science?’. URL: http://ruccs.rutgers.edu/ftp/pub/papers/ruccsbook.PDF
[42] Pylyshyn, Z. W. [2003], ‘Return of the mental image: Are there really pictures in the brain?’, Trends in Cognitive Science 7(3), 113–118.
[43] Ranzato, M. A., Huang, F. J., Boureau, Y. L. and LeCun, Y. [2007], ‘Unsupervised learning of invariant feature hierarchies with applications to object recognition’, 2007 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) pp. 1–8.
[44] Ranzato, M., Susskind, J., Mnih, V. and Hinton, G. [2011], ‘On deep generative models with applications to recognition’, 2011 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) pp. 2857–2864.
[45] Rosenblatt, F. [1958], ‘The perceptron: A probabilistic model for information storage and organization in the brain’, Psychological Review 65, 386–408.
[46] Rumelhart, D. E., Hinton, G. E. and Williams, R. J. [1986], ‘Learning internal representations by error propagation’, Parallel Distributed Processing 1.
[47] Russell, S. and Norvig, P., eds [2003], Artificial Intelligence: A Modern Approach, 2nd edn, Pearson Education, Upper Saddle River, NJ.
[48] Salakhutdinov, R. and Hinton, G. E. [2009], ‘Deep Boltzmann machines’, International Conference on Artificial Intelligence and Statistics (AISTATS) pp. 448–455.
[49] Scherer, D., Schulz, H. and Behnke, S. [2010], Accelerating large-scale convolutional neural networks with parallel graphics multiprocessors, in ‘Artificial Neural Networks – ICANN 2010’, Springer, pp. 82–91.
[50] Schulz, H., Müller, A. and Behnke, S. [2010], ‘Exploiting local structure in stacked Boltzmann machines’, European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN).
[51] Simard, P., Steinkraus, D. and Platt, J. C. [2003], Best practices for convolutional neural networks applied to visual document analysis, in ‘International Conference on Document Analysis and Recognition (ICDAR)’, Vol. 3, pp. 958–962.
[52] Smolensky, P. [1986], Information processing in dynamical systems: Foundations of harmony theory, in D. E. Rumelhart and J. L. McClelland, eds, ‘Parallel Distributed Processing: Explorations in the Microstructure of Cognition’, Vol. 1, MIT Press, chapter 6, pp. 194–281.
[53] Thagard, P. [1996], Mind: Introduction to Cognitive Science, The MIT Press, Cambridge, MA.
[54] Uetz, R. and Behnke, S. [2009a], Large-scale object recognition with CUDA-accelerated hierarchical neural networks, in ‘IEEE International Conference on Intelligent Computing and Intelligent Systems (ICIS 2009)’, Vol. 1, IEEE, pp. 536–541.
[55] Uetz, R. and Behnke, S. [2009b], Locally-connected hierarchical neural networks for GPU-accelerated object recognition, in ‘NIPS 2009 Workshop on Large-Scale Machine Learning: Parallelism and Massive Datasets’.
[56] Werbos, P. [1974], Beyond regression: New tools for prediction and analysis in the behavioral sciences, PhD thesis, Harvard University.