Object Recognition with Gradient-Based Learning

LeCun, Yann; Haffner, Patrick; Bottou, Léon; Bengio, Yoshua

doi:10.1007/3-540-46805-6_19

Chapter
First Online: 22 October 1999

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1681)

Abstract

Finding an appropriate set of features is an essential problem in the design of shape recognition systems. This paper attempts to show that for recognizing simple objects with high shape variability, such as handwritten characters, it is possible, and even advantageous, to feed the system directly with minimally processed images and to rely on learning to extract the right set of features. Convolutional Neural Networks are shown to be particularly well suited to this task. We also show that these networks can be used to recognize multiple objects without requiring explicit segmentation of the objects from their surroundings. The second part of the paper presents the Graph Transformer Network model, which extends the applicability of gradient-based learning to systems that use graphs to represent features, objects, and their combinations.
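As a rough illustration of why a convolutional layer can respond to several objects without explicit segmentation, the sketch below slides one shared kernel over a small patch and over a wider image containing the same shape twice. It is not the chapter's implementation: the function name `conv2d_valid`, the hand-picked kernel, and the toy images are assumptions made for this example, and in a trained network the kernel weights would be learned by gradient descent rather than fixed by hand.

```python
def conv2d_valid(image, kernel):
    """Slide one shared kernel over the image, keeping only positions
    where the kernel fits entirely inside the image."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


# Hypothetical 2x2 kernel that reacts to vertical edges (hand-picked,
# standing in for weights a network would learn).
kernel = [[1, -1],
          [1, -1]]

# A 4x4 toy patch containing a single vertical stroke.
patch = [
    [0, 1, 0, 0],
    [0, 1, 0, 0],
    [0, 1, 0, 0],
    [0, 1, 0, 0],
]

# A 4x8 image made of two copies of the patch side by side,
# i.e. two "objects" that have not been segmented apart.
wide = [row + row for row in patch]

small_map = conv2d_valid(patch, kernel)  # 3x3 feature map
wide_map = conv2d_valid(wide, kernel)    # 3x7 feature map

for row in wide_map:
    print(row)
```

Because the same weights are applied at every position, the response to each stroke in the wide image matches the response to the isolated patch, only shifted. The network can therefore be run on a larger input without cropping each object first.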

Author information

Authors and Affiliations

AT&T Shannon Lab, 100 Schulz Drive, Red Bank, NJ, 07701, USA
Yann LeCun, Patrick Haffner, Léon Bottou & Yoshua Bengio

About this chapter

Cite this chapter

LeCun, Y., Haffner, P., Bottou, L., Bengio, Y. (1999). Object Recognition with Gradient-Based Learning. In: Shape, Contour and Grouping in Computer Vision. Lecture Notes in Computer Science, vol 1681. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-46805-6_19

DOI: https://doi.org/10.1007/3-540-46805-6_19
Published: 22 October 1999
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-66722-3
Online ISBN: 978-3-540-46805-9
eBook Packages: Springer Book Archive
