Object Recognition with Gradient-Based Learning

  • Chapter

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1681)

Abstract

Finding an appropriate set of features is an essential problem in the design of shape recognition systems. This paper attempts to show that for recognizing simple objects with high shape variability, such as handwritten characters, it is possible, and even advantageous, to feed the system directly with minimally processed images and to rely on learning to extract the right set of features. Convolutional Neural Networks are shown to be particularly well suited to this task. We also show that these networks can be used to recognize multiple objects without requiring explicit segmentation of the objects from their surroundings. The second part of the paper presents the Graph Transformer Network model, which extends the applicability of gradient-based learning to systems that use graphs to represent features, objects, and their combinations.
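The claim that a convolutional network can respond to several objects without explicit segmentation can be illustrated with a small sketch. This is not the architecture from the chapter: it is a single hand-written 3x3 filter (a real network would learn its filters by gradient descent), and the names `conv2d_valid`, `kernel`, and `strip` are hypothetical. It only shows that sliding one shared detector over a wider input yields one response per position, so two copies of a pattern produce two peaks with no prior cropping.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` without padding and return the response map."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Same weights at every position: this is the weight sharing
            # that makes the detector position-independent.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Hypothetical plus-shaped feature detector (assumed, not learned here).
kernel = np.array([[0., 1., 0.],
                   [1., 1., 1.],
                   [0., 1., 0.]])

# A 3x12 strip holding the same pattern twice, starting at columns 1 and 7.
strip = np.zeros((3, 12))
strip[:, 1:4] = kernel
strip[:, 7:10] = kernel

response = conv2d_valid(strip, kernel)

# Column offsets where the detector responds most strongly.
peaks = np.flatnonzero(response[0] == response[0].max())
print(response.shape, peaks)
```

The response map has one entry per horizontal offset, and its two maxima fall at the offsets where the pattern was placed. Deciding which peaks correspond to actual objects still requires a later interpretation stage.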




Copyright information

© 1999 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

LeCun, Y., Haffner, P., Bottou, L., Bengio, Y. (1999). Object Recognition with Gradient-Based Learning. In: Shape, Contour and Grouping in Computer Vision. Lecture Notes in Computer Science, vol 1681. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-46805-6_19

  • DOI: https://doi.org/10.1007/3-540-46805-6_19

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-66722-3

  • Online ISBN: 978-3-540-46805-9

  • eBook Packages: Springer Book Archive
