Sign Language Recognition Using Convolutional Neural Networks

Conference paper
First Online: 01 January 2015

pp 572–578

Computer Vision - ECCV 2014 Workshops (ECCV 2014)

Lionel Pigou, Sander Dieleman, Pieter-Jan Kindermans & Benjamin Schrauwen

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 8925)

Included in the following conference series:

European Conference on Computer Vision

Abstract

There is an undeniable communication problem between the Deaf community and the hearing majority. Innovations in automatic sign language recognition try to tear down this communication barrier. Our contribution considers a recognition system using the Microsoft Kinect, convolutional neural networks (CNNs) and GPU acceleration. Instead of constructing complex handcrafted features, CNNs are able to automate the process of feature construction. We are able to recognize 20 Italian gestures with high accuracy. The predictive model is able to generalize to users and surroundings not occurring during training, with a cross-validation accuracy of 91.7%. Our model achieves a mean Jaccard Index of 0.789 in the ChaLearn 2014 Looking at People gesture spotting competition.
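
The mean Jaccard Index reported above measures how well predicted gesture intervals overlap the annotated ones. The following is a minimal sketch of that idea, not the competition's official scoring code: it assumes each gesture is represented as a set of frame indices, and the function names, the handling of empty sets, and the example frame ranges are all hypothetical choices made for illustration.

```python
def jaccard_index(predicted_frames, true_frames):
    """Frame-level overlap |A ∩ B| / |A ∪ B| between two gesture intervals."""
    predicted, truth = set(predicted_frames), set(true_frames)
    union = predicted | truth
    if not union:
        # Assumption: two empty intervals are scored as 0 rather than 1.
        return 0.0
    return len(predicted & truth) / len(union)


def mean_jaccard(pairs):
    """Average the per-gesture Jaccard Index over (predicted, truth) pairs."""
    scores = [jaccard_index(predicted, truth) for predicted, truth in pairs]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical example: the annotation spans frames 10..29,
# the prediction spans frames 15..34.
truth = range(10, 30)
predicted = range(15, 35)
print(jaccard_index(predicted, truth))  # 15 shared frames / 25 total = 0.6
```

A score of 1 would mean every predicted frame matches the annotation exactly, so a gesture that is recognized correctly but located a few frames early or late still loses part of its score.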

Author information

Authors and Affiliations

ELIS, Ghent University, Ghent, Belgium
Lionel Pigou, Sander Dieleman, Pieter-Jan Kindermans & Benjamin Schrauwen

Corresponding author

Correspondence to Lionel Pigou.

Editor information

Editors and Affiliations

University College London, London, United Kingdom
Lourdes Agapito
University of Lugano, Lugano, Switzerland
Michael M. Bronstein
Technische Universität Dresden, Dresden, Germany
Carsten Rother

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Pigou, L., Dieleman, S., Kindermans, PJ., Schrauwen, B. (2015). Sign Language Recognition Using Convolutional Neural Networks. In: Agapito, L., Bronstein, M., Rother, C. (eds) Computer Vision - ECCV 2014 Workshops. ECCV 2014. Lecture Notes in Computer Science, vol 8925. Springer, Cham. https://doi.org/10.1007/978-3-319-16178-5_40

DOI: https://doi.org/10.1007/978-3-319-16178-5_40
Published: 19 March 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-16177-8
Online ISBN: 978-3-319-16178-5
eBook Packages: Computer Science, Computer Science (R0)
