A system for audio-visual speech recognition

Shdaifat, I.; Grigat, R.-R.

doi:10.21437/Interspeech.2005-367

In this work, a system for audio-visual speech recognition is presented. A new hybrid visual feature combination suitable for audio-visual speech recognition was implemented. The features comprise both the shape and the appearance of the lips; dimensionality reduction is applied using the discrete cosine transform (DCT). A large visual speech database of the German language has been assembled, the German Audio-Visual Database (GAVD). Experiments using only visual features achieved high recognition accuracy, and the visual features drastically improved audio-visual speech recognition.
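As a rough illustration of DCT-based dimensionality reduction on an appearance feature, the sketch below applies a 2-D DCT to a small grayscale lip-region patch and keeps only the low-frequency coefficients. The abstract does not state the patch size, the number of retained coefficients, or how they are selected, so the top-left `k` x `k` selection and the function names `dct_1d`, `dct_2d`, and `low_frequency_features` are assumptions made for this example, not the authors' implementation.

```python
import math


def dct_1d(signal):
    """Orthonormal DCT-II of a 1-D sequence of numbers."""
    n = len(signal)
    out = []
    for k in range(n):
        total = sum(
            x * math.cos(math.pi * (i + 0.5) * k / n)
            for i, x in enumerate(signal)
        )
        # Scaling that makes the transform orthonormal.
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * total)
    return out


def dct_2d(image):
    """Separable 2-D DCT-II: transform every row, then every column."""
    rows = [dct_1d(row) for row in image]
    cols = [dct_1d(list(col)) for col in zip(*rows)]
    # cols[j] holds transformed column j; transpose back to row-major order.
    return [list(r) for r in zip(*cols)]


def low_frequency_features(image, k):
    """Keep the top-left k x k DCT coefficients as a feature vector.

    Assumption for illustration: low-frequency coefficients are selected
    as a square block; the paper may use a different selection scheme.
    """
    coeffs = dct_2d(image)
    return [coeffs[u][v] for u in range(k) for v in range(k)]


# Hypothetical 4x4 patch with constant intensity.
patch = [[2.0] * 4 for _ in range(4)]
features = low_frequency_features(patch, 2)
```

For a constant patch, all of the energy falls into the first (DC) coefficient, so the 16 pixel values reduce to a 4-element vector whose remaining entries are numerically zero. Real lip images spread energy over more coefficients, and the choice of `k` trades compactness against detail.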

Cite as: Shdaifat, I., Grigat, R.-R. (2005) A system for audio-visual speech recognition. Proc. Interspeech 2005, 1197-1200, doi: 10.21437/Interspeech.2005-367

@inproceedings{shdaifat05_interspeech,
  author={I. Shdaifat and R.-R. Grigat},
  title={{A system for audio-visual speech recognition}},
  year=2005,
  booktitle={Proc. Interspeech 2005},
  pages={1197--1200},
  doi={10.21437/Interspeech.2005-367}
}