The Role of Evaluation in the Development of Spoken Language Systems

Minker, Wolfgang

doi:10.1023/A:1009660908880

Published: November 1999

International Journal of Speech Technology, Volume 3, pages 5–14 (1999)

Abstract

In this article, several criteria and paradigms are described to measure the performance of spoken language systems developed in the framework of national and international research projects. These evaluations are carried out in the domain of spontaneous human-human interaction as supported by machine translation systems. They are also applied in the domain of spontaneous human-machine interaction typically used in information retrieval applications. Some evaluation paradigms are discussed in more detail. It is also shown that official performance tests and site-specific evaluation criteria are complementary in use.

Author information

Authors and Affiliations

Spoken Language Processing Group, LIMSI-CNRS, BP 133, 91403, Orsay Cedex, France
Wolfgang Minker

About this article

Cite this article

Minker, W. The Role of Evaluation in the Development of Spoken Language Systems. International Journal of Speech Technology 3, 5–14 (1999). https://doi.org/10.1023/A:1009660908880

Issue Date: November 1999
DOI: https://doi.org/10.1023/A:1009660908880
