Phonetic confusion matrix based spoken document retrieval

Published: 01 July 2000

ABSTRACT

Combined word-based and phonetic indexes have been used to improve the performance of spoken document retrieval systems, primarily by addressing the out-of-vocabulary retrieval problem. However, a known problem with phonetic recognition is its limited accuracy in comparison with word-level recognition. We propose a novel method for phonetic retrieval in the CueVideo system based on a probabilistic formulation of term weighting that uses phone confusion data in a Bayesian framework. We evaluate this method of spoken document retrieval against word-based retrieval for the search levels identified in a realistic video-based distributed learning setting. Using our test data, we achieved an average recall of 0.88 with an average precision of 0.69 for retrieval of out-of-vocabulary words on phonetic transcripts with a 35% word error rate. For in-vocabulary words, we achieved a 17% improvement in recall over word-based retrieval with a 17% loss in precision for word error rates ranging from 35% to 65%.
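The abstract does not spell out the probabilistic formulation, so the following is only a minimal sketch of the general idea of scoring a query against a phonetic transcript with phone confusion data. It is not the paper's method. The confusion probabilities, the phone labels, the function names, and the flooring constant are all hypothetical, and the alignment is deliberately simplified to substitutions only (no insertions or deletions). A full Bayesian treatment would also combine this likelihood with a prior over query terms, which is omitted here.

```python
import math

# Hypothetical confusion data: CONFUSION[intended][observed] is an assumed
# probability that the recognizer outputs `observed` when `intended` was spoken.
CONFUSION = {
    "k": {"k": 0.8, "g": 0.2},
    "ae": {"ae": 0.7, "eh": 0.3},
    "t": {"t": 0.9, "d": 0.1},
}

FLOOR = 1e-6  # assumed floor for phone pairs missing from the confusion data


def match_log_likelihood(query_phones, observed_phones, confusion=CONFUSION):
    """Sum log P(observed | intended) over position-aligned phone pairs."""
    if len(query_phones) != len(observed_phones):
        raise ValueError("sketch only handles equal-length alignments")
    total = 0.0
    for intended, observed in zip(query_phones, observed_phones):
        prob = confusion.get(intended, {}).get(observed, FLOOR)
        total += math.log(prob)
    return total


def best_window(query_phones, transcript_phones, confusion=CONFUSION):
    """Slide the query over the transcript; return (best score, start index)."""
    n = len(query_phones)
    best_score, best_start = -math.inf, -1
    for start in range(len(transcript_phones) - n + 1):
        window = transcript_phones[start:start + n]
        score = match_log_likelihood(query_phones, window, confusion)
        if score > best_score:
            best_score, best_start = score, start
    return best_score, best_start


# Toy usage: the query "k ae t" was recognized as "g ae t" in the transcript.
query = ["k", "ae", "t"]
transcript = ["dh", "ax", "g", "ae", "t", "s"]
score, start = best_window(query, transcript)
```

In this toy case the misrecognized window still wins because the assumed confusion data assigns a non-trivial probability to "k" being heard as "g", which illustrates how confusion-aware scoring can tolerate phone recognition errors that would defeat exact string matching.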

Published in

SIGIR '00: Proceedings of the 23rd annual international ACM SIGIR conference on Research and development in information retrieval
July 2000
396 pages
ISBN: 1581132263
DOI: 10.1145/345508

Copyright © 2000 ACM


Publisher

Association for Computing Machinery, New York, NY, United States


Acceptance Rates

Overall Acceptance Rate: 792 of 3,983 submissions, 20%
