ABSTRACT
Combined word-based and phonetic indexes have been used to improve the performance of spoken document retrieval systems, primarily by addressing the out-of-vocabulary retrieval problem. However, a known problem with phonetic recognition is its limited accuracy in comparison with word-level recognition. We propose a novel method for phonetic retrieval in the CueVideo system based on a probabilistic formulation of term weighting that uses phone confusion data in a Bayesian framework. We evaluate this method of spoken document retrieval against word-based retrieval for the search levels identified in a realistic video-based distributed learning setting. Using our test data, we achieved an average recall of 0.88 with an average precision of 0.69 for retrieval of out-of-vocabulary words on phonetic transcripts with a 35% word error rate. For in-vocabulary words, we achieved a 17% improvement in recall over word-based retrieval with a 17% loss in precision for word error rates ranging from 35% to 65%.
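The abstract does not spell out the term-weighting formulation, so the following is only a generic sketch of how phone confusion data can feed a Bayesian match score. It is not the CueVideo method. The confusion table, the uniform priors, the phone symbols, and the function names `posterior` and `best_match` are all hypothetical and chosen for illustration.

```python
from math import prod

# Hypothetical toy confusion data: CONFUSION[spoken][observed] = P(observed | spoken).
# A real system would estimate these from recognizer output aligned with reference phones.
CONFUSION = {
    "k":  {"k": 0.80, "g": 0.20},
    "g":  {"k": 0.30, "g": 0.70},
    "ae": {"ae": 0.90, "eh": 0.10},
    "eh": {"ae": 0.20, "eh": 0.80},
    "t":  {"t": 0.85, "d": 0.15},
    "d":  {"t": 0.25, "d": 0.75},
}

# Assumed uniform prior over spoken phones (an illustration, not a measured distribution).
PRIOR = {phone: 1.0 / len(CONFUSION) for phone in CONFUSION}


def posterior(spoken, observed):
    """Bayes' rule: P(spoken | observed) from the confusion table and the priors."""
    evidence = sum(CONFUSION[s].get(observed, 0.0) * PRIOR[s] for s in CONFUSION)
    if evidence == 0.0:
        return 0.0
    return CONFUSION[spoken].get(observed, 0.0) * PRIOR[spoken] / evidence


def best_match(query, transcript):
    """Slide the query over the recognized phones; return (best score, start index).

    Each window is scored as the product of per-phone posteriors, which treats
    phone errors as independent substitutions (insertions and deletions are ignored).
    """
    best_score, best_start = 0.0, -1
    n = len(query)
    for start in range(len(transcript) - n + 1):
        window = transcript[start:start + n]
        score = prod(posterior(q, o) for q, o in zip(query, window))
        if score > best_score:
            best_score, best_start = score, start
    return best_score, best_start


# Query phones for a word, matched against an imperfect phonetic transcript.
score, start = best_match(["k", "ae", "t"], ["d", "g", "ae", "t", "eh"])
print(start, round(score, 4))
```

In this toy run the query still matches at the position where the recognizer produced "g" instead of "k", because the confusion data assigns that substitution a nonzero probability. That is the general intuition behind tolerating phonetic recognition errors at retrieval time.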
Index Terms
- Phonetic confusion matrix based spoken document retrieval