DOI: 10.1145/2396761.2396854
research-article

A math-aware search engine for math question answering system

Published: 29 October 2012

ABSTRACT

We propose a math-aware search engine that handles both textual keywords and mathematical expressions. Our math feature extraction and representation framework captures the semantics of math expressions via a Finite State Machine model. We adapt the passive-aggressive online learning binary classifier as the ranking model. We benchmarked our approach against three classical information retrieval (IR) strategies on math documents crawled from Math Overflow, a well-known online math question answering system. Experimental results show that our proposed approach outperforms the other methods by more than 9%.
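The abstract does not spell out how the passive-aggressive classifier is adapted for ranking. As a rough illustration only, the sketch below shows the generic PA-I update for a binary classifier and one way its score could be used to order documents. The feature layout (a text-match score and a math-match score per query-document pair), the function names, and the parameter `C` are assumptions made for this example, not details taken from the paper.

```python
# A minimal sketch of the generic PA-I binary classifier, used here as a scorer.
# Assumption: each query-document pair is a hypothetical feature vector
# [text_match, math_match]; labels are +1 (relevant) or -1 (not relevant).

def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(ai * bi for ai, bi in zip(a, b))

def pa_update(w, x, y, C=1.0):
    """One PA-I step: leave w unchanged if the hinge loss is zero,
    otherwise move it just far enough to fix the margin, capped by C."""
    loss = max(0.0, 1.0 - y * dot(w, x))
    norm_sq = dot(x, x)
    if loss == 0.0 or norm_sq == 0.0:
        return list(w)  # passive: the example is already classified with margin
    tau = min(C, loss / norm_sq)  # aggressive: step size of the correction
    return [wi + tau * y * xi for wi, xi in zip(w, x)]

def train(samples, dim, C=1.0, epochs=5):
    """Run the online update over (features, label) pairs for a few passes."""
    w = [0.0] * dim
    for _ in range(epochs):
        for x, y in samples:
            w = pa_update(w, x, y, C)
    return w

def rank(w, docs):
    """Return document indices sorted by decreasing classifier score."""
    return sorted(range(len(docs)), key=lambda i: dot(w, docs[i]), reverse=True)
```

Using the raw score `w · x` as a ranking key is only one possible way to turn a binary classifier into a ranker; the paper may adapt the learner differently.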

Published in

CIKM '12: Proceedings of the 21st ACM international conference on Information and knowledge management
October 2012
2840 pages
ISBN: 9781450311564
DOI: 10.1145/2396761

      Copyright © 2012 ACM

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Acceptance Rates

      Overall Acceptance Rate: 1,861 of 8,427 submissions, 22%
