Building Queries for Prior-Art Search

  • Conference paper
Multidisciplinary Information Retrieval (IRFC 2011)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6653)

Abstract

Prior-art search is a critical step in the examination procedure of a patent application. This study explores automatic query generation from patent documents to ease the time-consuming and labor-intensive search for relevant patents. The task requires identifying discriminative terms in the different fields of a query patent, since these terms are what distinguish relevant patents from non-relevant ones. To this end, we investigate the distribution of terms occurring in different fields of the query patent and compare it with the term distribution of the rest of the collection, using language modeling estimation techniques. We experiment with term weighting based on the Kullback-Leibler divergence between the query patent and the collection, and also with parsimonious language model estimation. Both techniques promote words that are common in the query patent and rare in the collection. We also incorporate the classification assigned to patent documents into our model, to exploit available human judgements in the form of a hierarchical classification. Experimental results show that retrieval using the generated queries is effective, particularly in terms of recall, and that the patent description is the most useful source for extracting query terms.
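The two weighting schemes named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the add-one smoothing of the collection model, and the parameter values (`lam`, `iterations`, `threshold`) are assumptions chosen for illustration. The first function scores each term by its pointwise contribution to the Kullback-Leibler divergence between the query-patent model and the collection model. The second follows the usual EM formulation of parsimonious language models.

```python
import math
from collections import Counter


def _collection_prob(term, collection_counts, coll_len, vocab_size):
    # Add-one smoothing (an assumption) so unseen query terms get non-zero mass.
    return (collection_counts.get(term, 0) + 1) / (coll_len + vocab_size)


def kl_term_scores(query_tokens, collection_counts):
    """Score each query term by P(t|Q) * log(P(t|Q) / P(t|C))."""
    query_counts = Counter(query_tokens)
    query_len = sum(query_counts.values())
    coll_len = sum(collection_counts.values())
    vocab_size = len(set(collection_counts) | set(query_counts))
    scores = {}
    for term, tf in query_counts.items():
        p_q = tf / query_len
        p_c = _collection_prob(term, collection_counts, coll_len, vocab_size)
        scores[term] = p_q * math.log(p_q / p_c)
    return scores


def parsimonious_model(query_tokens, collection_counts,
                       lam=0.1, iterations=20, threshold=1e-4):
    """Estimate a parsimonious query model with EM (illustrative parameters)."""
    query_counts = Counter(query_tokens)
    query_len = sum(query_counts.values())
    coll_len = sum(collection_counts.values())
    vocab_size = len(set(collection_counts) | set(query_counts))
    p_c = {t: _collection_prob(t, collection_counts, coll_len, vocab_size)
           for t in query_counts}
    # Start from the maximum-likelihood estimate of the query model.
    p_q = {t: tf / query_len for t, tf in query_counts.items()}
    for _ in range(iterations):
        # E-step: expected count of each term explained by the query model.
        expected = {
            t: query_counts[t] * lam * p / ((1 - lam) * p_c[t] + lam * p)
            for t, p in p_q.items()
        }
        norm = sum(expected.values())
        # M-step: renormalise, then prune terms below the threshold.
        kept = {t: e / norm for t, e in expected.items()
                if e / norm >= threshold}
        if not kept:
            break
        total = sum(kept.values())
        p_q = {t: p / total for t, p in kept.items()}
    return p_q
```

In both functions, a term that is frequent in the query patent but rare in the collection ends up with a higher weight than a common function word. The top-ranked terms of a given field can then be taken as the query.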



Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Mahdabi, P., Keikha, M., Gerani, S., Landoni, M., Crestani, F. (2011). Building Queries for Prior-Art Search. In: Hanbury, A., Rauber, A., de Vries, A.P. (eds) Multidisciplinary Information Retrieval. IRFC 2011. Lecture Notes in Computer Science, vol 6653. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-21353-3_2

  • DOI: https://doi.org/10.1007/978-3-642-21353-3_2

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-21352-6

  • Online ISBN: 978-3-642-21353-3

  • eBook Packages: Computer Science (R0)
