Abstract
Prior-art search is a critical step in the examination procedure of a patent application. This study explores automatic query generation from patent documents to facilitate the time-consuming and labor-intensive search for relevant patents. The task requires identifying discriminative terms in the different fields of a query patent, since these terms allow relevant patents to be distinguished from non-relevant ones. To this end, we investigate the distribution of terms occurring in different fields of the query patent and compare these distributions with the rest of the collection using language modeling estimation techniques. We experiment with term weighting based on the Kullback-Leibler divergence between the query patent and the collection, and also with parsimonious language model estimation. Both techniques promote words that are common in the query patent and rare in the collection. We also incorporate the classification assigned to patent documents into our model, to exploit available human judgements in the form of a hierarchical classification. Experimental results show that retrieval using the generated queries is effective, particularly in terms of recall, and that the patent description is the most useful source for extracting query terms.
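As an illustration of the Kullback-Leibler term weighting mentioned in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes each term is scored by its pointwise contribution to KL(query || collection), that is P(t|Q) · log(P(t|Q) / P(t|C)), with additive smoothing on the collection model. The function names, the smoothing scheme, and the toy token lists are all hypothetical and chosen only for this example.

```python
import math
from collections import Counter


def kl_term_scores(query_tokens, collection_tokens, smoothing=1.0):
    """Score each query-patent term by its pointwise contribution to
    KL(query || collection). The additive smoothing is an assumption of
    this sketch, not something stated in the abstract."""
    query_counts = Counter(query_tokens)
    collection_counts = Counter(collection_tokens)
    vocabulary_size = len(set(query_counts) | set(collection_counts))
    query_total = sum(query_counts.values())
    collection_total = sum(collection_counts.values())

    scores = {}
    for term, count in query_counts.items():
        p_query = count / query_total
        # Smoothing keeps the collection probability non-zero for unseen terms.
        p_collection = (collection_counts[term] + smoothing) / (
            collection_total + smoothing * vocabulary_size
        )
        scores[term] = p_query * math.log(p_query / p_collection)
    return scores


def top_query_terms(query_tokens, collection_tokens, k=3):
    """Return the k highest-scoring terms to use as the generated query."""
    scores = kl_term_scores(query_tokens, collection_tokens)
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Toy data (hypothetical): one query patent field and a small collection.
query_patent = ["rotor", "rotor", "blade", "the", "the"]
collection = ["the"] * 50 + ["of"] * 30 + ["engine"] * 17 + ["blade"] * 2 + ["rotor"]

print(top_query_terms(query_patent, collection, k=2))
```

In this toy setting, a term such as "the" is frequent in the query patent but even more frequent in the collection, so its score is negative and it is not selected, while terms concentrated in the query patent rank first. The same scoring could be applied separately to each patent field (for example title, abstract, claims, description) to compare fields as sources of query terms.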
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Mahdabi, P., Keikha, M., Gerani, S., Landoni, M., Crestani, F. (2011). Building Queries for Prior-Art Search. In: Hanbury, A., Rauber, A., de Vries, A.P. (eds) Multidisciplinary Information Retrieval. IRFC 2011. Lecture Notes in Computer Science, vol 6653. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-21353-3_2
DOI: https://doi.org/10.1007/978-3-642-21353-3_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-21352-6
Online ISBN: 978-3-642-21353-3
eBook Packages: Computer Science (R0)