The symbiotic relationship between information retrieval and informetrics

Scientometrics

Abstract

Informetrics and information retrieval (IR) represent fundamental areas of study within information science. Historically, researchers have not fully capitalized on the potential research synergies that exist between these two areas. Data sources used in traditional informetrics studies have their analogues in IR, with similar types of empirical regularities found in IR system content and use. Methods for data collection and analysis used in informetrics can help to inform IR system development and evaluation. Areas of application have included automatic indexing, index term weighting and understanding user query and session patterns through the quantitative analysis of user transaction logs. Similarly, developments in database technology have made the study of informetric phenomena less cumbersome, and recent innovations used in IR research, such as language models and ranking algorithms, provide new tools that may be applied to research problems of interest to informetricians. Building on the author’s previous work (Wolfram 2003), this paper reviews a sample of relevant literature published primarily since 2000 to highlight how each area of study may help to inform and benefit the other.

References

  • Ajiferuke, I., Wolfram, D., & Xie, H. (2004). Modelling website visitation and resource usage characteristics by IP address data. In H. Julien & S. Thompson (Eds.), CAIS/ACSI 2004: Access to Information: Technologies, Skills, and Socio-Political Context. http://www.cais-acsi.ca/proceedings/2004/ajiferuke_2004.pdf. Accessed January 25, 2014.

  • Almind, P., & Ingwersen, P. (1997). Informetric analyses on the world wide web: Methodological approaches to “Webometrics”. Journal of Documentation, 53, 404–426.

  • Bar-Ilan, J. (2008). Informetrics at the beginning of the 21st century—A review. Journal of Informetrics, 2(1), 1–52.

  • Bar-Ilan, J., & Peritz, B. (2009). The lifespan of “informetrics” on the web: An eight year study (1998–2006). Scientometrics, 79(1), 7–25.

  • Bassecoulard, E., Lelu, A., & Zitt, M. (2007). A modular sequence of retrieval procedures to delineate a scientific field: From vocabulary to citations and back. In E. Torres-Salinas & H. F. Moed (Eds.), 11th International Conference on Scientometrics and Informetrics (ISSI 2007) (pp. 74–84). Madrid, Spain, 25–27 June 2007.

  • Blei, D. M., & Lafferty, J. D. (2009). Topic Models. In A. N. Srivastava & M. Sahami (Eds.), Classification, clustering, and applications (pp. 71–94). Boca Raton, FL: Chapman & Hall/CRC.

  • Bollen, J., Rodriguez, M. A., & Van de Sompel, H. (2006). Journal status. Scientometrics, 69(3), 669–687.

  • Bollen, J., Van de Sompel, H., Hagberg, A., Bettencourt, L., Chute, R., Rodriguez, M. A., et al. (2009). Clickstream data yields high-resolution maps of science. PLoS ONE, 4(3), e4803.

  • Börner, K. (2010). Atlas of science: Visualizing what we know. Cambridge, MA: MIT Press.

  • Bradford, S. C. (1934). Sources of information on specific subjects. Engineering, 137, 85–86.

  • Brin, S., & Page, L. (1998). The anatomy of a large-scale hypertextual web search engine. Computer Networks and ISDN Systems, 30, 107–117.

  • Chen, H.-M., & Cooper, M. D. (2001). Using clustering techniques to detect usage patterns in a web-based information system. Journal of the American Society for Information Science and Technology, 52(11), 888–904.

  • Ding, Y. (2011). Applying weighted PageRank to author citation networks. Journal of the American Society for Information Science and Technology, 62(2), 236–245.

  • Ding, Y., Yan, E., Frazho, A., & Caverlee, J. (2009). PageRank for ranking authors in co-citation networks. Journal of the American Society for Information Science and Technology, 60(11), 2229–2243.

  • Egghe, L. (1990). The duality of informetrics systems with applications to the empirical laws. Journal of Information Science, 16, 17–27.

  • Egghe, L., & Rousseau, R. (1997). Duality in information retrieval and the hypergeometric distribution. Journal of Documentation, 53(5), 488–496.

  • Fuhr, N. (1992). Probabilistic models in information retrieval. The Computer Journal, 35(3), 243–255.

  • Han, H. J., Joo, S., & Wolfram, D. (2014). Using transaction logs to better understand user search session patterns in an image-based digital library. Journal of the Korean Biblia Society for Library and Information Science, 25(1), 19–37.

  • Liu, X., Bollen, J., Nelson, M. L., & Van de Sompel, H. (2005). Co-authorship networks in the digital library research community. Information Processing and Management, 41(6), 1462–1480.

  • Lu, K., & Wolfram, D. (2012). Measuring author research relatedness: A comparison of word-based, topic-based and author co-citation approaches. Journal of the American Society for Information Science and Technology, 63(10), 1973–1986.

  • Mann, G. S., Mimno, D., & McCallum, A. (2006). Bibliometric impact measures leveraging topic analysis. The ACM Joint Conference on Digital Libraries, June 11–15, 2006, Chapel Hill, North Carolina, USA.

  • Mayr, P. (2013). Relevance distributions across Bradford Zones: Can Bradfordizing improve search? In J. Gorraiz, E. Schiebel, C. Gumpenberger, M. Hörlesberger, & H. Moed (Eds.), 14th International Society of Scientometrics and Informetrics Conference (pp. 1493–1505). Vienna, Austria.

  • Mayr, P., & Mutschke, P. (2013). Bibliometric-enhanced retrieval models for big scholarly information systems. In 2013 IEEE International Conference on Big Data (pp. 5–8). IEEE.

  • Mutschke, P., Mayr, P., Schaer, P., & Sure, Y. (2011). Science models as value-added services for scholarly information systems. Scientometrics, 89(1), 349–364.

  • Page, L., Brin, S., Motwani, R., & Winograd, T. (1999). The PageRank citation ranking: Bringing order to the web. http://ilpubs.stanford.edu:8090/422/1/1999-66.pdf. Accessed June 10, 2014.

  • Peat, H. J., & Willett, P. (1991). The limitations of term co-occurrence data for query expansion in document retrieval systems. Journal of the American Society for Information Science, 42(5), 378–383.

  • Ponte, J., & Croft, W. B. (1998). A language modeling approach to information retrieval. In W. B. Croft, A. Moffat, C. J. van Rijsbergen, R. Wilkinson & J. Zobel (Eds.), Proceedings of the 21st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 275–281). New York: ACM Press.

  • Rosen-Zvi, M., Griffiths, T., Steyvers, M., & Smyth, P. (2004). The author-topic model for authors and documents. In C. Meek & J. Halpern (Eds.), Proceedings of the 20th Conference on Uncertainty in Artificial Intelligence (pp. 487–494). AUAI Press.

  • Salton, G. (1989). Automatic text processing: The transformation, analysis and retrieval of information by computer. Reading, MA: Addison-Wesley Publishing Company.

  • Saracevic, T. (1975). RELEVANCE: A review of and a framework for the thinking on the notion in information science. Journal of the American Society for Information Science, 26(6), 321–343.

  • Saxena, A., Gupta, B. M., & Jauhari, M. (2007). Exploring models for the growth of literature data. DESIDOC Bulletin of Information Technology, 27(3), 3–12.

  • Schneider, J. W., & Borlund, P. (2004). Introduction to bibliometrics for construction and maintenance of thesauri: Methodical considerations. Journal of Documentation, 60(5), 524–549.

  • Schneider, J. W., & Borlund, P. (2005). A bibliometric-based semi-automatic approach to identification of candidate thesaurus terms: Parsing and filtering of noun phrases from citation contexts. In F. Crestani & I. Ruthven (Eds.), Information Context: Nature, Impact, and Role: 5th International Conference on Conceptions of Library and Information Sciences, CoLIS 2005 (pp. 226–237). Berlin: Springer.

  • Song, M., & Ding, Y. (2014). Topic modeling: Measuring scholarly impact using a topical lens. In Y. Ding, R. Rousseau, & D. Wolfram (Eds.), Measuring scholarly impact: Methods and practice (pp. 235–257). New York: Springer.

  • Spink, A., Jansen, B. J., Wolfram, D., & Saracevic, T. (2002). From e-sex to e-commerce: Web search changes. Computer Magazine, 35(3), 107–109.

  • Tang, J., Jin, R., & Zhang, J. (2008). A topic modeling approach and its integration into the random walk framework for academic search. In F. Giannotti, D. Gunopulos, F. Turini, C. Zaniolo, N. Ramakrishnan, & X. Wu (Eds.), Eighth IEEE International Conference on Data Mining (ICDM’08) (pp. 1055–1060). IEEE.

  • Thelwall, M. (2009). Introduction to webometrics: Quantitative Web research for the social sciences. Synthesis lectures on information concepts, retrieval, and services, 1(1), 1–116.

  • Thelwall, M., Vaughan, L., & Björneborn, L. (2005). Webometrics. In B. Cronin (Ed.), Annual review of information science and technology (Vol. 39, pp. 81–135). Medford, NJ: Information Today.

  • Waltman, L., & Yan, E. (2014). PageRank-related methods for analyzing citation networks. In Y. Ding, R. Rousseau, & D. Wolfram (Eds.), Measuring scholarly impact: Methods and practice (pp. 83–100). New York: Springer.

  • White, H. D. (2007a). Combining bibliometrics, information retrieval, and relevance theory, part 1: First examples of a synthesis. Journal of the American Society for Information Science and Technology, 58(4), 536–559.

  • White, H. D. (2007b). Combining bibliometrics, information retrieval, and relevance theory, part 2: Some implications for information science. Journal of the American Society for Information Science and Technology, 58(4), 583–605.

  • Wilson, C. S. (1999). Informetrics. In M. Williams (Ed.), Annual review of information science and technology (Vol. 34, pp. 107–247). Medford, NJ: Information Today.

  • Wolfram, D. (2000). A query-level examination of end user searching behaviour on the Excite search engine. In H. Olson (Ed.), Proceedings of the 28th Annual Conference of the Canadian Association for Information Science. http://www.cais-acsi.ca/proceedings/2000/wolfram_2000.pdf. Accessed June 10, 2014.

  • Wolfram, D. (2003). Applied informetrics for information retrieval research. Westport, CT: Libraries Unlimited.

  • Wolfram, D. (2008). Search characteristics in different types of Web-based IR environments: Are they the same? Information Processing and Management, 44, 1279–1292.

  • Wolfram, D., Wang, P., & Zhang, J. (2009). Identifying web search session patterns using cluster analysis: A comparison of three search environments. Journal of the American Society for Information Science and Technology, 60(5), 896–910.

  • Wolfram, D., & Zhang, J. (2008). The influence of indexing practices and term weighting algorithms on document spaces. Journal of the American Society for Information Science and Technology, 59(1), 3–11.

  • Wormell, I. (1998). Informetrics: Exploring databases as analytical tools. Database, 21(5), 25–30.

  • Wren, J. D. (2008). URL decay in MEDLINE—a 4-year follow-up study. Bioinformatics, 24(11), 1381–1385.

  • Xie, I. (2008). Interactive information retrieval in digital environments. Hershey, PA: IGI Publishing.

  • Yan, E. (2014). Topic-based Pagerank: Toward a topic-level scientific evaluation. Scientometrics, 100(2), 407–437.

  • Yan, E., Ding, Y., Milojević, S., & Sugimoto, C. R. (2012). Topics in dynamic research communities: An exploratory study for the field of information retrieval. Journal of Informetrics, 6(1), 140–153.

  • Zitt, M., & Bassecoulard, E. (2006). Delineating complex scientific fields by hybrid lexical-citation method: An application to nanoscience. Information Processing and Management, 42(6), 1513–1531.

Author information

Corresponding author

Correspondence to Dietmar Wolfram.

About this article

Cite this article

Wolfram, D. The symbiotic relationship between information retrieval and informetrics. Scientometrics 102, 2201–2214 (2015). https://doi.org/10.1007/s11192-014-1479-0

  • DOI: https://doi.org/10.1007/s11192-014-1479-0
