Abstract
Informetrics and information retrieval (IR) are fundamental areas of study within information science. Historically, researchers have not fully capitalized on the potential research synergies between the two. Data sources used in traditional informetrics studies have their analogues in IR, and similar types of empirical regularities are found in IR system content and use. Methods for data collection and analysis used in informetrics can help to inform IR system development and evaluation. Areas of application have included automatic indexing, index term weighting, and the understanding of user query and session patterns through the quantitative analysis of user transaction logs. Similarly, developments in database technology have made the study of informetric phenomena less cumbersome, and recent innovations used in IR research, such as language models and ranking algorithms, provide new tools that may be applied to research problems of interest to informetricians. Building on the author’s previous work (Wolfram, 2003), this paper reviews a sample of relevant literature published primarily since 2000 to highlight how each area of study may help to inform and benefit the other.
References
Ajiferuke, I., Wolfram, D., & Xie, H. (2004). Modelling website visitation and resource usage characteristics by IP address data. In H. Julien & S. Thompson (Eds.), CAIS/ACSI 2004—Access to Information: Technologies, Skills, and Socio-Political Context. http://www.cais-acsi.ca/proceedings/2004/ajiferuke_2004.pdf. Accessed January 25, 2014.
Almind, P., & Ingwersen, P. (1997). Informetric analyses on the world wide web: Methodological approaches to “Webometrics”. Journal of Documentation, 53, 404–426.
Bar-Ilan, J. (2008). Informetrics at the beginning of the 21st century—A review. Journal of Informetrics, 2(1), 1–52.
Bar-Ilan, J., & Peritz, B. (2009). The lifespan of “informetrics” on the web: An eight year study (1998–2006). Scientometrics, 79(1), 7–25.
Bassecoulard, E., Lelu, A., & Zitt, M. (2007). A modular sequence of retrieval procedures to delineate a scientific field: From vocabulary to citations and back. In E. Torres-Salinas & H. F. Moed (Eds.), 11th International Conference on Scientometrics and Informetrics (ISSI 2007) (pp. 74–84). Madrid, Spain, 25–27 June 2007.
Blei, D. M., & Lafferty, J. D. (2009). Topic models. In A. N. Srivastava & M. Sahami (Eds.), Classification, clustering, and applications (pp. 71–94). Boca Raton, FL: Chapman & Hall/CRC.
Bollen, J., Rodriguez, M. A., & Van de Sompel, H. (2006). Journal status. Scientometrics, 69(3), 669–687.
Bollen, J., Van de Sompel, H., Hagberg, A., Bettencourt, L., Chute, R., Rodriguez, M. A., et al. (2009). Clickstream data yields high-resolution maps of science. PLoS ONE, 4(3), e4803.
Börner, K. (2010). Atlas of science: Visualizing what we know. Cambridge, MA: MIT Press.
Bradford, S. C. (1934). Sources of information on specific subjects. Engineering, 137, 85–86.
Brin, S., & Page, L. (1998). The anatomy of a large-scale hypertextual web search engine. Computer Networks and ISDN Systems, 30, 107–117.
Chen, H.-M., & Cooper, M. D. (2001). Using clustering techniques to detect usage patterns in a web-based information system. Journal of the American Society for Information Science and Technology, 52(11), 888–904.
Ding, Y. (2011). Applying weighted PageRank to author citation networks. Journal of the American Society for Information Science and Technology, 62(2), 236–245.
Ding, Y., Yan, E., Frazho, A., & Caverlee, J. (2009). PageRank for ranking authors in co-citation networks. Journal of the American Society for Information Science and Technology, 60(11), 2229–2243.
Egghe, L. (1990). The duality of informetrics systems with applications to the empirical laws. Journal of Information Science, 16, 17–27.
Egghe, L., & Rousseau, R. (1997). Duality in information retrieval and the hypergeometric distribution. Journal of Documentation, 53(5), 488–496.
Fuhr, N. (1992). Probabilistic models in information retrieval. The Computer Journal, 35(3), 243–255.
Han, H. J., Joo, S., & Wolfram, D. (2014). Using transaction logs to better understand user search session patterns in an image-based digital library. Journal of the Korean Biblia Society for Library and Information Science, 25(1), 19–37.
Liu, X., Bollen, J., Nelson, M. L., & Van de Sompel, H. (2005). Co-authorship networks in the digital library research community. Information Processing and Management, 41(6), 1462–1480.
Lu, K., & Wolfram, D. (2012). Measuring author research relatedness: A comparison of word-based, topic-based and author co-citation approaches. Journal of the American Society for Information Science and Technology, 63(10), 1973–1986.
Mann, G. S., Mimno, D., & McCallum, A. (2006). Bibliometric impact measures leveraging topic analysis. The ACM Joint Conference on Digital Libraries, June 11–15, 2006, Chapel Hill, North Carolina, USA.
Mayr, P. (2013). Relevance distributions across Bradford Zones: Can Bradfordizing improve search? In J. Gorraiz, E. Schiebel, C. Gumpenberger, M. Hörlesberger, & H. Moed (Eds.), 14th International Society of Scientometrics and Informetrics Conference (pp. 1493–1505). Vienna, Austria.
Mayr, P., & Mutschke, P. (2013). Bibliometric-enhanced retrieval models for big scholarly information systems. In Big Data, 2013 IEEE International Conference on, IEEE (pp. 5–8).
Mutschke, P., Mayr, P., Schaer, P., & Sure, Y. (2011). Science models as value-added services for scholarly information systems. Scientometrics, 89(1), 349–364.
Page, L., Brin, S., Motwani, R., & Winograd, T. (1999). The PageRank citation ranking: Bringing order to the web. http://ilpubs.stanford.edu:8090/422/1/1999-66.pdf. Accessed June 10, 2014.
Peat, H. J., & Willett, P. (1991). The limitations of term co-occurrence data for query expansion in document retrieval systems. Journal of the American Society for Information Science, 42(5), 378–383.
Ponte, J., & Croft, W. B. (1998). A language modeling approach to information retrieval. In W. B. Croft, A. Moffat, C. J. van Rijsbergen, R. Wilkinson & J. Zobel (Eds.), Proceedings of the 21st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 275–281). New York: ACM Press.
Rosen-Zvi, M., Griffiths, T., Steyvers, M., & Smyth, P. (2004). The author-topic model for authors and documents. In C. Meek & J. Halpern (Eds.), Proceedings of the 20th Conference on Uncertainty in Artificial Intelligence (pp. 487–494). AUAI Press.
Salton, G. (1989). Automatic text processing: The transformation, analysis and retrieval of information by computer. Reading, MA: Addison-Wesley Publishing Company.
Saracevic, T. (1975). RELEVANCE: A review of and a framework for the thinking on the notion in information science. Journal of the American Society for Information Science, 26(6), 321–343.
Saxena, A., Gupta, B. M., & Jauhari, M. (2007). Exploring models for the growth of literature data. DESIDOC Bulletin of Information Technology, 27(3), 3–12.
Schneider, J. W., & Borlund, P. (2004). Introduction to bibliometrics for construction and maintenance of thesauri: Methodical considerations. Journal of Documentation, 60(5), 524–549.
Schneider, J. W., & Borlund, P. (2005). A bibliometric-based semi-automatic approach to identification of candidate thesaurus terms: Parsing and filtering of noun phrases from citation contexts. In F. Crestani & I. Ruthven (Eds.), Information Context: Nature, Impact, and Role: 5th International Conference on Conceptions of Library and Information Sciences, CoLIS 2005 (pp. 226–237). Berlin: Springer.
Song, M., & Ding, Y. (2014). Topic modeling: Measuring scholarly impact using a topical lens. In Y. Ding, R. Rousseau, & D. Wolfram (Eds.), Measuring scholarly impact: Methods and practice (pp. 235–257). New York: Springer.
Spink, A., Jansen, B. J., Wolfram, D., & Saracevic, T. (2002). From e-sex to e-commerce: Web search changes. Computer Magazine, 35(3), 107–109.
Tang, J., Jin, R., & Zhang, J. (2008). A topic modeling approach and its integration into the random walk framework for academic search. In F. Giannotti, D. Gunopulos, F. Turini, C. Zaniolo, N. Ramakrishnan, & X. Wu (Eds.), Data Mining, 2008. ICDM’08. Eighth IEEE International Conference on, IEEE (pp. 1055–1060).
Thelwall, M. (2009). Introduction to webometrics: Quantitative Web research for the social sciences. Synthesis lectures on information concepts, retrieval, and services, 1(1), 1–116.
Thelwall, M., Vaughan, L., & Björneborn, L. (2005). Webometrics. In B. Cronin (Ed.), Annual review of information science and technology (Vol. 39, pp. 81–135). Medford, NJ: Information Today.
Waltman, L., & Yan, E. (2014). PageRank-related methods for analyzing citation networks. In Y. Ding, R. Rousseau, & D. Wolfram (Eds.), Measuring scholarly impact: Methods and practice (pp. 83–100). New York: Springer.
White, H. D. (2007a). Combining bibliometrics, information retrieval, and relevance theory, part 1: First examples of a synthesis. Journal of the American Society for Information Science and Technology, 58(4), 536–559.
White, H. D. (2007b). Combining bibliometrics, information retrieval, and relevance theory, part 2: Some implications for information science. Journal of the American Society for Information Science and Technology, 58(4), 583–605.
Wilson, C. S. (1999). Informetrics. In M. Williams (Ed.), Annual review of information science and technology (Vol. 34, pp. 107–247). Medford, NJ: Information Today.
Wolfram, D. (2000). A query-level examination of end user searching behaviour on the Excite search engine. In H. Olson (Ed.), Proceedings of the 28th Annual Conference of the Canadian Association for Information Science. http://www.cais-acsi.ca/proceedings/2000/wolfram_2000.pdf. Accessed June 10, 2014.
Wolfram, D. (2003). Applied informetrics for information retrieval research. Westport, CT: Libraries Unlimited.
Wolfram, D. (2008). Search characteristics in different types of Web-based IR environments: Are they the same? Information Processing and Management, 44, 1279–1292.
Wolfram, D., Wang, P., & Zhang, J. (2009). Identifying web search session patterns using cluster analysis: A comparison of three search environments. Journal of the American Society for Information Science and Technology, 60(5), 896–910.
Wolfram, D., & Zhang, J. (2008). The influence of indexing practices and term weighting algorithms on document spaces. Journal of the American Society for Information Science and Technology, 59(1), 3–11.
Wormell, I. (1998). Informetrics: Exploring databases as analytical tools. Database, 21(5), 25–30.
Wren, J. D. (2008). URL decay in MEDLINE—a 4-year follow-up study. Bioinformatics, 24(11), 1381–1385.
Xie, I. (2008). Interactive information retrieval in digital environments. Hershey, PA: IGI Publishing.
Yan, E. (2014). Topic-based Pagerank: Toward a topic-level scientific evaluation. Scientometrics, 100(2), 407–437.
Yan, E., Ding, Y., Milojević, S., & Sugimoto, C. R. (2012). Topics in dynamic research communities: An exploratory study for the field of information retrieval. Journal of Informetrics, 6(1), 140–153.
Zitt, M., & Bassecoulard, E. (2006). Delineating complex scientific fields by hybrid lexical-citation method: An application to nanoscience. Information Processing and Management, 42(6), 1513–1531.
Cite this article
Wolfram, D. The symbiotic relationship between information retrieval and informetrics. Scientometrics 102, 2201–2214 (2015). https://doi.org/10.1007/s11192-014-1479-0
DOI: https://doi.org/10.1007/s11192-014-1479-0