DOI: 10.1145/1177080.1177110
Article

Web search clickstreams

Published: 25 October 2006

ABSTRACT

Search engines are a vital part of the Web and thus of the Internet infrastructure. Understanding the behavior of users searching the Web therefore gives insights into trends and enables enhancements of future search capabilities. Possible data sources for studying Web search behavior are server-side logs and client-side logs. Unfortunately, current server-side logs are hard to obtain because search engine operators consider them proprietary. In this paper we therefore present a methodology for extracting client-side logs from the traffic exchanged between a large user group and the Internet. An added benefit of our methodology is that we extract not only the search terms, query sequences, and search results of each individual user but also the full clickstream, i.e., the result pages users view and the hyperlinked pages they subsequently visit. We propose a finite-state Markov model that captures users' Web searching and browsing behavior and allows us to deduce their prevalent search patterns. To our knowledge, this is the first such detailed client-side analysis of clickstreams.
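
As an illustration of the kind of finite-state Markov model the abstract mentions, the following minimal sketch estimates first-order transition probabilities from per-user state sequences. The abstract does not list the model's states or how it is fitted, so the state labels (`query`, `result_click`, `follow_link`), the function name `estimate_transitions`, and the sample sessions are hypothetical and chosen only for demonstration.

```python
from collections import Counter, defaultdict


def estimate_transitions(sessions):
    """Estimate first-order Markov transition probabilities.

    `sessions` is a list of state sequences, one per user session.
    Returns {source_state: {next_state: probability}} using simple
    relative frequencies (maximum-likelihood estimates).
    """
    counts = defaultdict(Counter)
    for states in sessions:
        # Count each observed transition between consecutive states.
        for src, dst in zip(states, states[1:]):
            counts[src][dst] += 1
    return {
        src: {dst: n / sum(dsts.values()) for dst, n in dsts.items()}
        for src, dsts in counts.items()
    }


# Hypothetical clickstream sessions; the state names are assumptions,
# not the state set used in the paper.
sessions = [
    ["query", "result_click", "follow_link", "query", "result_click"],
    ["query", "query", "result_click"],
]

probs = estimate_transitions(sessions)
for src, dsts in probs.items():
    print(src, dsts)
```

Each row of the resulting table sums to 1, so frequent transitions (for example, a query followed directly by another query) can be read off as candidate search patterns.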


            Published in

            IMC '06: Proceedings of the 6th ACM SIGCOMM conference on Internet measurement
            October 2006
            356 pages
            ISBN: 1595935614
            DOI: 10.1145/1177080

            Copyright © 2006 ACM

            Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

            Publisher

            Association for Computing Machinery

            New York, NY, United States

            Acceptance Rates

            Overall Acceptance Rate: 277 of 1,083 submissions, 26%
