ABSTRACT
User modeling on the Web has rested on the fundamental assumption of Markovian behavior: a user's next action depends only on her current state, not on the history leading up to it. This assumption underpins PageRank web ranking, as well as a number of techniques for targeting advertising to users. In this work we examine the validity of this assumption, using data from a number of Web settings. Our main result invokes statistical order estimation tests for Markov chains to establish that Web users are not, in fact, Markovian. We study the extent to which the Markovian assumption is invalid and derive a number of avenues for further research.
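To make the idea of order estimation concrete, the following is a minimal sketch of one standard estimator, the Bayesian Information Criterion (BIC), applied to a sequence of discrete states. It is an illustration only: the abstract does not say which tests the authors use, and the function names (`max_log_likelihood`, `estimate_order_bic`, `simulate_binary_clicks`) and the two-state toy data are assumptions introduced here, not taken from the paper. An estimated order of 1 is consistent with the Markovian assumption; a higher order means the history beyond the current state carries information.

```python
import math
import random
from collections import Counter


def max_log_likelihood(seq, k, start):
    """Maximized log-likelihood of seq[start:] under an order-k Markov chain."""
    context_counts = Counter()
    transition_counts = Counter()
    for i in range(start, len(seq)):
        context = tuple(seq[i - k:i])  # the k states preceding position i
        context_counts[context] += 1
        transition_counts[(context, seq[i])] += 1
    # Plug in the empirical transition probabilities count / context_count.
    return sum(
        count * math.log(count / context_counts[context])
        for (context, _), count in transition_counts.items()
    )


def estimate_order_bic(seq, max_order):
    """Return the order k in [0, max_order] that minimizes the BIC score."""
    num_states = len(set(seq))
    n = len(seq) - max_order  # every order is scored on the same positions
    best_order, best_score = 0, math.inf
    for k in range(max_order + 1):
        free_params = (num_states ** k) * (num_states - 1)
        score = (-max_log_likelihood(seq, k, max_order)
                 + 0.5 * free_params * math.log(n))
        if score < best_score:
            best_order, best_score = k, score
    return best_order


def simulate_binary_clicks(length, lag, stay_prob, seed):
    """Hypothetical toy data: each state copies the state `lag` steps back
    with probability stay_prob, and flips it otherwise."""
    rng = random.Random(seed)
    seq = [rng.randint(0, 1) for _ in range(lag)]
    for _ in range(length - lag):
        previous = seq[-lag]
        seq.append(previous if rng.random() < stay_prob else 1 - previous)
    return seq


first_order = simulate_binary_clicks(5000, lag=1, stay_prob=0.9, seed=0)
second_order = simulate_binary_clicks(5000, lag=2, stay_prob=0.9, seed=0)
print(estimate_order_bic(first_order, max_order=3))
print(estimate_order_bic(second_order, max_order=3))
```

On synthetic sequences like these, the estimator should typically report order 1 for the first sequence and order 2 for the second, since the second depends on the state two steps back. Real browsing logs have far larger state spaces, so the number of free parameters grows quickly with the order and much more data is needed for a reliable estimate.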