ABSTRACT
The Social Web is well established and growing steadily in users, content, and services. People generate and consume data in real time within social networking services such as Twitter, and increasingly rely on continuous streams of messages for fresh knowledge about current affairs. In this paper, we focus on analyzing social streams in real time for personalized topic recommendation and discovery. We treat collaborative filtering as an online ranking problem and present Stream Ranking Matrix Factorization (RMFX), which uses a pairwise approach to matrix factorization to optimize the personalized ranking of topics. Our approach follows a selective sampling strategy, based on active learning principles, to perform online model updates; this closely simulates the task of identifying relevant items in a pool of mostly uninteresting ones. RMFX is particularly suitable for large-scale applications. Experiments on the "476 million Twitter tweets" dataset show that our online approach largely outperforms recommendations based on Twitter's global trend. It also delivers highly competitive Top-N recommendations faster, and with less space, than Weighted Regularized Matrix Factorization (WRMF), a state-of-the-art matrix factorization technique for collaborative filtering.
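To make the idea of pairwise matrix factorization over a stream concrete, the following is a minimal sketch and not the authors' RMFX implementation. It assumes a hinge loss on the score difference between an observed topic and a sampled unobserved one, plain stochastic gradient steps, and uniform negative sampling from topics seen so far; the abstract specifies none of these details. All names (`PairwiseMF`, `observe`, `synthetic_stream`) and hyperparameter values are hypothetical.

```python
import random


def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


class PairwiseMF:
    """Toy pairwise matrix factorization updated one stream event at a time."""

    def __init__(self, n_factors=8, lr=0.1, reg=0.01, seed=0):
        self.k, self.lr, self.reg = n_factors, lr, reg
        self.rng = random.Random(seed)
        self.W = {}      # user -> latent factor vector
        self.H = {}      # topic -> latent factor vector
        self.seen = {}   # user -> set of topics the user interacted with
        self.items = []  # all topics observed so far

    def _vec(self, table, key):
        # Lazily create a small random vector for unseen users or topics.
        if key not in table:
            table[key] = [self.rng.gauss(0.0, 0.1) for _ in range(self.k)]
        return table[key]

    def score(self, user, topic):
        return dot(self._vec(self.W, user), self._vec(self.H, topic))

    def update(self, user, pos, neg):
        """One SGD step on the hinge loss max(0, 1 - (s_pos - s_neg))."""
        w = self._vec(self.W, user)
        hp, hn = self._vec(self.H, pos), self._vec(self.H, neg)
        if dot(w, hp) - dot(w, hn) >= 1.0:
            return False  # pair already ranked with enough margin; skip it
        for f in range(self.k):
            wf, pf, nf = w[f], hp[f], hn[f]
            w[f] += self.lr * ((pf - nf) - self.reg * wf)
            hp[f] += self.lr * (wf - self.reg * pf)
            hn[f] += self.lr * (-wf - self.reg * nf)
        return True

    def observe(self, user, topic):
        """Consume one (user, topic) event and update the model online."""
        history = self.seen.setdefault(user, set())
        history.add(topic)
        if topic not in self.H:
            self.items.append(topic)
            self._vec(self.H, topic)
        # Assumption: negatives are drawn uniformly from topics this user
        # has not interacted with. This linear scan is for clarity only.
        candidates = [i for i in self.items if i not in history]
        if not candidates:
            return False
        return self.update(user, topic, self.rng.choice(candidates))

    def top_n(self, user, n=3):
        """Rank all known topics for a user by predicted score."""
        return sorted(self.items, key=lambda i: -self.score(user, i))[:n]


def synthetic_stream(n_events=3000, seed=1):
    """Hypothetical stream: users 0-4 post about 'a' topics, 5-9 about 'b'."""
    rng = random.Random(seed)
    for _ in range(n_events):
        user = rng.randrange(10)
        group = "a" if user < 5 else "b"
        yield user, f"{group}{rng.randrange(5)}"


model = PairwiseMF()
for user, topic in synthetic_stream():
    model.observe(user, topic)
print(model.top_n(0))
```

Skipping pairs that already satisfy the margin is one simple way to spend updates only on informative examples. It is a stand-in for, not a reproduction of, the selective sampling strategy the abstract refers to.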
Index Terms
- Real-time top-n recommendation in social streams