Abstract
Academic search engines have been widely used to access academic papers, where users’ information needs are explicitly represented as search queries. Some modern recommender systems have gone one step further by predicting users’ information needs without the presence of an explicit query. In this article, we examine an academic paper recommender that sends out paper recommendations in email newsletters, based on the users’ browsing history on the academic search engine. Specifically, we look at users who regularly browse papers on the search engine and who sign up for the recommendation newsletters for the first time. We address the task of reranking the recommendation candidates that are generated by a production system for such users.
We face the challenge that the users on whom we focus have not interacted with the recommender system before, which is a common scenario that every recommender system encounters when new users sign up. We propose an approach to reranking candidate recommendations that utilizes both paper content and user behavior. The approach is designed to suit the characteristics unique to our academic recommendation setting. For instance, content similarity measures can be used to find the closest match between candidate recommendations and the papers previously browsed by the user. To this end, we use a knowledge graph derived from paper metadata to compare entity similarities (papers, authors, and journals) in the embedding space. Since the users on whom we focus have no prior interactions with the recommender system, we propose a model to learn a mapping from users’ browsed articles to user clicks on the recommendations. We combine both content and behavior into a hybrid reranking model that outperforms the production baseline significantly, providing a relative 13% increase in Mean Average Precision and 28% in Precision@1.
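The two reported metrics can be illustrated with a minimal sketch. It assumes each recommendation list is represented as a sequence of binary click labels in ranked order; the function names and the toy data are hypothetical and are not taken from the article.

```python
from typing import List, Sequence


def precision_at_k(clicks: Sequence[int], k: int = 1) -> float:
    """Fraction of the top-k ranked items that were clicked."""
    if k <= 0:
        return 0.0
    return sum(clicks[:k]) / k


def average_precision(clicks: Sequence[int]) -> float:
    """Mean of precision@i over every rank i that holds a clicked item."""
    hits = 0
    total = 0.0
    for rank, clicked in enumerate(clicks, start=1):
        if clicked:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def mean_average_precision(lists: List[Sequence[int]]) -> float:
    """Average precision, averaged over all recommendation lists."""
    if not lists:
        return 0.0
    return sum(average_precision(c) for c in lists) / len(lists)


def relative_gain(new: float, base: float) -> float:
    """Relative improvement of `new` over `base` (e.g. 0.13 means +13%)."""
    return (new - base) / base


# Toy data (hypothetical): 1 = clicked, 0 = not clicked, in ranked order.
baseline = [[0, 1, 0], [1, 0, 0]]
reranked = [[1, 0, 0], [1, 0, 0]]

map_base = mean_average_precision(baseline)
map_new = mean_average_precision(reranked)
p1_base = sum(precision_at_k(c, 1) for c in baseline) / len(baseline)
p1_new = sum(precision_at_k(c, 1) for c in reranked) / len(reranked)
```

A relative increase, as quoted in the abstract, is the difference between the reranked and baseline scores divided by the baseline score, which is what `relative_gain` computes.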
Moreover, we provide a detailed analysis of the model components, highlighting where the performance boost comes from. The insights obtained, such as the utility of graph embedding similarity, reveal useful components for the reranking process and can be generalized to other academic recommendation settings. We also find that recent papers browsed by users provide stronger evidence for recommendation than historical ones.
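To make the two findings above concrete, the following sketch scores candidates by embedding similarity to the user's browsed papers and gives more weight to recent ones. It is an illustration under stated assumptions, not the article's model: the exponential decay weighting, the `decay` parameter, the function names, and the two-dimensional toy embeddings are all hypothetical.

```python
import math
from typing import Dict, List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two embedding vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def recency_weights(n: int, decay: float = 0.5) -> List[float]:
    """Weights for n browsed papers ordered oldest to newest.

    Assumption: exponential decay, so the newest paper gets weight 1.
    """
    return [decay ** (n - 1 - i) for i in range(n)]


def score(candidate: Sequence[float],
          browsed: Sequence[Sequence[float]],
          decay: float = 0.5) -> float:
    """Recency-weighted mean similarity between a candidate and browsed papers."""
    weights = recency_weights(len(browsed), decay)
    total = sum(weights)
    if total == 0:
        return 0.0
    weighted = sum(w * cosine(candidate, b) for w, b in zip(weights, browsed))
    return weighted / total


def rerank(candidates: Dict[str, Sequence[float]],
           browsed: Sequence[Sequence[float]],
           decay: float = 0.5) -> List[str]:
    """Return candidate ids sorted by descending recency-weighted similarity."""
    return sorted(candidates,
                  key=lambda pid: score(candidates[pid], browsed, decay),
                  reverse=True)


# Toy embeddings (hypothetical): browsed papers are ordered oldest to newest.
browsed_papers = [[1.0, 0.0], [0.0, 1.0]]
candidate_papers = {"a": [1.0, 0.0], "b": [0.0, 1.0]}
ranking = rerank(candidate_papers, browsed_papers)
```

In this toy case candidate `b` matches the most recently browsed paper and is therefore ranked ahead of `a`, which matches only the older one.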
Index Terms
- Personalised Reranking of Paper Recommendations Using Paper Content and User Behavior