DOI: 10.1145/3292500.3330745 (KDD conference proceedings)
research-article
Open Access

Fairness in Recommendation Ranking through Pairwise Comparisons

Published: 25 July 2019

ABSTRACT

Recommender systems are one of the most pervasive applications of machine learning in industry, with many services using them to match users to products or information. As such it is important to ask: what are the possible fairness risks, how can we quantify them, and how should we address them? In this paper we offer a set of novel metrics for evaluating algorithmic fairness concerns in recommender systems. In particular we show how measuring fairness based on pairwise comparisons from randomized experiments provides a tractable means to reason about fairness in rankings from recommender systems. Building on this metric, we offer a new regularizer to encourage improving this metric during model training and thus improve fairness in the resulting rankings. We apply this pairwise regularization to a large-scale, production recommender system and show that we are able to significantly improve the system's pairwise fairness.
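To make the idea of a pairwise comparison concrete, the following is a minimal sketch, not the paper's implementation. It assumes each logged comparison is a tuple of the model's score for the item the user engaged with, the score for the item they did not engage with, and the group of the engaged item. The function names, the tuple layout, the group labels "A" and "B", and the sample scores are all hypothetical; the abstract does not give the exact definition of the metric or the regularizer, so only a simple per-group pairwise accuracy and the gap between groups are shown.

```python
from collections import defaultdict


def pairwise_accuracy_by_group(pairs):
    """Fraction of pairs, per group, where the engaged item outscores the other.

    `pairs` is an iterable of (score_engaged, score_other, group_of_engaged).
    This layout is an assumption made for illustration.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for score_engaged, score_other, group in pairs:
        total[group] += 1
        if score_engaged > score_other:
            correct[group] += 1
    return {group: correct[group] / total[group] for group in total}


def pairwise_fairness_gap(pairs):
    """Largest difference in pairwise accuracy between any two groups."""
    accuracy = pairwise_accuracy_by_group(pairs)
    return max(accuracy.values()) - min(accuracy.values())


# Hypothetical comparisons, as if collected from a randomized experiment.
pairs = [
    (0.9, 0.4, "A"), (0.7, 0.8, "A"), (0.6, 0.2, "A"), (0.5, 0.1, "A"),
    (0.3, 0.6, "B"), (0.8, 0.5, "B"), (0.2, 0.4, "B"), (0.4, 0.9, "B"),
]

acc = pairwise_accuracy_by_group(pairs)
gap = pairwise_fairness_gap(pairs)
print(acc)  # {'A': 0.75, 'B': 0.25}
print(gap)  # 0.5
```

In this toy data the model ranks the engaged item correctly in 3 of 4 pairs when that item belongs to group A but in only 1 of 4 pairs for group B. A gap of this kind is what a training-time regularizer would aim to shrink.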

