research-article

Theoretical Understandings of Product Embedding for E-commerce Machine Learning

Authors:
Da Xu, Walmart Labs, Sunnyvale, CA, USA
Chuanwei Ruan, Instacart, San Francisco, CA, USA
Evren Korpeoglu, Walmart Labs, Sunnyvale, CA, USA
Sushant Kumar, Walmart Labs, Sunnyvale, CA, USA
Kannan Achan, Walmart Labs, Sunnyvale, CA, USA

WSDM '21: Proceedings of the 14th ACM International Conference on Web Search and Data Mining, March 2021, Pages 256–264
https://doi.org/10.1145/3437963.3441736

Published: 08 March 2021

ABSTRACT

Product embeddings have been heavily investigated in the past few years, serving as the cornerstone for a broad range of machine learning applications in e-commerce. Despite their empirical success, little is known about how and why they work from a theoretical standpoint. Analogous results from natural language processing (NLP) often rely on domain-specific properties that do not transfer to the e-commerce setting, and the downstream tasks often focus on different aspects of the embeddings. We take an e-commerce-oriented view of product embeddings and present a complete theoretical view from both the representation learning and the learning theory perspectives. We prove that product embeddings trained by the widely adopted skip-gram negative sampling algorithm and its variants are a sufficient dimension reduction with respect to a critical product relatedness measure. The generalization performance in downstream machine learning tasks is controlled by the alignment between the embeddings and the product relatedness measure. Following these theoretical findings, we conduct exploratory experiments that support our theoretical insights for product embeddings.
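
The abstract does not name the product relatedness measure. As a rough illustration only, the sketch below assumes pointwise mutual information (PMI) over basket co-occurrences as a stand-in, motivated by the known result of Levy and Goldberg (2014) that skip-gram negative sampling with k negative samples implicitly factorizes a PMI matrix shifted by log k. The toy baskets, the product names, and the `shifted_pmi` helper are hypothetical and not taken from the paper.

```python
import math
from collections import Counter
from itertools import permutations

# Hypothetical toy baskets; product names are illustrative only.
baskets = [
    ["tv", "hdmi_cable"],
    ["tv", "hdmi_cable", "soundbar"],
    ["milk", "cereal"],
    ["milk", "cereal", "tv"],
]


def shifted_pmi(baskets, k=1):
    """Return {(a, b): PMI(a, b) - log k} for ordered product pairs that co-occur.

    Assumption: PMI stands in for the relatedness measure; k plays the role
    of the number of negative samples in skip-gram negative sampling.
    """
    pair_counts = Counter()
    for basket in baskets:
        # Each ordered pair of distinct products in a basket is one
        # (target, context) observation.
        pair_counts.update(permutations(set(basket), 2))

    total = sum(pair_counts.values())
    target_counts = Counter()
    context_counts = Counter()
    for (a, b), n in pair_counts.items():
        target_counts[a] += n
        context_counts[b] += n

    # PMI(a, b) = log( P(a, b) / (P(a) * P(b)) ), estimated from counts.
    return {
        (a, b): math.log(n * total / (target_counts[a] * context_counts[b]))
        - math.log(k)
        for (a, b), n in pair_counts.items()
    }


scores = shifted_pmi(baskets, k=1)
```

Under these assumptions, pairs bought together more often than chance receive higher scores (here `tv` with `hdmi_cable` scores above `tv` with `milk`), which is the kind of quantity an embedding inner product would be expected to approximate.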

Index Terms

Theoretical Understandings of Product Embedding for E-commerce Machine Learning
1. Information systems
  1. Information retrieval
2. Mathematics of computing
  1. Probability and statistics

Published in
WSDM '21: Proceedings of the 14th ACM International Conference on Web Search and Data Mining
March 2021
1192 pages
ISBN:9781450382977
DOI:10.1145/3437963
General Chairs: Liane Lewin-Eytan (Amazon, Israel), David Carmel (Amazon, Israel), Elad Yom-Tov (Microsoft, Israel)
Program Chairs: Eugene Agichtein (Emory University and Amazon, USA), Evgeniy Gabrilovich (Google Health, USA)
Copyright © 2021 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
information theory
machine learning theory
product relation
representation learning
sufficient dimension reduction
Qualifiers
- research-article
Acceptance Rates
Overall Acceptance Rate: 498 of 2,863 submissions, 17%