ABSTRACT
Google Drive is a cloud storage and collaboration service used by hundreds of millions of users around the world. Quick Access is a new feature in Google Drive that surfaces the most relevant documents when a user visits the home screen. Our metrics show that users locate their documents in half the time with this feature compared to previous approaches. The development of Quick Access illustrates many general challenges and constraints associated with practical machine learning such as protecting user privacy, working with data services that are not designed with machine learning in mind, and evolving product definitions. We believe that the lessons learned from this experience will be useful to practitioners tackling a wide range of applied machine learning problems.
Index Terms
- Quick Access: Building a Smart Experience for Google Drive