ABSTRACT
Internet services have dramatically changed the way people interact with each other, and many of our daily activities are supported by those services. Statistical indicators show that more than half of the world's population uses the Internet, generating about 2.5 quintillion bytes of data on a daily basis. While such a huge amount of data is useful in a number of fields, such as medical and transportation systems, it also poses unprecedented threats to users' privacy. This is aggravated by the excessive data collection and user profiling activities of service providers. Yet, regulations require service providers to inform users about their data collection and processing practices. The de facto way of informing users about these practices is through privacy policies. Unfortunately, privacy policies suffer from poor readability and other complexities, which make them unusable for the intended purpose. To address this issue, we introduce PrivacyGuide, a privacy policy summarization tool inspired by the European Union (EU) General Data Protection Regulation (GDPR) and based on machine learning and natural language processing techniques. Our results show that PrivacyGuide is able to classify privacy policy content into eleven privacy aspects with a weighted average accuracy of 74%, and that it further sheds light on the associated risk level with an accuracy of 90%.
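The abstract does not define how the weighted average accuracy is computed. As a minimal sketch, assuming it is an average of per-class scores weighted by the number of samples in each class, the calculation could look like the following. The function name `weighted_average` and all numbers are hypothetical and are not taken from the paper.

```python
def weighted_average(per_class_scores, supports):
    """Average per-class scores, weighting each class by its sample count.

    Assumption: this is one common reading of "weighted average accuracy";
    the paper may define it differently.
    """
    if len(per_class_scores) != len(supports):
        raise ValueError("scores and supports must have the same length")
    total = sum(supports)
    if total == 0:
        raise ValueError("supports must contain at least one sample")
    weighted_sum = sum(score * n for score, n in zip(per_class_scores, supports))
    return weighted_sum / total


# Hypothetical example with three privacy aspects (the paper uses eleven).
scores = [0.80, 0.70, 0.60]   # illustrative per-aspect accuracy
supports = [50, 30, 20]       # illustrative number of samples per aspect
overall = weighted_average(scores, supports)
print(f"weighted average: {overall:.2f}")
```

Weighting by class size means that aspects with many labelled samples influence the overall figure more than rare ones, which is why a weighted score can differ from a plain mean of the per-class values.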
Index Terms
- PrivacyGuide: Towards an Implementation of the EU GDPR on Internet Privacy Policy Evaluation