A Survey on Differential Privacy for Unstructured Data Content

Authors:
Ying Zhao, Swinburne University of Technology, Melbourne, Australia
Jinjun Chen, Swinburne University of Technology, Melbourne, Australia (ORCID 0000-0003-1677-9525)

ACM Computing Surveys, Volume 54, Issue 10s, Article No. 207, pp. 1–28. https://doi.org/10.1145/3490237

Published: 13 September 2022

Abstract

Huge amounts of unstructured data, including images, video, audio, and text, are ubiquitously generated and shared, and protecting the sensitive personal information they contain, such as human faces, voiceprints, and authorship, is a challenge. Differential privacy is the standard privacy protection technology that provides rigorous privacy guarantees for various kinds of data. This survey summarizes and analyzes differential privacy solutions that protect unstructured data content before it is shared with untrusted parties. These methods represent unstructured data as vectors, obfuscate the vectors, and then reconstruct the data from the obfuscated vectors. We summarize the specific privacy models and mechanisms together with their possible challenges. We also discuss their privacy guarantees against AI attacks and their utility losses. Finally, we discuss several possible directions for future research.
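The represent, obfuscate, and reconstruct pipeline described in the abstract can be illustrated with a minimal sketch. It is not taken from the survey. It assumes a toy one-dimensional signal with values in [0, 1], a block-averaging encoder, and the standard Laplace mechanism with a deliberately conservative worst-case L1 sensitivity. The function names (`encode`, `obfuscate`, `decode`, `sanitize`) and all parameter values are hypothetical.

```python
import random


def encode(pixels, block):
    """Toy representation: mean of each non-overlapping block of a signal in [0, 1]."""
    chunks = [pixels[i:i + block] for i in range(0, len(pixels), block)]
    return [sum(chunk) / len(chunk) for chunk in chunks]


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is distributed as Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def obfuscate(vector, epsilon, rng):
    """Laplace mechanism on a vector whose coordinates all lie in [0, 1]."""
    # Assumption: any two inputs count as neighbours, so the L1 sensitivity
    # is the largest possible L1 distance in [0, 1]^d, which is d.
    sensitivity = float(len(vector))
    scale = sensitivity / epsilon
    return [v + laplace_noise(scale, rng) for v in vector]


def decode(vector, block, length):
    """Toy reconstruction: clamp to [0, 1] and repeat each value `block` times."""
    out = []
    for v in vector:
        out.extend([min(1.0, max(0.0, v))] * block)
    return out[:length]


def sanitize(pixels, block, epsilon, seed=None):
    """Represent, obfuscate, then reconstruct (decoding is post-processing)."""
    rng = random.Random(seed)
    noisy = obfuscate(encode(pixels, block), epsilon, rng)
    return decode(noisy, block, len(pixels))


if __name__ == "__main__":
    signal = [0.1, 0.2, 0.9, 0.8, 0.5]
    print(sanitize(signal, block=2, epsilon=1.0, seed=0))
```

Because the noise scale grows with the vector dimension, this sketch loses most of the signal at small `epsilon`. Trade-offs of this kind between the privacy guarantee and utility loss are what the abstract says the survey discusses.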


Published in
ACM Computing Surveys Volume 54, Issue 10s
January 2022
831 pages
ISSN:0360-0300
EISSN:1557-7341
DOI:10.1145/3551649
Editor:
Albert Zomaya
University of Sydney, Australia
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 13 September 2022
- Online AM: 6 January 2022
- Accepted: 25 September 2021
- Revised: 22 July 2021
- Received: 17 January 2021

Author Tags
Differential privacy
unstructured data content privacy
privacy protected unstructured data
image
voiceprint
text
video
Qualifiers
- survey
- Refereed

