Privacy preservation by disassociation

Authors:
Manolis Terrovitis (IMIS, Research Center 'Athena')
Nikos Mamoulis (Univ. of Hong Kong)
John Liagouris (NTUA)
Spiros Skiadopoulos (Univ. of Peloponnese)

Proceedings of the VLDB Endowment, Volume 5, Issue 10, pp. 944–955. https://doi.org/10.14778/2336664.2336668

Published: 01 June 2012

Abstract

In this work, we focus on protection against identity disclosure in the publication of sparse multidimensional data. Existing multidimensional anonymization techniques (a) protect the privacy of users either by altering the set of quasi-identifiers of the original data (e.g., by generalization or suppression) or by adding noise (e.g., using differential privacy) and/or (b) assume a clear distinction between sensitive and non-sensitive information and sever the possible linkage. In many real-world applications, these techniques are not applicable. For instance, consider web search query logs. Anonymization methods based on suppression or generalization would remove the most valuable information in the dataset: the original query terms. Additionally, web search query logs contain millions of query terms, which cannot be categorized as sensitive or non-sensitive, since a term may be sensitive for one user and non-sensitive for another. Motivated by this observation, we propose an anonymization technique termed disassociation that preserves the original terms but hides the fact that two or more different terms appear in the same record. We protect the users' privacy by disassociating record terms that participate in identifying combinations. This way, the adversary cannot, with high probability, associate a record with a rare combination of terms. To the best of our knowledge, our proposal is the first to employ such a technique to provide protection against identity disclosure. We propose an anonymization algorithm based on our approach and evaluate its performance on real and synthetic datasets, comparing it against other state-of-the-art methods based on generalization and differential privacy.
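
As a rough illustration of the idea described in the abstract, the following is a minimal Python sketch of disassociation on a toy query log. It is not the algorithm from the paper. The function names, the threshold k (the minimum number of records a term combination must occur in to count as non-identifying), the bound m on combination size, and the greedy grouping strategy are all assumptions made for this example. The sketch keeps every original term, but publishes terms in separate shuffled chunks so that rare combinations can no longer be read off a single record.

```python
import random
from itertools import combinations


def support(records, combo):
    """Count the records that contain every term in combo."""
    return sum(1 for r in records if combo <= r)


def is_safe(records, terms, k, m):
    """True if every combination of up to m of `terms` that occurs
    in at least one record occurs in at least k records."""
    for size in range(1, m + 1):
        for combo in combinations(sorted(terms), size):
            s = support(records, frozenset(combo))
            if 0 < s < k:
                return False
    return True


def disassociate(records, k=2, m=2, seed=0):
    """Greedy toy disassociation (illustrative assumption, not the paper's method).

    Returns (published, leftover): `published` is a list of chunks, each a
    shuffled list of per-record term projections; `leftover` lists terms too
    rare to be published together with any other term.
    """
    records = [frozenset(r) for r in records]
    all_terms = set().union(*records)
    counts = {t: support(records, frozenset([t])) for t in all_terms}

    # Terms occurring in fewer than k records are set aside on their own.
    leftover = sorted(t for t in all_terms if counts[t] < k)
    remaining = sorted((t for t in all_terms if counts[t] >= k),
                       key=lambda t: (-counts[t], t))

    # Greedily grow chunks of terms whose combinations are all frequent.
    chunks = []
    while remaining:
        chunk = []
        for t in remaining:
            if is_safe(records, chunk + [t], k, m):
                chunk.append(t)
        remaining = [t for t in remaining if t not in chunk]
        chunks.append(chunk)

    # Project records onto each chunk and shuffle, hiding cross-chunk links.
    rng = random.Random(seed)
    published = []
    for chunk in chunks:
        rows = [sorted(r & set(chunk)) for r in records]
        rows = [row for row in rows if row]
        rng.shuffle(rows)
        published.append(rows)
    return published, leftover


if __name__ == "__main__":
    log = [
        {"madonna", "flu"},
        {"madonna", "flu"},
        {"madonna", "audi"},
        {"flu", "audi"},
        {"madonna", "flu", "rare-disease"},
    ]
    published, leftover = disassociate(log, k=2, m=2)
    for i, rows in enumerate(published):
        print("chunk", i, rows)
    print("leftover terms:", leftover)
```

In this toy log, "rare-disease" occurs in a single record and is set aside, while "audi" is published apart from "flu" and "madonna" because each of its pairings with them occurs in only one record. All original terms survive; only the evidence that they appeared together is removed.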

Published in

Proceedings of the VLDB Endowment, Volume 5, Issue 10 (June 2012), 180 pages
ISSN: 2150-8097
Publisher: VLDB Endowment