Article

On the complexity of optimal K-anonymity

Authors:
Adam Meyerson, University of California, Los Angeles, CA
Ryan Williams, Carnegie Mellon University, Pittsburgh, PA

PODS '04: Proceedings of the twenty-third ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, June 2004, Pages 223–228
https://doi.org/10.1145/1055558.1055591

Published: 14 June 2004

ABSTRACT

The technique of k-anonymization has been proposed in the literature as an alternative way to release public information while ensuring both data privacy and data integrity. We prove that two general versions of optimal k-anonymization of relations are NP-hard, including the suppression version, which amounts to choosing a minimum number of entries to delete from the relation. We also present a polynomial-time algorithm for optimal k-anonymity that achieves an approximation ratio independent of the size of the database when k is constant. In particular, it is an O(k log k)-approximation, where the constant in the big-O is no more than 4. However, the runtime of the algorithm is exponential in k. A slightly more clever algorithm removes this condition but is an O(k log m)-approximation, where m is the degree of the relation. We believe this algorithm could potentially be quite fast in practice.
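
To make the suppression version concrete, here is a minimal Python sketch. It is not the paper's algorithm: it assumes the usual reading of k-anonymity (every row of the released table is identical to at least k-1 other rows), uses `*` as an assumed marker for a deleted entry, and finds the minimum number of deletions by exhaustive search. The function names are hypothetical, and the search is exponential in the table size, so it is only usable on toy inputs.

```python
from collections import Counter
from itertools import combinations

STAR = "*"  # assumed marker for a suppressed (deleted) entry


def is_k_anonymous(rows, k):
    """True if every row is identical to at least k-1 other rows."""
    counts = Counter(tuple(r) for r in rows)
    return all(c >= k for c in counts.values())


def suppress(rows, cells):
    """Return a copy of rows with the given (row, column) cells starred."""
    out = [list(r) for r in rows]
    for i, j in cells:
        out[i][j] = STAR
    return [tuple(r) for r in out]


def min_suppression(rows, k):
    """Brute force over cell subsets, smallest first (non-empty table assumed).

    Returns (number of suppressed cells, suppressed table), or None when
    the table has fewer than k rows and no solution exists.
    """
    cells = [(i, j) for i in range(len(rows)) for j in range(len(rows[0]))]
    for size in range(len(cells) + 1):
        for chosen in combinations(cells, size):
            candidate = suppress(rows, chosen)
            if is_k_anonymous(candidate, k):
                return size, candidate
    return None


# Two rows that differ in one attribute: both entries in that column
# must be suppressed before the rows become indistinguishable.
print(min_suppression([("a", "x"), ("a", "y")], 2))
# (2, [('a', '*'), ('a', '*')])
```

The exhaustive search is only an illustration of the optimization problem. Since the abstract states that the problem is NP-hard, no polynomial-time exact method is expected, which is the motivation for the approximation algorithms it describes.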



Published in
PODS '04: Proceedings of the twenty-third ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
June 2004
350 pages
ISBN: 158113858X
DOI: 10.1145/1055558
Conference Chair:
Catriel Beeri
Hebrew University of Jerusalem
Copyright © 2004 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from ACM.
Publisher
Association for Computing Machinery
New York, NY, United States

Acceptance Rates
Overall Acceptance Rate: 642 of 2,707 submissions, 24%

Article Metrics
- Total Citations: 508
- Total Downloads: 1,855
- Downloads (Last 12 months): 145
- Downloads (Last 6 weeks): 26