DOI: 10.1145/1055558.1055591
Article

On the complexity of optimal K-anonymity

Published: 14 June 2004

ABSTRACT

The technique of k-anonymization has been proposed in the literature as an alternative way to release public information while ensuring both data privacy and data integrity. We prove that two general versions of optimal k-anonymization of relations are NP-hard, including the suppression version, which amounts to choosing a minimum number of entries to delete from the relation. We also present a polynomial-time algorithm for optimal k-anonymity that achieves an approximation ratio independent of the size of the database when k is constant. In particular, it is an O(k log k)-approximation, where the constant in the big-O is no more than 4. However, the runtime of the algorithm is exponential in k. A slightly more clever algorithm removes this condition, but is an O(k log m)-approximation, where m is the degree of the relation. We believe this algorithm could potentially be quite fast in practice.
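
To make the suppression version concrete, here is a minimal Python sketch. It is not the paper's algorithm: it assumes a relation is a list of equal-length tuples, that a suppressed entry is written as "*", and that a relation is k-anonymous when every row is identical to at least k - 1 other rows. All function names are hypothetical. The search simply tries every partition of the rows into groups of size at least k, which takes exponential time and is only usable on toy inputs.

```python
from collections import Counter
from itertools import combinations

STAR = "*"  # assumed marker for a suppressed (deleted) entry


def is_k_anonymous(rows, k):
    """True if every row is identical to at least k - 1 other rows."""
    return all(count >= k for count in Counter(rows).values())


def suppress_group(group):
    """Star out every column on which the rows of `group` disagree."""
    width = len(group[0])
    agree = [len({row[j] for row in group}) == 1 for j in range(width)]
    return [tuple(v if agree[j] else STAR for j, v in enumerate(row))
            for row in group]


def cost(rows):
    """Number of suppressed entries in the relation."""
    return sum(v == STAR for row in rows for v in row)


def partitions(items, k):
    """Yield every partition of `items` into blocks of size >= k."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for size in range(k - 1, len(rest) + 1):
        for others in combinations(rest, size):
            remaining = [x for x in rest if x not in others]
            for tail in partitions(remaining, k):
                yield [(first,) + others] + tail


def optimal_suppression(rows, k):
    """Brute-force search for a k-anonymous suppression with fewest stars."""
    best = None
    for blocks in partitions(list(range(len(rows))), k):
        out = [None] * len(rows)
        for block in blocks:
            suppressed = suppress_group([rows[idx] for idx in block])
            for idx, new_row in zip(block, suppressed):
                out[idx] = new_row
        if best is None or cost(out) < cost(best):
            best = out
    return best  # None if no valid partition exists (fewer than k rows)


# Toy relation with m = 3 attributes (made-up data for illustration).
rows = [("a", "x", "1"), ("a", "y", "1"), ("b", "x", "2"), ("b", "x", "3")]
best = optimal_suppression(rows, 2)
print(best, cost(best))
```

Restricting the search to partitions loses nothing under these assumptions: in any k-anonymous suppression, rows that end up identical form a group of size at least k, and every column on which their original values differ must be starred in all of them. The number of partitions grows very quickly with the number of rows, which is consistent with the hardness result and motivates the approximation algorithms described in the abstract.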


  • Published in

    PODS '04: Proceedings of the twenty-third ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
    June 2004
    350 pages
    ISBN:158113858X
    DOI:10.1145/1055558

    Copyright © 2004 ACM


    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Acceptance Rates

    Overall Acceptance Rate: 642 of 2,707 submissions, 24%
