DOI: 10.1145/2339530.2339696
research-article

Anonymizing set-valued data by nonreciprocal recoding

Published: 12 August 2012

ABSTRACT

There is strong current interest in publishing set-valued data in a privacy-preserving manner. Such data associate individuals with sets of values (e.g., preferences, shopping items, symptoms, query logs). In addition, an individual can be associated with a sensitive label (e.g., marital status, religious or political conviction). Anonymizing such data means ensuring that an adversary cannot (1) identify an individual's record or (2) infer a sensitive label, if one exists. Existing research on this problem either perturbs the data, publishes them in disjoint groups disassociated from their sensitive labels, or generalizes their values by assuming the availability of a generalization hierarchy. In this paper, we propose a novel alternative. Our publication method also puts data in a generalized form, but it neither requires that published records form disjoint groups nor assumes a hierarchy. Instead, it employs generalized bitmaps and recasts data values in a nonreciprocal manner: formally, the bipartite graph from original to anonymized records does not have to be composed of disjoint complete subgraphs. We configure our schemes to provide popular privacy guarantees while resisting attacks proposed in recent research, and demonstrate experimentally that they achieve a clear utility advantage over the previous state of the art.
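
The formal condition in the abstract can be illustrated with a small sketch. This is not the paper's algorithm: the function names, the ring-shaped assignment, and the toy sizes are assumptions chosen only to show the difference between a reciprocal (group-based) matching and a nonreciprocal one. The check relies on the fact that a bipartite graph is a union of disjoint complete subgraphs exactly when any two original records have candidate sets that are either identical or disjoint.

```python
from typing import Dict, Set


def group_assignment(n: int, k: int) -> Dict[int, Set[int]]:
    """Reciprocal matching: records are cut into disjoint groups of size k,
    and every original record matches all anonymized records of its group.
    Assumes n is divisible by k."""
    return {i: set(range((i // k) * k, (i // k) * k + k)) for i in range(n)}


def ring_assignment(n: int, k: int) -> Dict[int, Set[int]]:
    """Hypothetical nonreciprocal matching: original record i matches the
    anonymized records i, i+1, ..., i+k-1 (mod n). Assumes k <= n."""
    return {i: {(i + j) % n for j in range(k)} for i in range(n)}


def is_reciprocal(matches: Dict[int, Set[int]]) -> bool:
    """True if the bipartite graph is a union of disjoint complete subgraphs,
    i.e. candidate sets are pairwise identical or disjoint."""
    sets = list(matches.values())
    for a in sets:
        for b in sets:
            if a != b and a & b:  # overlapping but not identical
                return False
    return True


def in_degrees(matches: Dict[int, Set[int]]) -> Dict[int, int]:
    """Number of original records that each anonymized record may stem from."""
    deg: Dict[int, int] = {}
    for targets in matches.values():
        for t in targets:
            deg[t] = deg.get(t, 0) + 1
    return deg


if __name__ == "__main__":
    groups = group_assignment(6, 3)
    ring = ring_assignment(6, 3)
    print(is_reciprocal(groups), is_reciprocal(ring))
    print(in_degrees(ring))
```

In this toy setting both assignments give every original record k candidate anonymized records and every anonymized record k candidate originals, but only the group-based one decomposes into disjoint complete subgraphs. Equal degrees alone are not claimed here to deliver any particular privacy guarantee; the sketch only makes the structural distinction concrete.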

Published in

KDD '12: Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining
August 2012
1616 pages
ISBN: 9781450314626
DOI: 10.1145/2339530

Copyright © 2012 ACM

Publisher: Association for Computing Machinery, New York, NY, United States

Acceptance Rates

Overall acceptance rate: 1,133 of 8,635 submissions, 13%
