Anonymisation of Social Networks and Rough Set Approach

Computational Social Networks

Abstract

The scientific study of network data can reveal important behaviors of the elements involved as well as social trends. It also provides insight into suitable changes in the social structure and the roles of individuals in it. There is considerable evidence (HIPAA (2002) Health insurance portability and accountability act. Available online http://www.hhs.gov/ocr/hipaa; Lambert, J Off Stat 9:313–331, 1993; Xu et al. (2006) Utility based anonymisation using local recoding. In: KDD’06, Philadelphia) of the value of social network data in shedding light on the social behavior, health, and well-being of the general public. For this purpose, social network information needs to be published, either publicly or to a specialized group. However, depending on the privacy model considered, this information may include sensitive data about individual participants that should not be disclosed. Social network data therefore need to be anonymized before publication in order to prevent potential reidentification attacks. Data anonymization techniques are widely used in relational databases (Aggarwal et al. J Priv Technol, 2005; Backstrom et al. (2007) Wherefore art thou R3579X? Anonymized social networks, hidden patterns, and structural steganography. In: International world wide web conference (WWW). ACM, New York, pp 181–190; Bayardo and Agrawal (2005) Data privacy through optimal k-anonymisation. In: IEEE 21st international conference on data engineering, April 2005; Bamba et al. (2008) Supporting anonymous location queries in mobile environments with privacy grid. In: ACM world wide web conference; Byun et al. (2007) Efficient k-anonymisation using clustering techniques. In: International conference on database systems for advanced applications (DASFAA), pp 188–200; Campan and Truta (2008) A clustering approach for data and structural anonymity in social networks. In: ACM SIGKDD workshop on privacy, security, and trust in KDD (PinKDD), Las Vegas; Chakrabarti et al. (2004) R-MAT: a recursive model for graph mining. In: SIAM international conference on data mining; Chawla et al. (2005) Toward privacy in public databases. In: Proceedings of the theory of cryptography conference, Cambridge, MA; Evfimievski et al. (2003) Limiting privacy breaches in privacy preserving data mining. In: ACM principles of database systems (PODS). ACM, New York, pp 211–222; Getoor and Diehl, SIGKDD Explor Newsl 7(2):3–12, 2005; Ghinita et al. (2007) Fast data anonymisation with low information loss. In: Very large data base conference (VLDB), Vienna, pp 758–769; LeFevre et al. (2006) Mondrian multidimensional K-anonymity. In: IEEE international conference of data engineering (ICDE), p 25; Liu and Terzi (2008) Towards identity anonymisation on graphs. In: Wang (ed.) SIGMOD conference. ACM, New York, pp 93–106; Lunacek et al. (2006) A crossover operator for the k-anonymity problem. In: Genetic and evolutionary computation conference (GECCO), Seattle, Washington, pp 1713–1720; Machanavajjhala et al. (2006) L-diversity: privacy beyond K-anonymity. In: IEEE international conference on data engineering (ICDE), Atlanta, p 24; Malin, J Am Med Inform Assoc 12(1):28–34, 2004; Nergiz and Clifton (2006) Thoughts on k-anonymisation. In: IEEE 22nd international conference on data engineering workshops (ICDEW), Atlanta, April 2006, p 96; Nergiz and Clifton (2007) Multirelational k-anonymity. In: IEEE 23rd international conference on data engineering posters, April 2007). However, most of the known anonymization approaches, such as suppression or generalization, do not apply directly to social network data. One major challenge in social network anonymization is its complexity. In (Gross and Yellen (2006) Graph theory and its applications. CRC, Boca Raton), it has been proved that a particular k-anonymity problem, which tries to minimize the structural change to the original social network, is NP-hard. Research on the anonymization of social networks is a relatively new field. In this chapter, we provide a systematic survey of the approaches and studies carried out so far in this direction. Social network nodes can have imprecise data as their attributes, so the usual methods proposed for anonymization are not suitable for such social networks. Recently, an efficient rough set-based algorithm was established in (Tripathy and Prakash Kumar, Int J Rapid Manuf 1(2):189–207, 2009) to handle clustering of tuples in relational models. We describe how this algorithm can be used for the anonymization of social networks. We also present some recent algorithms that use graph isomorphism for the anonymization of social networks. Finally, we discuss the current status of research on the anonymization of social networks and present some related problems for further study.

References

  1. Aggarwal, G., Feder, T., Kenthapadi, K., Motwani, R., Panigrahy, R., Thomas, D., Zhu, A.: Approximation algorithms for k-anonymity. J. Priv. Technol. pp. 1–8 (2005)

  2. Backstrom, L., Dwork, C., Kleinberg, J.: Wherefore art thou R3579X? Anonymized social networks, hidden patterns, and structural steganography. In: International World Wide Web Conference (WWW), pp. 181–190. ACM, New York (2007)

  3. Bader, D.A., Madduri, K.: GTGraph: a synthetic graph generator suite. Available online http://www.cc.gatech.edu/~kamesh/GTgraph/ (2006)

  4. Bamba, B., Liu, L., Pesti, P., Wang, T.: Supporting anonymous location queries in mobile environments with privacy grid. In: ACM World Wide Web Conference, Beijing, China (2008)

  5. Bayardo, R., Agrawal, R.: Data privacy through optimal k-anonymisation. In: IEEE 21st International Conference on Data Engineering, Tokyo, Japan, April 2005

  6. Byun, J.W., Kamra, A., Bertino, E., Li, N.: Efficient k-anonymisation using clustering techniques. In: International Conference on Database Systems for Advanced Applications (DASFAA), pp. 188–200. Bangkok, Thailand (2007)

  7. Campan, A., Truta, T.M.: A clustering approach for data and structural anonymity in social networks. In: ACM SIGKDD Workshop on Privacy, Security, and Trust in KDD (PinKDD), Las Vegas (2008)

  8. Chakrabarti, D., Zhan, Y., Faloutsos, C.: R-MAT: a recursive model for graph mining. In: SIAM International Conference on Data Mining, Florida, USA (2004)

  9. Chawla, S., Dwork, C., McSherry, F., Smith, A., Wee, H.: Toward privacy in public databases. In: Proceedings of the Theory of Cryptography Conference, Cambridge, MA (2005)

  10. Evfimievski, A., Gehrke, J., Srikant, R.: Limiting privacy breaches in privacy preserving data mining. In: ACM Principles of Database Systems (PODS), pp. 211–222. San Diego, CA (2003)

  11. Getoor, L., Diehl, C.P.: Link mining: a survey. SIGKDD Explor. Newsl. 7(2), 3–12 (2005)

  12. Ghinita, G., Karras, P., Kalnis, P., Mamoulis, N.: Fast data anonymisation with low information loss. In: Very Large Data Base Conference (VLDB), pp. 758–769. Vienna (2007)

  13. Gross, J., Yellen, J.: Graph Theory and Its Applications. CRC, Boca Raton (2006)

  14. Han, J., Kamber, M.: Data Mining: Concepts and Techniques, 2nd edn. Morgan Kaufmann, San Francisco, CA (2006)

  15. Hay, M., Miklau, G., Jensen, D., Weiss, P., Srivastava, S.: Anonymising social networks. Technical Report No. 07–19, University of Massachusetts Amherst (2007)

  16. HIPAA: Health insurance portability and accountability act. Available online http://www.hhs.gov/ocr/hipaa (2002)

  17. Horowitz, E., Sahni, S., Rajasekaran, S.: Fundamentals of Computer Algorithms. Galgotia Publications, Darya Ganj, New Delhi, India (2004)

  18. Lambert, D.: Measures of disclosure risk and harm. J. Off. Stat. 9, 313–331 (1993)

  19. LeFevre, K., DeWitt, D., Ramakrishnan, R.: Mondrian multidimensional K-anonymity. In: IEEE International Conference of Data Engineering (ICDE), p. 25. Atlanta, Georgia, USA (2006)

  20. Li, N., Li, T., Venkatasubramanian, S.: t-closeness: privacy beyond k-anonymity and l-diversity. In: Proceedings of the 23rd International Conference on Data Engineering (ICDE), pp. 106–115. Istanbul, Turkey (2007)

  21. Lin, J.-L., Wei, M.-C.: An efficient clustering method for k-anonymization. In: Proceedings of the 2008 International Workshop on Privacy and Anonymity in Information Society, Nantes, pp. 29–29, March 2008

  22. Liu, K., Terzi, E.: Towards identity anonymisation on graphs. In: Wang, J.T.L. (ed.) SIGMOD Conference, pp. 93–106. ACM, New York (2008)

  23. Lunacek, M., Whitley, D., Ray, I.: A crossover operator for the k-anonymity problem. In: Genetic and Evolutionary Computation Conference (GECCO), Seattle, Washington, pp. 1713–1720 (2006)

  24. Machanavajjhala, A., Gehrke, J., Kifer, D.: L-diversity: privacy beyond K-anonymity. In: IEEE International Conference on Data Engineering (ICDE), p. 24. Atlanta, Georgia, USA (2006)

  25. Malin, B.: An evaluation of the current state of genomic data privacy protection technology and a roadmap for the future. J. Am. Med. Inform. Assoc. 12(1), 28–34 (2004)

  26. Miklau, G., Suciu, D.: A formal analysis of information disclosure in data exchange. In: ACM Conference on Management of Data (SIGMOD), Paris, pp. 575–586 (2004)

  27. Nergiz, M.E., Clifton, C.: Thoughts on k-anonymisation. In: IEEE 22nd International Conference on Data Engineering Workshops (ICDEW), Atlanta, April 2006, p. 96

  28. Nergiz, M.E., Clifton, C.: Multirelational k-anonymity. In: IEEE 23rd International Conference on Data Engineering Posters, Istanbul, Turkey, April 2007

  29. Nergiz, M.E., Atzori, M., Clifton, C.: Hiding the presence of individuals from shared databases. In: 26th ACM SIGMOD International Conference on Management of Data, Beijing, June 2007

  30. Newman, D.J., Hettich, S., Blake, C.L., Merz, C.J.: UCI repository of machine learning databases. Available online at: www.ics.uci.edu/~mlearn/MLRepository.html (1998)

  31. Pang, R., Paxson, V.: A high-level programming environment for packet trace anonymisation and transformation. In: ACM SIGCOMM, Karlsruhe, Germany (2003)

  32. Parmar, D., Wu, T., Blackhurst, J.: MMR: an algorithm for clustering categorical data using rough set theory. Data Knowl. Eng. 63, 879–893 (2007)

  33. Pawlak, Z.: Rough sets. Int. J. Comput. Inf. Sci. 11, 341–356 (1982)

  34. Pawlak, Z.: Rough Sets: Theoretical Aspects of Reasoning about Data. Kluwer, Dordrecht (1991)

  35. Pearl, J.: Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann, San Mateo (1988)

  36. Samarati, P.: Protecting respondents’ identities in microdata release. IEEE Trans. Knowl. Data Eng. 13(6), 1010–1027 (2001)

  37. Samarati, P., Sweeney, L.: Generalizing data to provide anonymity when disclosing information. In: PODS’98, Seattle, Washington (1998)

  38. Singliar, T., Hauskrecht, M.: Noisy-or component analysis and its application to link analysis. J. Mach. Learn. Res. 7, 2189–2213 (2006)

  39. Stein, R.: Social networks’ sway may be underestimated. Washington Post, 26 May 2008

  40. Sweeney, L.: k-anonymity: a model for protecting privacy. Int. J. Uncertain. Fuzziness Knowl. Based Syst. 10(5), 557–570 (2002)

  41. Sweeney, L.: Achieving k-anonymity privacy protection using generalization and suppression. Int. J. Uncertain. Fuzziness Knowl. Based Syst. 10(5), 571–588 (2002)

  42. Thompson, B., Yao, D.: The union-split algorithm and cluster-based anonymisation of social networks. In: ASIACCS’09, Sydney, 10–12 March 2009

  43. Tripathy, B.K., Lakshmi Janaki, K., Jain, N.: Security against neighborhood attacks in social networks. In: Proceedings of the National Conference on Recent Trends in Soft Computing (NCRTSC’09), pp. 216–223, Bangalore, India (2009)

  44. Tripathy, B.K., Panda, G.K.: A new approach to manage security against neighborhood attacks in social networks. In: Proceedings of the 2010 International Conference on Advances in Social Networks Analysis and Mining, Odense, pp. 264–269 (2010). DOI 10.1109/ASONAM.2010.69

  45. Tripathy, B.K., Panda, G.K., Kumaran, K.: A rough set approach to develop an efficient l-diversity algorithm based on clustering, accepted for presentation at the 2nd IIMA international conference on “Advanced Data Analysis, Business Analytics and Intelligence”, Ahmedabad, January 2011

  46. Tripathy, B.K., Panda, G.K., Kumaran, K.: A fast l-diversity anonymisation algorithm, accepted for presentation at ICCMS 2011, Mumbai, 7–9 January 2011

  47. Tripathy, B.K., Prakash Kumar, Ch.: MMeR: an algorithm for clustering heterogeneous data using rough set theory. Int. J. Rapid Manuf. 1(2), 189–207 (2009)

  48. Truta, T.M., Bindu, V.: Privacy protection: P-sensitive k-anonymity property. In: PDM Workshop, with IEEE International Conference on Data Engineering (ICDE), Atlanta, p. 94 (2006)

  49. Wang, T., Liu, L.: Butterfly: protecting output privacy in stream mining. In: IEEE International Conference on Data Engineering (ICDE), Cancun, pp. 1170–1179 (2008)

  50. Wasserman, S., Faust, K.: Social Network Analysis. Cambridge University Press, Cambridge/New York (1994)

  51. Wong, R.C.W., Li, J., Fu, A.W.C., Wang, K.: (α, k)-anonymity: an enhanced k-anonymity model for privacy-preserving data publishing. In: SIGKDD, pp. 754–759, Philadelphia, PA (2006)

  52. Xu, J., Wang, W., Pei, J., Wang, X., Shi, B., Fu, A.W.C.: Utility based anonymisation using local recoding. In: KDD’06, Philadelphia, PA (2006)

  53. Yan, X., Han, J.: gSpan: graph-based substructure pattern mining. In: ICDM’02, Maebashi City (2002)

  54. Zheleva, E., Getoor, L.: Preserving the privacy of sensitive relationships in graph data. In: ACM SIGKDD Workshop on Privacy, Security, and Trust in KDD (PinKDD), San Jose, pp. 153–177 (2007)

  55. Zhou, B., Pei, J.: Preserving privacy in social networks against neighborhood attacks. In: Proceedings of the 24th International Conference on Data Engineering (ICDE), Cancún, pp. 506–515, Simon Fraser University (2008)

Author information

Correspondence to Bala Krishna Tripathy.

Copyright information

© 2012 Springer-Verlag London

About this chapter

Cite this chapter

Tripathy, B.K. (2012). Anonymisation of Social Networks and Rough Set Approach. In: Abraham, A. (ed.) Computational Social Networks. Springer, London. https://doi.org/10.1007/978-1-4471-4051-1_12

  • DOI: https://doi.org/10.1007/978-1-4471-4051-1_12

  • Publisher Name: Springer, London

  • Print ISBN: 978-1-4471-4050-4

  • Online ISBN: 978-1-4471-4051-1

  • eBook Packages: Computer Science (R0)
