Anonymisation of Social Networks and Rough Set Approach

Tripathy, Bala Krishna

doi:10.1007/978-1-4471-4051-1_12

Abstract

The scientific study of network data can reveal important behaviors of the elements involved, as well as social trends. It also provides insight into suitable changes in the social structure and in the roles of individuals within it. There is considerable evidence (HIPAA 2002; Lambert 1993; Xu et al. 2006) of the value of social network data in shedding light on the social behavior, health, and well-being of the general public. For this purpose, social network information needs to be published, either publicly or to a specialized group. Depending on the privacy model considered, however, this information may include sensitive data about individual participants that should not be disclosed. Social network data therefore need to be anonymized before publication in order to prevent reidentification attacks. Data anonymization techniques are widely used in relational databases (Aggarwal et al. 2005; Backstrom et al. 2007; Bayardo and Agrawal 2005; Bamba et al. 2008; Byun et al. 2007; Campan and Truta 2008; Chakrabarti et al. 2004; Chawla et al. 2005; Evfimievski et al. 2003; Getoor and Diehl 2005; Ghinita et al. 2007; LeFevre et al. 2006; Liu and Terzi 2008; Lunacek et al. 2006; Machanavajjhala et al. 2006; Malin 2004; Nergiz and Clifton 2006, 2007). However, most known anonymization approaches, such as suppression and generalization, do not apply directly to social network data. One major challenge in social network anonymization is computational complexity: it has been proved (Gross and Yellen 2006) that a particular k-anonymity problem, which tries to minimize the structural change to the original social network, is NP-hard. Research on the anonymization of social networks is a relatively new field. In this chapter, we provide a systematic study of the approaches and studies carried out so far in this direction. Social network nodes can certainly have imprecise data as their attributes, and conventional anonymization methods are not suitable for such networks. Recently, an efficient rough set-based algorithm was proposed (Tripathy and Prakash Kumar 2009) for clustering tuples in relational models. We describe how this algorithm can be used for the anonymization of social networks. We also present some recent algorithms that use graph isomorphism for the anonymization of social networks. Finally, we discuss the current status of research on the anonymization of social networks and present some related problems for further study.
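The contrast drawn in the abstract between relational and network anonymization can be illustrated with a minimal Python sketch. It is not the chapter's algorithm: the records, the attribute names (age, zip, condition), the helper functions, and the choice of degree-based anonymity as the structural notion are all assumptions made for illustration. The first part checks k-anonymity of a table over its quasi-identifiers before and after generalizing ages into ranges; the second part shows that a graph with all attributes removed can still single out a node through its degree alone.

```python
from collections import Counter


def is_k_anonymous(rows, quasi_identifiers, k):
    """True if every quasi-identifier combination occurs in at least k rows."""
    groups = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return all(count >= k for count in groups.values())


def generalize_age(row, width=10):
    """Replace an exact age with a coarse range such as '30-39'."""
    low = (row["age"] // width) * width
    return {**row, "age": f"{low}-{low + width - 1}"}


def is_k_degree_anonymous(edges, k):
    """True if every degree value is shared by at least k nodes.

    Only nodes that appear in some edge are considered.
    """
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    nodes_per_degree = Counter(degree.values())
    return all(count >= k for count in nodes_per_degree.values())


# Hypothetical records; "age" and "zip" act as quasi-identifiers.
rows = [
    {"age": 31, "zip": "632014", "condition": "flu"},
    {"age": 35, "zip": "632014", "condition": "asthma"},
    {"age": 42, "zip": "632014", "condition": "flu"},
    {"age": 47, "zip": "632014", "condition": "diabetes"},
]

raw_ok = is_k_anonymous(rows, ["age", "zip"], k=2)
generalized = [generalize_age(r) for r in rows]
generalized_ok = is_k_anonymous(generalized, ["age", "zip"], k=2)

# A star graph with no attributes at all: the hub is the only node of degree 3.
star = [("hub", "a"), ("hub", "b"), ("hub", "c")]
star_ok = is_k_degree_anonymous(star, k=2)

print(raw_ok, generalized_ok, star_ok)  # False True False
```

Generalizing the age column is enough to make the table 2-anonymous, whereas the star graph fails the degree check even though it carries no attribute values, which is one way to see why attribute-level suppression and generalization alone may not protect network data.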

References

Aggarwal, G., Feder, T., Kenthapadi, K., Motwani, R., Panigrahy, R., Thomas, D., Zhu, A.: Approximation algorithms for k-anonymity. J. Priv. Technol. pp. 1–8 (2005)
Backstrom, L., Dwork, C., Kleinberg, J.: Wherefore art thou R3579X? Anonymized social networks, hidden patterns, and structural steganography. In: International World Wide Web Conference (WWW), pp. 181–190. ACM, New York (2007)
Bader, D.A., Madduri, K.: GTGraph: a synthetic graph generator suite. Available online http://www.cc.gatech.edu/~kamesh/GTgraph/ (2006)
Bamba, B., Liu, L., Pesti, P., Wang, T.: Supporting anonymous location queries in mobile environments with privacy grid. In: ACM World Wide Web Conference, Beijing, China (2008)
Bayardo, R., Agrawal, R.: Data privacy through optimal k-anonymisation. In: IEEE 21st International Conference on Data Engineering, Tokyo, Japan, April 2005
Byun, J.W., Kamra, A., Bertino, E., Li, N.: Efficient k-anonymisation using clustering techniques. In: International Conference on Database Systems for Advanced Applications (DASFAA), pp. 188–200. Bangkok, Thailand (2007)
Campan, A., Truta, T.M.: A clustering approach for data and structural anonymity in social networks. In: ACM SIGKDD Workshop on Privacy, Security, and Trust in KDD (PinKDD), Las Vegas (2008)
Chakrabarti, D., Zhan, Y., Faloutsos, C.: R-MAT: a recursive model for graph mining. In: SIAM International Conference on Data Mining, Florida, USA (2004)
Chawla, S., Dwork, C., McSherry, F., Smith, A., Wee, H.: Toward privacy in public databases. In: Proceedings of the Theory of Cryptography Conference, Cambridge, MA (2005)
Evfimievski, A., Gehrke, J., Srikant, R.: Limiting privacy breaches in privacy preserving data mining. In: ACM Principles of Database Systems (PODS), pp. 211–222. San Diego, CA (2003)
Getoor, L., Diehl, C.P.: Link mining: a survey. SIGKDD Explor. Newsl. 7(2), 3–12 (2005)
Ghinita, G., Karras, P., Kalnis, P., Mamoulis, N.: Fast data anonymisation with low information loss. In: Very Large Data Base Conference (VLDB), pp. 758–769, Vienna (2007)
Gross, J., Yellen, J.: Graph Theory and Its Applications. CRC, Boca Raton (2006)
Han, J., Kamber, M.: Data Mining: Concepts and Techniques, 2nd edn. Morgan Kaufmann, San Francisco, CA (2006)
Hay, M., Miklau, G., Jensen, D., Weiss, P., Srivastava, S.: Anonymising social networks. Technical Report No. 07–19, University of Massachusetts Amherst (2007)
HIPAA: Health insurance portability and accountability act. Available online http://www.hhs.gov/ocr/hipaa (2002)
Horowitz, E., Sahni, S., Rajasekaran, S.: Fundamentals of Computer Algorithms. Galgotia Publications, Darya Ganj, New Delhi, India (2004)
Lambert, D.: Measures of disclosure risk and harm. J. Off. Stat. 9, 313–331 (1993)
LeFevre, K., DeWitt, D., Ramakrishnan, R.: Mondrian multidimensional K-anonymity. In: IEEE International Conference on Data Engineering (ICDE), p. 25, Atlanta, Georgia, USA (2006)
Li, N., Li, T., Venkatasubramanian, S.: t-closeness: privacy beyond k-anonymity and l-diversity. In: Proceedings of the 23rd International Conference on Data Engineering (ICDE), pp. 106–115, Istanbul, Turkey (2007)
Lin, J.-L., Wei, M.-C.: An efficient clustering method for k-anonymization. In: Proceedings of the 2008 International Workshop on Privacy and Anonymity in Information Society, Nantes, pp. 29–29, March 2008
Liu, K., Terzi, E.: Towards identity anonymisation on graphs. In: Wang, J.T.L. (ed.) SIGMOD Conference, pp. 93–106. ACM, New York (2008)
Lunacek, M., Whitley, D., Ray, I.: A crossover operator for the k-anonymity problem. In: Genetic and Evolutionary Computation Conference (GECCO), Seattle, Washington, pp. 1713–1720 (2006)
Machanavajjhala, A., Gehrke, J., Kifer, D.: L-diversity: privacy beyond K-anonymity. In: IEEE International Conference on Data Engineering (ICDE), p. 24, Atlanta, Georgia, USA (2006)
Malin, B.: An evaluation of the current state of genomic data privacy protection technology and a roadmap for the future. J. Am. Med. Inform. Assoc. 12(1), 28–34 (2004)
Miklau, G., Suciu, D.: A formal analysis of information disclosure in data exchange. In: ACM Conference on Management of Data (SIGMOD), Paris, pp. 575–586 (2004)
Nergiz, M.E., Clifton, C.: Thoughts on k-anonymisation. In: IEEE 22nd International Conference on Data Engineering Workshops (ICDEW), Atlanta, April 2006, p. 96
Nergiz, M.E., Clifton, C.: Multirelational k-anonymity. In: IEEE 23rd International Conference on Data Engineering Posters, Istanbul, Turkey, April 2007
Nergiz, M.E., Atzori, M., Clifton, C.: Hiding the presence of individuals from shared databases. In: 26th ACM SIGMOD International Conference on Management of Data, Beijing, June 2007
Newman, D.J., Hettich, S., Blake, C.L., Merz, C.J.: UCI repository of machine learning databases. Available online at: www.ics.uci.edu/~mlearn/MLRepository.html (1998)
Pang, R., Paxson, V.: A high-level programming environment for packet trace anonymisation and transformation. In: ACM SIGCOMM, Karlsruhe, Germany (2003)
Parmar, D., Wu, T., Blackhurst, J.: MMR: an algorithm for clustering categorical data using rough set theory. Data Knowl. Eng. 63, 879–893 (2007)
Pawlak, Z.: Rough sets. Int. J. Comput. Inf. Sci. 11, 341–356 (1982)
Pawlak, Z.: Rough Sets: Theoretical Aspects of Reasoning about Data. Kluwer, Dordrecht (1991)
Pearl, J.: Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Kaufmann, San Mateo (1988)
Samarati, P.: Protecting respondents’ identities in microdata release. IEEE Trans. Knowl. Data Eng. 13(6), 1010–1027 (2001)
Samarati, P., Sweeney, L.: Generalizing data to provide anonymity when disclosing information. In: PODS’98, Seattle, Washington (1998)
Singliar, T., Hauskrecht, M.: Noisy-or component analysis and its application to link analysis. J. Mach. Learn. Res. 7, 2189–2213 (2006)
Stein, R.: Social networks’ sway may be underestimated. Washington Post, 26 May 2008
Sweeney, L.: k-anonymity: a model for protecting privacy. Int. J. Uncertain. Fuzziness Knowl. Based Syst. 10(5), 557–570 (2002)
Sweeney, L.: Achieving k-anonymity privacy protection using generalization and suppression. Int. J. Uncertain. Fuzziness Knowl. Based Syst. 10(5), 571–588 (2002)
Thompson, B., Yao, D.: The union-split algorithm and cluster-based anonymisation of social networks. In: ASIACCS’09, Sydney, 10–12 March 2009
Tripathy, B.K., Lakshmi Janaki, K., Jain, N.: Security against neighborhood attacks in social networks. In: Proceedings of the National Conference on Recent Trends in Soft Computing (NCRTSC’09), pp. 216–223, Bangalore, India (2009)
Tripathy, B.K., Panda, G.K.: A new approach to manage security against neighborhood attacks in social networks. In: Proceedings of the 2010 International Conference on Advances in Social Networks Analysis and Mining, Odense, pp. 264–269 (2010). DOI 10.1109/ASNOM.2010.69
Tripathy, B.K., Panda, G.K., Kumaran, K.: A rough set approach to develop an efficient l-diversity algorithm based on clustering, accepted for presentation at the 2nd IIMA international conference on “Advanced Data Analysis, Business Analytics and Intelligence”, Ahmedabad, January 2011
Tripathy, B.K., Panda, G.K., Kumaran, K.: A fast l-diversity anonymisation algorithm, accepted for presentation at ICCMS 2011, Mumbai, 7–9 January 2011
Tripathy, B.K., Prakash Kumar, Ch.: MMeR: an algorithm for clustering heterogeneous data using rough set theory. Int. J. Rapid Manuf. 1(2), 189–207 (2009)
Truta, T.M., Bindu, V.: Privacy protection: P-sensitive k-anonymity property. In: PDM Workshop, with IEEE International Conference on Data Engineering (ICDE), Atlanta, p. 94 (2006)
Wang, T., Liu, L.: Butterfly: protecting output privacy in stream mining. In: IEEE International Conference on Data Engineering (ICDE), Cancun, pp. 1170–1179 (2008)
Wasserman, S., Faust, K.: Social Network Analysis. Cambridge University Press, Cambridge/New York (1994)
Wong, R.C.W., Li, J., Fu, A.W.C., Wang, K.: (α, k)-anonymity: an enhanced k-anonymity model for privacy-preserving data publishing. In: SIGKDD, pp. 754–759, Philadelphia, PA (2006)
Xu, J., Wang, W., Pei, J., Wang, X., Shi, B., Fu, A.W.C.: Utility based anonymisation using local recoding. In: KDD’06, Philadelphia, PA (2006)
Yan, X., Han, J.: gSpan: graph-based substructure pattern mining. In: ICDM’02, Maebashi City (2002)
Zheleva, E., Getoor, L.: Preserving the privacy of sensitive relationships in graph data. In: ACM SIGKDD Workshop on Privacy, Security, and Trust in KDD (PinKDD), San Jose, pp. 153–177 (2007)
Zhou, B., Pei, J.: Preserving privacy in social networks against neighborhood attacks. In: Proceedings of the 24th International Conference on Data Engineering (ICDE), pp. 506–515, Cancún (2008)

Author information

Authors and Affiliations

School of Computing Science and Engineering, VIT University, Vellore, Tamil Nadu, 632 014, India
Bala Krishna Tripathy (Senior Professor)

Corresponding author

Correspondence to Bala Krishna Tripathy.

Editor information

Editors and Affiliations

Auburn, 98071, USA
Ajith Abraham

About this chapter

Cite this chapter

Tripathy, B.K. (2012). Anonymisation of Social Networks and Rough Set Approach. In: Abraham, A. (ed.) Computational Social Networks. Springer, London. https://doi.org/10.1007/978-1-4471-4051-1_12

DOI: https://doi.org/10.1007/978-1-4471-4051-1_12
Published: 20 June 2012
Publisher Name: Springer, London
Print ISBN: 978-1-4471-4050-4
Online ISBN: 978-1-4471-4051-1
eBook Packages: Computer Science, Computer Science (R0)
