Abstract
Scientific study of network data can reveal important behaviors of the elements involved as well as broader social trends. It also offers insight into suitable changes in the social structure and in the roles individuals play within it. There is considerable evidence (HIPAA 2002; Lambert 1993; Xu et al. 2006) of the value of social network data in shedding light on the social behavior, health, and well-being of the general public. For this purpose, social network information needs to be published, either publicly or to a specialized group. Depending on the privacy model considered, however, this information may include sensitive data about individual participants in the social network that should not be disclosed. Social network data therefore need to be anonymized before publication in order to prevent potential re-identification attacks. Data anonymization techniques are widely used in relational databases (Aggarwal et al. 2005; Backstrom et al. 2007; Bayardo and Agrawal 2005; Bamba et al. 2008; Byun et al. 2007; Campan and Truta 2008; Chakrabarti et al. 2004; Chawla et al. 2005; Evfimievski et al. 2003; Getoor and Diehl 2005; Ghinita et al. 2007; Lefebvre et al. 2006; Liu and Terzi 2008; Lunacek et al. 2006; Machanavajjhala et al. 2006; Malin 2004; Nergiz and Clifton 2006; Nergiz and Clifton 2007). However, most of the known anonymization approaches, such as suppression or generalization, do not apply directly to social network data. One major challenge in social network anonymization is complexity: it has been proved in (Gross and Yellen 2006) that a particular k-anonymity problem, which tries to minimize the structural change to the original social network, is NP-hard. Research on the anonymization of social networks is a relatively new field. In this chapter, we provide a systematic study of the approaches and studies carried out so far in this direction. Social network nodes can clearly have imprecise data as their attributes, so the standard methods proposed for anonymization are not suitable for such networks. Recently, an efficient rough set-based algorithm was proposed in (Tripathy and Prakash Kumar 2009) to handle the clustering of tuples in relational models. We describe how this algorithm can be used for the anonymization of social networks. We also present some recent algorithms that use graph isomorphism for the anonymization of social networks. Finally, we discuss the current status of research on the anonymization of social networks and present some related problems for further study.
References
Aggarwal, G., Feder, T., Kenthapadi, K., Motwani, R., Panigrahy, R., Thomas, D., Zhu, A.: Approximation algorithms for k-anonymity. J. Priv. Technol., pp. 1–8 (2005)
Backstrom, L., Dwork, C., Kleinberg, J.: Wherefore art thou R3579X? Anonymized social networks, hidden patterns, and structural steganography. In: International World Wide Web Conference (WWW), pp. 181–190. ACM, New York (2007)
Bader, D.A., Madduri, K.: GTGraph: a synthetic graph generator suite. Available online http://www.cc.gatech.edu/~kamesh/GTgraph/ (2006)
Bamba, B., Liu, L., Pesti, P., Wang, T.: Supporting anonymous location queries in mobile environments with privacy grid. In: ACM World Wide Web Conference, Beijing, China (2008)
Bayardo, R., Agrawal, R.: Data privacy through optimal k-anonymisation. In: IEEE 21st International Conference on Data Engineering, Tokyo, Japan, April 2005
Byun, J.W., Kamra, A., Bertino, E., Li, N.: Efficient k-anonymisation using clustering techniques. In: International Conference on Database Systems for Advanced Applications (DASFAA), pp. 188–200. Bangkok, Thailand (2007)
Campan, A., Truta, T.M.: A clustering approach for data and structural anonymity in social networks. In: ACM SIGKDD Workshop on Privacy, Security, and Trust in KDD (PinKDD), Las Vegas (2008)
Chakrabarti, D., Zhan, Y., Faloutsos, C.: R-MAT: a recursive model for graph mining. In: SIAM International Conference on Data Mining, Florida, USA (2004)
Chawla, S., Dwork, C., McSherry, F., Smith, A., Wee, H.: Toward privacy in public databases. In: Proceedings of the Theory of Cryptography Conference, Cambridge, MA (2005)
Evfimievski, A., Gehrke, J., Srikant, R.: Limiting privacy breaches in privacy preserving data mining. In: ACM Principles of Database Systems (PODS), pp. 211–222. San Diego, CA (2003)
Getoor, L., Diehl, C.P.: Link mining: a survey. SIGKDD Explor. Newsl. 7(2), 3–12 (2005)
Ghinita, G., Karras, P., Kalnis, P., Mamoulis, N.: Fast data anonymisation with low information loss. In: Very Large Data Base Conference (VLDB), pp. 758–769, Vienna (2007)
Gross, J., Yellen, J.: Graph Theory and Its Applications. CRC, Boca Raton (2006)
Han, J., Kamber, M.: Data Mining: Concepts and Techniques, 2nd edn. Morgan Kaufmann, San Francisco, CA (2006)
Hay, M., Miklau, G., Jensen, D., Weiss, P., Srivastava, S.: Anonymising social networks. Technical Report No. 07–19, University of Massachusetts Amherst (2007)
HIPAA: Health insurance portability and accountability act. Available online http://www.hhs.gov/ocr/hipaa (2002)
Horowitz, E., Sahni, S., Rajasekaran, S.: Fundamentals of Computer Algorithms. Galgotia Publications, Darya Ganj, New Delhi, India (2004)
Lambert, D.: Measures of disclosure risk and harm. J. Off. Stat. 9, 313–331 (1993)
Lefebvre, K., DeWitt, D., Ramakrishnan, R.: Mondrian multidimensional K-anonymity. In: IEEE International Conference of Data Engineering (ICDE), p. 25, Atlanta, Georgia, USA (2006)
Li, N., Li, T., Venkitasubramaniam, S.: t-closeness: privacy beyond k-anonymity and l-diversity. In: Proceedings of the 23rd International Conference on Data Engineering (ICDE), pp. 106–115, Istanbul, Turkey (2007)
Lin, J.-L., Wei, M.-C.: An efficient clustering method for k-anonymization. In: Proceedings of the 2008 International Workshop on Privacy and Anonymity in Information Society, Nantes, pp. 29–29, March 2008
Liu, K., Terzi, E.: Towards identity anonymisation on graphs. In: Wang, J.T.L., (ed.) SIGMOD Conference, pp. 93–106. ACM, New York (2008)
Lunacek, M., Whitley, D., Ray, I.: A crossover operator for the k-anonymity problem. In: Genetic and Evolutionary Computation Conference (GECCO), Seattle, Washington, pp. 1713–1720 (2006)
Machanavajjhala, A., Gehrke, J., Kifer, D.: L-diversity: privacy beyond K-anonymity. In: IEEE International Conference on Data Engineering (ICDE), p. 24, Atlanta, Georgia, USA (2006)
Malin, B.: An evaluation of the current state of genomic data privacy protection technology and a roadmap for the future. J. Am. Med. Inform. Assoc. 12(1), 28–34 (2004)
Miklau, G., Suciu, D.: A formal analysis of information disclosure in data exchange. In: ACM Conference on Management of Data (SIGMOD), Paris, pp. 575–586 (2004)
Nergiz, M.E., Clifton, C.: Thoughts on k-anonymisation. In: IEEE 22nd International Conference on Data Engineering Workshops (ICDEW), Atlanta, April 2006, p. 96
Nergiz, M.E., Clifton, C.: Multirelational k-anonymity. In: IEEE 23rd International Conference on Data Engineering Posters, Istanbul, Turkey, April 2007
Nergiz, M.E., Atzori, M., Clifton, C.: Hiding the presence of individuals from shared databases. In: 26th ACM SIGMOD International Conference on Management of Data, Beijing, June 2007
Newman, D.J., Hettich, S., Blake, C.L., Merz, C.J.: UCI repository of machine learning databases. Available online at: www.ics.uci.edu/~mlearn/MLRepository.html (1998)
Pang, R., Paxson, V.: A high-level programming environment for packet trace anonymisation and transformation. In: ACM SIGCOMM, Karlsruhe, Germany (2003)
Parmar, D., Wu, T., Blackhurst, J.: MMR: an algorithm for clustering categorical data using rough set theory. Data Knowl. Eng. 63, 879–893 (2007)
Pawlak, Z.: Rough sets. Int. J. Comput. Inf. Sci. 11, 341–356 (1982)
Pawlak, Z.: Rough Sets: Theoretical Aspects of Reasoning about Data. Kluwer, Dordrecht (1991)
Pearl, J.: Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Kaufmann, San Mateo (1988)
Samarati, P.: Protecting respondents identities in microdata release. IEEE Trans. Knowl. Data Eng. 13(6), 1010–1027 (2001)
Samarati, P., Sweeney, L.: Generalizing data to provide anonymity when disclosing information. In: PODS’98, Seattle, Washington (1998)
Singliar, T., Hauskrecht, M.: Noisy-or component analysis and its application to link analysis. J. Mach. Learn. Res. 7, 2189–2213 (2006)
Stein, R.: Social networks’ sway may be underestimated. Washington Post, 26 May 2008
Sweeney, L.: k-anonymity: a model for protecting privacy. Int. J. Uncertain. Fuzziness Knowl. Based Syst. 10(5), 557–570 (2002)
Sweeney, L.: Achieving k-anonymity privacy protection using generalization and suppression. Int. J. Uncertain. Fuzziness Knowl. Based Syst. 10(5), 571–588 (2002)
Thompson, B., Yao, D.: The union-split algorithm and cluster-based anonymisation of social networks. In: ASIACCS’09, Sydney, 10–12 March 2009
Tripathy, B.K., Lakshmi Janaki, K., Jain, N.: Security against neighborhood attacks in social networks. In: Proceedings of the National Conference on Recent Trends in Soft Computing (NCRTSC’09), pp. 216–223, Bangalore, India (2009)
Tripathy, B.K., Panda, G.K.: A new approach to manage security against neighborhood attacks in social networks. In: Proceedings of the 2010 International Conference on Advances in Social Networks Analysis and Mining, Odense, pp. 264–269 (2010). DOI 10.1109/ASONAM.2010.69
Tripathy, B.K., Panda, G.K., Kumaran, K.: A rough set approach to develop an efficient l-diversity algorithm based on clustering, accepted for presentation at the 2nd IIMA international conference on “Advanced Data Analysis, Business Analytics and Intelligence”, Ahmedabad, January 2011
Tripathy, B.K., Panda, G.K., Kumaran, K.: A fast l-diversity anonymisation algorithm, accepted for presentation at ICCMS 2011, Mumbai, 7–9 January 2011
Tripathy, B.K., Prakash Kumar, Ch.: MMeR: an algorithm for clustering heterogeneous data using rough set theory. Int. J. Rapid Manuf. 1(2), 189–207 (2009)
Truta, T.M., Bindu, V.: Privacy protection: P-sensitive k-anonymity property. In: PDM Workshop, with IEEE International Conference on Data Engineering (ICDE), Atlanta, p. 94 (2006)
Wang, T., Liu, L.: Butterfly: protecting output privacy in stream mining. In: IEEE International Conference on Data Engineering (ICDE), Cancun, pp. 1170–1179 (2008)
Wasserman, S., Faust, K.: Social Network Analysis. Cambridge University Press, Cambridge/New York (1994)
Wong, R.C.W., Li, J., Fu, A.W.C., Wang, K.: (α, k)-anonymity: an enhanced k-anonymity model for privacy-preserving data publishing. In: SIGKDD, pp. 754–759, Philadelphia, PA (2006)
Xu, J., Wang, W., Pei, J., Wang, X., Shi, B., Fu, A.W.C.: Utility based anonymisation using local recoding. In: KDD’06, Philadelphia, PA (2006)
Yan, X., Han, J.: gSpan: graph-based substructure pattern mining. In: ICDM’02, Maebashi City (2002)
Zheleva, E., Getoor, L.: Preserving the privacy of sensitive relationships in graph data. In: ACM SIGKDD Workshop on Privacy, Security, and Trust in KDD (PinKDD), San Jose, pp. 153–177 (2007)
Zhou, B., Pei, J.: Preserving privacy in social networks against neighborhood attacks. In: Proceedings of the 24th International Conference on Data Engineering (ICDE), Cancún, pp. 506–515 (2008)
Copyright information
© 2012 Springer-Verlag London
About this chapter
Cite this chapter
Tripathy, B.K. (2012). Anonymisation of Social Networks and Rough Set Approach. In: Abraham, A. (eds) Computational Social Networks. Springer, London. https://doi.org/10.1007/978-1-4471-4051-1_12
DOI: https://doi.org/10.1007/978-1-4471-4051-1_12
Publisher Name: Springer, London
Print ISBN: 978-1-4471-4050-4
Online ISBN: 978-1-4471-4051-1
eBook Packages: Computer Science, Computer Science (R0)