Abstract
The rise of online social media is providing a wealth of social network data. Data mining techniques give researchers and practitioners the tools needed to analyze large, complex, and frequently changing social media data. This chapter introduces the basics of data mining, reviews social media, discusses how to mine social media data, and highlights illustrative examples, with an emphasis on social networking sites and blogs.
Copyright information
© 2011 Springer Science+Business Media, LLC
About this chapter
Cite this chapter
Barbier, G., Liu, H. (2011). Data Mining in Social Media. In: Aggarwal, C. (ed.) Social Network Data Analytics. Springer, Boston, MA. https://doi.org/10.1007/978-1-4419-8462-3_12
DOI: https://doi.org/10.1007/978-1-4419-8462-3_12
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4419-8461-6
Online ISBN: 978-1-4419-8462-3
eBook Packages: Computer Science (R0)