Abstract
Online forums are rich sources of information about user communication activity over time. Finding temporal patterns in online forum communication threads can advance our understanding of the dynamics of conversations. The main challenge of temporal analysis in this context is the complexity of forum data. There can be thousands of interacting users, who can be numerically described in many different ways. Moreover, user characteristics can evolve over time. We propose an approach that decouples temporal information about users into sequences of user events and inter-event times. We develop a new feature space to represent the event sequences as paths, and we model the distribution of the inter-event times. We study over 30,000 users across four Internet forums, and discover novel patterns in user communication. We find that users tend to exhibit consistency over time. Furthermore, in our feature space, we observe regions that represent unlikely user behaviors. Finally, we show how to derive a numerical representation for each forum, and we then use this representation to derive a novel clustering of multiple forums.
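The decoupling step described above can be illustrated with a minimal sketch. It assumes that each forum post is available as a (user, timestamp, event label) record; the record layout, the event labels, and the function name `decouple` are hypothetical choices for illustration and are not taken from the article. For every user, the sketch separates the ordered sequence of events from the gaps between consecutive events, so that the two can be analysed independently.

```python
from collections import defaultdict


def decouple(posts):
    """Split each user's activity into (event sequence, inter-event times).

    `posts` is an iterable of (user, timestamp, event_label) tuples.
    This record layout is an assumption made for illustration only.
    """
    by_user = defaultdict(list)
    for user, timestamp, label in posts:
        by_user[user].append((timestamp, label))

    result = {}
    for user, items in by_user.items():
        # Order each user's events chronologically.
        items.sort(key=lambda item: item[0])
        # The event sequence keeps only the order of events, not their timing.
        events = [label for _, label in items]
        # Inter-event times are the gaps between consecutive events.
        gaps = [later[0] - earlier[0] for earlier, later in zip(items, items[1:])]
        result[user] = (events, gaps)
    return result


# Hypothetical toy data: timestamps in arbitrary time units.
posts = [
    ("alice", 10.0, "start_thread"),
    ("bob", 12.5, "reply"),
    ("alice", 40.0, "reply"),
    ("alice", 45.0, "reply"),
]
decoupled = decouple(posts)
```

In this sketch a user with a single event has an empty list of inter-event times. The feature space for event sequences and the model for the inter-event time distribution mentioned in the abstract are not reproduced here.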
Additional information
Part of this work was carried out by Andrey Kan and Jeffrey Chan while at the Digital Enterprise Research Institute, National University of Ireland, Galway, Ireland.
About this article
Cite this article
Kan, A., Chan, J., Hayes, C. et al. A time decoupling approach for studying forum dynamics. World Wide Web 16, 595–620 (2013). https://doi.org/10.1007/s11280-012-0169-1
DOI: https://doi.org/10.1007/s11280-012-0169-1