Abstract
In this work, we present a method for classifying the quality of blog comments using linear-chain Conditional Random Fields (CRFs). The approach yields high accuracy on the binary task of identifying high-quality comments, and conversational features contribute strongly to that accuracy. We also present a new corpus of blog data in conversational form, drawn from the science and technology news blog Slashdot and complete with user-generated quality moderation labels.
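As a rough illustration of why a linear-chain model can differ from classifying each comment in isolation, the sketch below runs Viterbi decoding over a short thread. It is not the authors' implementation: the label names, the per-comment scores, the transition weights, and the assumption that the chain follows thread order are all hypothetical, and a real CRF would learn these weights from features rather than take them as given.

```python
from typing import Dict, List, Tuple

LABELS = ("low", "high")  # hypothetical binary quality labels


def viterbi(
    emissions: List[Dict[str, float]],
    transitions: Dict[Tuple[str, str], float],
) -> List[str]:
    """Return the highest-scoring label sequence for a linear chain.

    emissions[i][y] is an (assumed, already computed) score for giving
    comment i the label y; transitions[(p, y)] scores label p followed by y.
    """
    if not emissions:
        return []
    # best[i][y] = best score of any labeling of comments 0..i ending in y
    best = [{y: emissions[0][y] for y in LABELS}]
    back: List[Dict[str, str]] = []
    for scores in emissions[1:]:
        prev = best[-1]
        current: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for y in LABELS:
            # Choose the predecessor label that maximizes the running score.
            p = max(LABELS, key=lambda q: prev[q] + transitions[(q, y)])
            current[y] = prev[p] + transitions[(p, y)] + scores[y]
            pointers[y] = p
        best.append(current)
        back.append(pointers)
    # Trace back from the best final label.
    label = max(LABELS, key=lambda y: best[-1][y])
    path = [label]
    for pointers in reversed(back):
        label = pointers[label]
        path.append(label)
    return path[::-1]


# Made-up scores for a three-comment thread (illustrative only).
thread_scores = [
    {"low": 0.2, "high": 1.0},
    {"low": 0.6, "high": 0.5},
    {"low": 1.5, "high": 0.1},
]
# Made-up weights that reward neighbouring comments sharing a label.
sticky = {
    ("low", "low"): 0.5, ("low", "high"): -0.5,
    ("high", "low"): -0.5, ("high", "high"): 0.5,
}
independent = {pair: 0.0 for pair in sticky}

print(viterbi(thread_scores, independent))
print(viterbi(thread_scores, sticky))
```

With all transition weights at zero, the decoder reduces to picking the best label for each comment separately; with the "sticky" weights, the label of one comment can be changed by its neighbours, which is the kind of sequential dependency a linear-chain CRF is able to model.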
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
FitzGerald, N., Carenini, G., Murray, G., Joty, S. (2011). Exploiting Conversational Features to Detect High-Quality Blog Comments. In: Butz, C., Lingras, P. (eds) Advances in Artificial Intelligence. Canadian AI 2011. Lecture Notes in Computer Science, vol 6657. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-21043-3_15
DOI: https://doi.org/10.1007/978-3-642-21043-3_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-21042-6
Online ISBN: 978-3-642-21043-3
eBook Packages: Computer Science (R0)