Automated Text Analysis

  • Living reference work entry
Handbook of Market Research

Abstract

The amount of text available for analysis by marketing researchers has grown exponentially in the last two decades. Consumer reviews, message board forums, and social media feeds are just a few sources of data about consumer thought, interaction, and culture. However, written language is filled with complex meaning, ambiguity, and nuance. How can marketing researchers transform this rich linguistic representation into quantifiable data for statistical analysis and modeling? This chapter provides an introduction to text analysis, covering approaches that range from top-down deductive methods to bottom-up inductive methods for text mining. After covering some foundational aspects of text analysis, the chapter summarizes and explores applications to marketing research, such as sentiment analysis, topic modeling, and the study of organizational communication, and includes a case study of word-of-mouth response to a product launch.

Author information

Correspondence to Ashlee Humphreys.

Copyright information

© 2019 Springer Nature Switzerland AG

About this entry

Cite this entry

Humphreys, A. (2019). Automated Text Analysis. In: Homburg, C., Klarmann, M., Vomberg, A. (eds) Handbook of Market Research. Springer, Cham. https://doi.org/10.1007/978-3-319-05542-8_26-1

  • DOI: https://doi.org/10.1007/978-3-319-05542-8_26-1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-05542-8

  • Online ISBN: 978-3-319-05542-8

  • eBook Packages: Springer Reference Business and Management; Reference Module Humanities and Social Sciences; Reference Module Business, Economics and Social Sciences
