DOI: 10.1145/2766462.2767742
research-article

Uncovering Crowdsourced Manipulation of Online Reviews

Published: 09 August 2015

ABSTRACT

Online reviews are a cornerstone of consumer decision making. However, their authenticity and quality have proven hard to control, especially as polluters target these reviews to promote products or degrade competitors. In a troubling direction, the widespread growth of crowdsourcing platforms like Mechanical Turk has created a large-scale, potentially difficult-to-detect workforce of malicious review writers. Hence, this paper tackles the challenge of uncovering crowdsourced manipulation of online reviews through a three-part effort: (i) First, we propose a novel sampling method for identifying products that have been targeted for manipulation and a seed set of deceptive reviewers who have been enlisted through crowdsourcing platforms. (ii) Second, we augment this base set of deceptive reviewers through a reviewer-reviewer graph clustering approach based on a Markov Random Field in which we define individual potentials (of single reviewers) and pair potentials (between two reviewers). (iii) Finally, we embed the results of this probabilistic model into a classification framework for detecting crowd-manipulated reviews. We find that the proposed approach achieves up to 0.96 AUC, outperforming both traditional detection methods and a SimRank-based alternative clustering approach.
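To make step (ii) more concrete, the following is a minimal sketch of a pairwise Markov Random Field over a reviewer-reviewer graph. It is not the paper's model: the abstract does not specify the form of the potentials or the inference procedure, so the function name `mrf_marginals`, the prior values in `unary`, the edge weights, and the `coupling` parameter are all hypothetical. Inference here is exact enumeration, which is feasible only for a handful of reviewers.

```python
from itertools import product
import math


def mrf_marginals(unary, edges, coupling=1.0):
    """Exact marginals P(reviewer i is deceptive) for a tiny pairwise MRF.

    unary    : hypothetical prior in (0, 1) that each reviewer is deceptive
               (plays the role of the individual potential)
    edges    : dict mapping (i, j) -> similarity weight (hypothetical);
               plays the role of the pair potential
    coupling : how strongly linked reviewers are pushed to share a label
    """
    n = len(unary)
    totals = [0.0] * n  # unnormalized mass of labelings where i is deceptive
    z = 0.0             # partition function
    for labels in product((0, 1), repeat=n):
        # Individual potentials: one log term per reviewer.
        log_score = sum(
            math.log(unary[i] if labels[i] else 1.0 - unary[i])
            for i in range(n)
        )
        # Pair potentials: reward linked reviewers that share a label.
        log_score += sum(
            coupling * w
            for (i, j), w in edges.items()
            if labels[i] == labels[j]
        )
        score = math.exp(log_score)
        z += score
        for i in range(n):
            if labels[i]:
                totals[i] += score
    return [t / z for t in totals]


# Toy graph: reviewer 0 is a likely seed, reviewer 3 is likely legitimate,
# reviewers 1 and 2 start with an uninformative prior of 0.5.
marginals = mrf_marginals(
    unary=[0.9, 0.5, 0.5, 0.1],
    edges={(0, 1): 1.0, (2, 3): 1.0},
    coupling=2.0,
)
```

In this toy graph, reviewer 1 is linked to the likely-deceptive seed and its marginal rises above 0.5, while reviewer 2 is linked to the likely-legitimate reviewer and its marginal falls below 0.5. That is the intuition behind augmenting a seed set through pair potentials. A graph of realistic size would need approximate inference rather than enumeration.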


      • Published in
        SIGIR '15: Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval
        August 2015
        1198 pages
        ISBN:9781450336215
        DOI:10.1145/2766462

        Copyright © 2015 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

        Publisher

        Association for Computing Machinery

        New York, NY, United States


        Acceptance Rates

        SIGIR '15 Paper Acceptance Rate: 70 of 351 submissions, 20%
        Overall Acceptance Rate: 792 of 3,983 submissions, 20%
