ABSTRACT
Online reviews are a cornerstone of consumer decision making. However, their authenticity and quality have proven hard to control, especially as polluters target reviews to promote products or degrade competitors. In a troubling development, the widespread growth of crowdsourcing platforms like Amazon Mechanical Turk has created a large-scale, potentially difficult-to-detect workforce of malicious review writers. Hence, this paper tackles the challenge of uncovering crowdsourced manipulation of online reviews through a three-part effort: (i) First, we propose a novel sampling method for identifying products that have been targeted for manipulation, along with a seed set of deceptive reviewers who have been enlisted through crowdsourcing platforms. (ii) Second, we augment this base set of deceptive reviewers through a reviewer-reviewer graph clustering approach based on a Markov Random Field, in which we define individual potentials (over single reviewers) and pair potentials (between pairs of reviewers). (iii) Finally, we embed the results of this probabilistic model into a classification framework for detecting crowd-manipulated reviews. We find that the proposed approach achieves up to 0.96 AUC, outperforming both traditional detection methods and a SimRank-based alternative clustering approach.
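The Markov Random Field step described above can be illustrated with a minimal sketch: each reviewer carries an individual potential (per-reviewer evidence of deception) and each reviewer-reviewer edge carries a pair potential penalizing disagreeing labels, and a joint labeling is found by energy minimization. This sketch uses iterated conditional modes with a Potts-style pair potential; the function name, cost values, and inference choice are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch (NOT the paper's implementation): label reviewers as
# deceptive (1) or legitimate (0) by minimizing an MRF-style energy with
# individual potentials (per-reviewer costs) and Potts-style pair potentials
# (a fixed penalty whenever linked reviewers disagree), via iterated
# conditional modes (ICM).

def icm_label(node_cost, edges, pair_weight=2.0, max_iters=20):
    """node_cost[i] = (cost of label 0, cost of label 1) for reviewer i.
    edges = list of (i, j) reviewer-reviewer pairs."""
    n = len(node_cost)
    neighbors = [[] for _ in range(n)]
    for i, j in edges:
        neighbors[i].append(j)
        neighbors[j].append(i)
    # start each reviewer at its individually cheaper label
    labels = [0 if c0 <= c1 else 1 for c0, c1 in node_cost]
    for _ in range(max_iters):
        changed = False
        for i in range(n):
            # pick the label minimizing individual cost + disagreement penalty
            best = min((0, 1), key=lambda L: node_cost[i][L] +
                       sum(pair_weight for j in neighbors[i] if labels[j] != L))
            if best != labels[i]:
                labels[i] = best
                changed = True
        if not changed:  # converged to a local minimum of the energy
            break
    return labels

# Toy example: reviewer 0 has strong individual evidence of deception;
# reviewers 1 and 2 look slightly legitimate on their own, but their
# links to reviewer 0 flip them to deceptive.
costs = [(5.0, 0.5), (1.0, 1.4), (1.0, 1.4)]
edges = [(0, 1), (0, 2)]
print(icm_label(costs, edges))  # → [1, 1, 1]
```

This mirrors the intuition in the abstract: a seed reviewer with strong individual evidence propagates suspicion to connected reviewers through the pair potentials, which is what allows the seed set to be augmented via graph clustering.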
Index Terms: Uncovering Crowdsourced Manipulation of Online Reviews