research-article

Collusive Opinion Fraud Detection in Online Reviews: A Probabilistic Modeling Approach

Published: 24 July 2017

Abstract

We address the collusive opinion fraud problem in online review portals, where groups of people work together to deliver deceptive reviews that manipulate the reputations of targeted items. Such collusive fraud is considered much harder to defend against, since the participants (or colluders) can evade detection by shaping their behaviors collectively so as not to appear suspicious. To alleviate this problem, countermeasures have been proposed that leverage the collective behaviors of colluders. The motivation stems from the observation that colluders typically act in a highly synchronized way, as they are instructed by the same campaigns, with common items to target and schedules to follow. However, the collective behaviors examined in existing solutions focus mostly on the external appearance of fraud campaigns, such as the campaign size and the size of the targeted item set. These signals may become ineffective once colluders change their behaviors collectively. Moreover, the detection algorithms used in existing approaches are designed only to make collusion inferences on the input data; predictive models that can be deployed to detect emerging fraud cannot be learned from the data. In this article, to complement existing studies on collusive opinion fraud characterization and detection, we explore more subtle behavioral trails in collusive fraud practice. In particular, a suite of homogeneity-based measures is proposed to capture the interrelationships among colluders within campaigns. Moreover, a novel statistical model is proposed to further characterize, recognize, and predict collusive fraud in online reviews. The proposed model is fully unsupervised and flexible enough to incorporate any effective measures available for better modeling and prediction. Through experiments on two real-world datasets, we show that our method outperforms the state of the art in both characterization and detection abilities.


      • Published in

        ACM Transactions on the Web, Volume 11, Issue 4
        November 2017
        257 pages
        ISSN: 1559-1131
        EISSN: 1559-114X
        DOI: 10.1145/3127338

        Copyright © 2017 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 24 July 2017
        • Accepted: 1 May 2017
        • Revised: 1 March 2017
        • Received: 1 June 2016

        Qualifiers

        • research-article
        • Research
        • Refereed
