Abstract
The availability of millions of products and services on e-commerce sites makes it difficult to find the most suitable product for a given set of requirements, because so many alternatives exist. The most popular and useful way to deal with this is to follow the reviews of others on opinionated social media who have already tried the products. Almost all e-commerce sites let users share their views on and experience of the products and services they have used. Customer reviews are increasingly used by individuals, manufacturers, and retailers for purchase and business decisions. As there is no scrutiny of the reviews received, anybody can write anything anonymously, which ultimately leads to review spam. Moreover, driven by the desire for profit and/or publicity, spammers produce synthesized reviews to promote some products or brands and to demote competitors' products or brands. Deceptive review spam has grown considerably over time. In this work, we apply supervised as well as unsupervised techniques to identify review spam. The most effective feature sets have been assembled for model building. Sentiment analysis has also been incorporated into the detection process. To obtain the best performance, several well-known classifiers were applied to the labeled dataset. For the unlabeled data, clustering is used after the desired attributes have been computed for spam detection. Additionally, there is a high chance that spam reviewers are also responsible for content pollution in multimedia social networks, because many users now post reviews using their social network logins. Finally, the work can be extended to find suspicious accounts responsible for posting fake multimedia content to the respective social networks.
Acknowledgments
The work presented in this article is partially funded by the following two projects:
1. Information Security Education & Awareness Project (Phase II), Ministry of Communications and Information Technology, Government of India, and
2. Fund for Improvement of S&T Infrastructure in Universities and Higher Educational Institutions (FIST) Program, Department of Science and Technology, Government of India.
Cite this article
Rout, J.K., Singh, S., Jena, S.K. et al. Deceptive review detection using labeled and unlabeled data. Multimed Tools Appl 76, 3187–3211 (2017). https://doi.org/10.1007/s11042-016-3819-y
DOI: https://doi.org/10.1007/s11042-016-3819-y