Efficiently Detecting Web Spambots in a Temporally Annotated Sequence

Alamro, Hayam; Iliopoulos, Costas S.; Loukides, Grigorios

doi:10.1007/978-3-030-44041-1_87

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 1151)

Included in the following conference series:

International Conference on Advanced Information Networking and Applications

Abstract

Web spambots are becoming more advanced, utilizing techniques that can defeat existing spam detection algorithms. These techniques include performing a series of malicious actions with variable time delays, repeating the same series of malicious actions multiple times, and interleaving legitimate (decoy) and malicious actions. Existing methods based on string pattern matching cannot detect spambots that use these techniques. In response, we define a new problem for detecting spambots that use these techniques and propose an efficient algorithm to solve it. Given a dictionary \(\hat{S}\) of temporally annotated sequences modeling spambot actions, each associated with a time window, a long temporally annotated sequence \(T\) modeling a user action log, and parameters \(f\) and \(k\), the problem seeks to detect each sequence in \(\hat{S}\) that occurs in \(T\) at least \(f\) times within its associated time window and with at most \(k\) mismatches. Our algorithm solves the problem exactly, requires linear time and space, and employs advanced data structures together with the Kangaroo method.
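
To make the problem statement concrete, the following is a minimal brute-force sketch in Python. It is not the linear-time algorithm of the chapter, whose details are not given in the abstract. Several things here are assumptions made for illustration: actions are single-character symbols, the log is a list of (action, timestamp) pairs, "within its associated time window" is read as a bound on the time elapsed between the first and last action of each occurrence, and the names `lce`, `mismatches_at_most_k`, and `detect` are hypothetical. The mismatch check follows the general idea of the Kangaroo method (hop over matching runs with longest-common-extension queries), but the query is answered by a plain scan rather than by an indexed constant-time structure.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: one log entry is (action symbol, timestamp).
Event = Tuple[str, float]


def lce(text: str, i: int, pattern: str, j: int) -> int:
    """Length of the longest common prefix of text[i:] and pattern[j:].

    Computed by a naive scan. An indexed implementation (for example a
    suffix array with LCP and range-minimum structures) would answer the
    same query in constant time; that part is not reproduced here.
    """
    n = 0
    while (i + n < len(text) and j + n < len(pattern)
           and text[i + n] == pattern[j + n]):
        n += 1
    return n


def mismatches_at_most_k(text: str, start: int, pattern: str, k: int) -> bool:
    """Kangaroo-style check of pattern against text[start:start+len(pattern)].

    Assumes start + len(pattern) <= len(text).
    """
    pos, errors = 0, 0
    while pos < len(pattern):
        pos += lce(text, start + pos, pattern, pos)  # hop over a matching run
        if pos >= len(pattern):
            break
        errors += 1                                  # landed on a mismatch
        if errors > k:
            return False
        pos += 1                                     # step past the mismatch
    return True


def detect(log: List[Event], patterns: Dict[str, float],
           f: int, k: int) -> Dict[str, List[int]]:
    """Return each pattern with >= f occurrences, mapped to its start indices.

    `patterns` maps an action string to its time window (assumed semantics:
    maximum elapsed time from first to last action of one occurrence).
    """
    actions = "".join(a for a, _ in log)
    times = [t for _, t in log]
    hits: Dict[str, List[int]] = {}
    for pattern, window in patterns.items():
        m = len(pattern)
        if m == 0:
            continue
        starts = [
            i for i in range(len(actions) - m + 1)
            if times[i + m - 1] - times[i] <= window
            and mismatches_at_most_k(actions, i, pattern, k)
        ]
        if len(starts) >= f:
            hits[pattern] = starts
    return hits
```

For example, with the log `[("a", 0), ("b", 1), ("c", 2), ("x", 10), ("a", 11), ("b", 12), ("d", 13)]`, the pattern `"abc"` with window 3, `f = 2`, and `k = 1`, the sketch reports starts 0 and 4: the second occurrence `"abd"` has one mismatch. With `k = 0` only one occurrence remains, so the pattern is not reported.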

Author information

Authors and Affiliations

Department of Informatics, King’s College London, London, UK
Hayam Alamro, Costas S. Iliopoulos & Grigorios Loukides
Department of Information Systems, Princess Nourah bint Abdulrahman University, Riyadh, Kingdom of Saudi Arabia
Hayam Alamro

Corresponding author

Correspondence to Grigorios Loukides.

Editor information

Editors and Affiliations

Department of Information and Communication Engineering, Faculty of Information Engineering, Fukuoka Institute of Technology, Fukuoka, Japan
Leonard Barolli
Department of Electrical Engineering and Information Technology, University of Naples “Federico II”, Naples, Italy
Flora Amato
Department of Political Science, University of Campania “Luigi Vanvitelli”, Caserta, Italy
Francesco Moscato
Faculty of Business Administration, Rissho University, Tokyo, Japan
Tomoya Enokido
Department of Advanced Sciences, Faculty of Science and Engineering, Hosei University, Tokyo, Japan
Makoto Takizawa

About this paper

Cite this paper

Alamro, H., Iliopoulos, C.S., Loukides, G. (2020). Efficiently Detecting Web Spambots in a Temporally Annotated Sequence. In: Barolli, L., Amato, F., Moscato, F., Enokido, T., Takizawa, M. (eds) Advanced Information Networking and Applications. AINA 2020. Advances in Intelligent Systems and Computing, vol 1151. Springer, Cham. https://doi.org/10.1007/978-3-030-44041-1_87

DOI: https://doi.org/10.1007/978-3-030-44041-1_87
Published: 28 March 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-44040-4
Online ISBN: 978-3-030-44041-1
eBook Packages: Intelligent Technologies and Robotics (R0)
