DOI: 10.1145/3110025.3110091
research-article

Classification of Twitter Accounts into Automated Agents and Human Users

Published: 31 July 2017

ABSTRACT

Online social networks (OSNs) have seen a remarkable rise in the presence of surreptitious automated accounts. The massive human user base and business-supportive operating model of social networks such as Twitter facilitate the creation of automated agents. In this paper we outline a systematic methodology and train a classifier to categorise Twitter accounts into 'automated' and 'human' users. To improve classification accuracy we employ a set of novel steps. First, we divide the dataset into four popularity bands to compensate for differences in the types of accounts. Second, we create a large ground truth dataset using human annotations and extract relevant features from raw tweets. To judge the accuracy of the annotation procedure, we calculate the agreement among human annotators as well as their agreement with a bot detection research tool. We then apply a Random Forests classifier that achieves an accuracy close to human agreement. Finally, we perform tests to measure the efficacy of our results.
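
The abstract does not say which statistic is used to measure agreement among annotators. As a minimal sketch, assuming a chance-corrected measure such as Cohen's kappa for two annotators, the calculation could look like the following. The function name `cohen_kappa` and the label lists are hypothetical and invented for illustration; they are not data from the paper.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement, from each annotator's marginal label frequencies.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (p_o - p_e) / (1 - p_e)


# Hypothetical annotations, invented for illustration only.
annotator_1 = ["bot", "bot", "human", "human", "bot", "human", "human", "bot"]
annotator_2 = ["bot", "human", "human", "human", "bot", "human", "bot", "bot"]
kappa = cohen_kappa(annotator_1, annotator_2)
```

Here the two annotators agree on 6 of 8 accounts (observed agreement 0.75), while their label frequencies alone would give 0.5 agreement by chance, so kappa is 0.5. The same function could compare an annotator's labels with the output of a bot detection tool, provided both use the same label set.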

      • Published in

        ASONAM '17: Proceedings of the 2017 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining 2017
        July 2017
        698 pages
        ISBN:9781450349932
        DOI:10.1145/3110025

        Copyright © 2017 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States


        Qualifiers

        • research-article
        • Research
        • Refereed limited

        Acceptance Rates

        Overall Acceptance Rate: 116 of 549 submissions, 21%
