Multimodal Hate Speech Detection from Bengali Memes and Texts

  • Conference paper
Speech and Language Technologies for Low-Resource Languages (SPELLL 2022)

Abstract

Numerous machine learning (ML) and deep learning (DL) approaches have been proposed that use textual data from social media to analyze anti-social behavior such as cyberbullying, fake news, and hate speech, mainly for highly resourced languages such as English. However, despite their diversity and millions of native speakers, some languages such as Bengali remain under-resourced because they lack computational resources for natural language processing (NLP). As in other languages, Bengali social media content also includes images along with text (e.g., multimodal memes are posted on Facebook by embedding short texts into images). Textual data alone is therefore not enough to judge such content, since the images may provide extra context needed for a proper judgement. This paper addresses hate speech detection from multimodal Bengali memes and texts. We prepared the only multimodal hate speech dataset for this kind of problem in Bengali, which we use to train state-of-the-art neural architectures (e.g., Bi-LSTM/Conv-LSTM with word embeddings, and ConvNets combined with pre-trained language models such as monolingual Bangla BERT, multilingual BERT-cased/uncased, and XLM-RoBERTa) to jointly analyze textual and visual information for hate speech detection. The Conv-LSTM and XLM-RoBERTa models performed best for texts, yielding F1 scores of 0.78 and 0.82, respectively. For memes, the ResNet-152 and DenseNet-161 models yielded F1 scores of 0.78 and 0.79, respectively. For multimodal fusion, XLM-RoBERTa + DenseNet-161 performed best, yielding an F1 score of 0.83. Our study suggests that the text modality is the most useful for hate speech detection, while memes are moderately useful.
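The abstract does not spell out how the text and image models are combined, so the following is only an illustrative sketch of one common option, fusion by feature concatenation. It assumes that a text encoder and an image encoder have each already produced a fixed-size embedding; the names `concat_fusion` and `FusionClassifier`, the embedding sizes, and the randomly initialized logistic head are all hypothetical and are not taken from the paper.

```python
import math
import random
from typing import List, Sequence


def concat_fusion(text_vec: Sequence[float], image_vec: Sequence[float]) -> List[float]:
    """Fuse two modality embeddings by simple concatenation."""
    return list(text_vec) + list(image_vec)


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


class FusionClassifier:
    """Hypothetical logistic head over concatenated text and image features.

    The weights are random placeholders; in practice they would be learned
    jointly with (or on top of) the two encoders.
    """

    def __init__(self, text_dim: int, image_dim: int, seed: int = 0) -> None:
        rng = random.Random(seed)
        self.text_dim = text_dim
        self.image_dim = image_dim
        self.weights = [rng.uniform(-0.1, 0.1) for _ in range(text_dim + image_dim)]
        self.bias = 0.0

    def predict_proba(self, text_vec: Sequence[float], image_vec: Sequence[float]) -> float:
        """Return the probability that the (text, image) pair is hateful."""
        if len(text_vec) != self.text_dim or len(image_vec) != self.image_dim:
            raise ValueError("embedding size does not match the classifier")
        fused = concat_fusion(text_vec, image_vec)
        logit = sum(w * x for w, x in zip(self.weights, fused)) + self.bias
        return sigmoid(logit)

    def predict(self, text_vec: Sequence[float], image_vec: Sequence[float],
                threshold: float = 0.5) -> int:
        """Return 1 (hate) if the probability reaches the threshold, else 0."""
        return int(self.predict_proba(text_vec, image_vec) >= threshold)


if __name__ == "__main__":
    # Toy embeddings standing in for real encoder outputs.
    clf = FusionClassifier(text_dim=4, image_dim=3)
    text_embedding = [0.2, -0.1, 0.4, 0.0]
    image_embedding = [0.5, 0.3, -0.2]
    print(clf.predict_proba(text_embedding, image_embedding))
```

In a real pipeline the two input vectors would come from models such as those named in the abstract (for example a language model for the meme text and a ConvNet for the image), and the head would be trained on labeled data rather than left at its random initialization.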

Notes

  1. Disclaimer: the memes and lexicons contain content that is racist, sexist, homophobic, and offensive in different ways. Further, the authors want to clarify that the dataset was collected and annotated from social media for research purposes only and is not intended to hurt or offend any specific person, entity, or religious/political group or party.

  2. We observe slightly better accuracy using lemmatization, hence we report the results based on it.

  3. https://huggingface.co/sagorsarker/bangla-bert-base.

  4. A linguist, a native speaker, and an NLP researcher participated in the annotation process.

  5. https://github.com/rezacsedu/Multimodal-Hate-Bengali.

Author information

Corresponding author

Correspondence to Md. Rezaul Karim.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Karim, M.R., Dey, S.K., Islam, T., Shajalal, M., Chakravarthi, B.R. (2023). Multimodal Hate Speech Detection from Bengali Memes and Texts. In: M, A.K., et al. Speech and Language Technologies for Low-Resource Languages. SPELLL 2022. Communications in Computer and Information Science, vol 1802. Springer, Cham. https://doi.org/10.1007/978-3-031-33231-9_21

  • DOI: https://doi.org/10.1007/978-3-031-33231-9_21

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-33230-2

  • Online ISBN: 978-3-031-33231-9

  • eBook Packages: Computer Science, Computer Science (R0)
