Multimodal Hate Speech Detection from Bengali Memes and Texts

Karim, Md. Rezaul; Dey, Sumon Kanti; Islam, Tanhim; Shajalal, Md; Chakravarthi, Bharathi Raja

doi:10.1007/978-3-031-33231-9_21


Part of the book series: Communications in Computer and Information Science (CCIS, volume 1802)

Included in the following conference series:

International Conference on Speech and Language Technologies for Low-resource Languages

Abstract

Numerous machine learning (ML) and deep learning (DL)-based approaches have been proposed that use textual data from social media for the analysis of anti-social behavior, such as cyberbullying, fake news detection, and hate speech identification, mainly for highly resourced languages such as English. However, despite their great diversity and millions of native speakers, some languages like Bengali are under-resourced, owing to a lack of computational resources for natural language processing (NLP). As in other languages, Bengali social media content also includes images along with text (e.g., multimodal memes are posted on Facebook by embedding short texts into images). Textual data alone is therefore not enough to judge such content, since the images may provide extra context needed for a proper judgement. This paper is about hate speech detection from multimodal Bengali memes and texts. We prepared the only multimodal hate speech dataset of its kind for Bengali, which we use to train state-of-the-art neural architectures (e.g., Bi-LSTM/Conv-LSTM with word embeddings, and ConvNets combined with pre-trained language models such as monolingual Bangla BERT, multilingual BERT-cased/uncased, and XLM-RoBERTa) to jointly analyze textual and visual information for hate speech detection. The Conv-LSTM and XLM-RoBERTa models performed best for texts, yielding F1 scores of 0.78 and 0.82, respectively. For memes, the ResNet-152 and DenseNet-161 models yield F1 scores of 0.78 and 0.79, respectively. For multimodal fusion, XLM-RoBERTa + DenseNet-161 performed best, yielding an F1 score of 0.83. Our study suggests that the text modality is the most useful for hate speech detection, while memes are moderately useful.
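To make the idea of jointly analyzing text and image features concrete, the following is a minimal sketch of one common fusion strategy: concatenating a text feature vector with an image feature vector and passing the result through a linear classifier, followed by a binary F1 computation. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the fusion mechanism or the F1 averaging scheme, and the function names (`fuse`, `predict_hate`, `f1_score`), the toy feature vectors, and the weights are all hypothetical. In practice the vectors would come from encoders such as XLM-RoBERTa and DenseNet-161.

```python
import math
from typing import List, Sequence


def fuse(text_vec: Sequence[float], image_vec: Sequence[float]) -> List[float]:
    """Concatenate text and image features into one joint vector (assumed fusion)."""
    return list(text_vec) + list(image_vec)


def predict_hate(features: Sequence[float], weights: Sequence[float], bias: float) -> float:
    """Linear layer plus sigmoid: probability that the input is hateful."""
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    z = sum(f * w for f, w in zip(features, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Binary F1 for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Toy stand-ins for encoder outputs (hypothetical values).
    text_vec = [0.2, -0.1, 0.7]
    image_vec = [0.5, 0.3]
    joint = fuse(text_vec, image_vec)

    # Hypothetical, untrained classifier parameters.
    weights = [1.0, 0.5, 2.0, -0.5, 0.25]
    prob = predict_hate(joint, weights, bias=-0.5)
    print(f"joint dimension: {len(joint)}, P(hate) = {prob:.3f}")

    # F1 on a toy set of gold labels and predictions.
    print(f"F1 = {f1_score([1, 0, 1, 1, 0], [1, 0, 0, 1, 1]):.3f}")
```

Concatenation is only one option. Other fusion schemes, such as averaging per-modality predictions, would change `fuse` and `predict_hate` but leave the evaluation step unchanged.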

Notes

1. Disclaimer: memes and lexicons contain content that is racist, sexist, homophobic, and offensive in different ways. Further, the authors want to clarify that the dataset was collected and annotated from social media for research purposes only and is not intended to hurt or offend any specific person, entity, or religious/political group or party.
2. We observe slightly better accuracy using lemmatization; hence we report the results based on it.
3. https://huggingface.co/sagorsarker/bangla-bert-base.
4. A linguist, a native speaker, and an NLP researcher participated in the annotation process.
5. https://github.com/rezacsedu/Multimodal-Hate-Bengali.

Author information

Authors and Affiliations

Fraunhofer Institute for Applied Information Technology FIT, Sankt Augustin, Germany
Md. Rezaul Karim & Md Shajalal
RWTH Aachen University, Aachen, Germany
Md. Rezaul Karim & Tanhim Islam
Noakhali Science and Technology University, Noakhali, Bangladesh
Sumon Kanti Dey
University of Siegen, Siegen, Germany
Md Shajalal
University of Galway, Galway, Ireland
Bharathi Raja Chakravarthi

Corresponding author

Correspondence to Md. Rezaul Karim.

Editor information

Editors and Affiliations

National Institute of Technology Karnataka, Mangalore, India
Anand Kumar M
National University of Ireland, Galway, Ireland
Bharathi Raja Chakravarthi
Sri Sivasubramaniya Nadar College of Engineering, Kalavakkam, India
Bharathi B
National University of Ireland, Galway, Ireland
Colm O’Riordan
Indian Institute of Technology Madras, Chennai, India
Hema Murthy
Sri Sivasubramaniya Nadar College of Engineering, Kalavakkam, India
Thenmozhi Durairaj
University of Hildesheim, Hildesheim, Germany
Thomas Mandl

About this paper

Cite this paper

Karim, M.R., Dey, S.K., Islam, T., Shajalal, M., Chakravarthi, B.R. (2023). Multimodal Hate Speech Detection from Bengali Memes and Texts. In: M, A.K., et al. Speech and Language Technologies for Low-Resource Languages. SPELLL 2022. Communications in Computer and Information Science, vol 1802. Springer, Cham. https://doi.org/10.1007/978-3-031-33231-9_21

DOI: https://doi.org/10.1007/978-3-031-33231-9_21
Published: 29 May 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-33230-2
Online ISBN: 978-3-031-33231-9
eBook Packages: Computer Science, Computer Science (R0)
