Abstract
Numerous machine learning (ML) and deep learning (DL) approaches have been proposed that use textual data from social media to analyze anti-social behavior such as cyberbullying, fake news, and hate speech, mainly for highly resourced languages such as English. However, despite their diversity and millions of native speakers, some languages such as Bengali remain under-resourced because they lack computational resources for natural language processing (NLP). As in other languages, Bengali social media content also includes images along with text (e.g., multimodal memes are posted on Facebook by embedding short texts into images). Textual data alone is therefore not enough to judge such content, since the images may provide extra context needed for a proper judgement. This paper addresses hate speech detection from multimodal Bengali memes and texts. We prepared the only multimodal hate speech dataset for this kind of problem in Bengali, which we use to train state-of-the-art neural architectures (e.g., Bi-LSTM/Conv-LSTM with word embeddings, and ConvNets combined with pre-trained language models such as monolingual Bangla BERT, multilingual BERT-cased/uncased, and XLM-RoBERTa) to jointly analyze textual and visual information for hate speech detection. The Conv-LSTM and XLM-RoBERTa models performed best for texts, yielding F1 scores of 0.78 and 0.82, respectively. For memes, the ResNet-152 and DenseNet-161 models yielded F1 scores of 0.78 and 0.79, respectively. For multimodal fusion, XLM-RoBERTa + DenseNet-161 performed best, yielding an F1 score of 0.83. Our study suggests that the text modality is the most useful for hate speech detection, while memes are moderately useful.
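The results above are reported as F1 scores. As a reminder of what that metric measures, the following is a minimal sketch of a binary F1 computation. The abstract does not state whether the reported scores are binary, macro-averaged, or weighted, so the binary form here is an assumption, and the function name `f1_score` and the toy labels are purely illustrative rather than taken from the paper.

```python
def f1_score(y_true, y_pred, positive=1):
    """Binary F1: harmonic mean of precision and recall for the `positive` class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # true positives
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # false positives
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # false negatives
    if tp == 0:
        # No correct positive predictions: F1 is defined here as 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels (1 = hateful, 0 = not hateful), invented for illustration only.
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 0, 1, 0, 1, 1, 0, 1]
score = f1_score(y_true, y_pred)
print(f"F1 = {score:.2f}")
```

Because F1 balances precision and recall, it is less sensitive than plain accuracy to class imbalance, which is common in hate speech datasets.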
Notes
- 1.
Disclaimer: the memes and lexicons contain content that is racist, sexist, homophobic, and offensive in different ways. Further, the authors want to clarify that the dataset was collected and annotated from social media for research purposes only and is not intended to hurt or offend any specific person, entity, or religious/political group or party.
- 2.
We observed slightly better accuracy with lemmatization; hence, we report the results based on it.
- 3.
- 4.
A linguist, a native speaker, and an NLP researcher participated in the annotation process.
- 5.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Karim, M.R., Dey, S.K., Islam, T., Shajalal, M., Chakravarthi, B.R. (2023). Multimodal Hate Speech Detection from Bengali Memes and Texts. In: M, A.K., et al. Speech and Language Technologies for Low-Resource Languages. SPELLL 2022. Communications in Computer and Information Science, vol 1802. Springer, Cham. https://doi.org/10.1007/978-3-031-33231-9_21
DOI: https://doi.org/10.1007/978-3-031-33231-9_21
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-33230-2
Online ISBN: 978-3-031-33231-9
eBook Packages: Computer Science (R0)