ABSTRACT
The identification of Hate Speech in social media is of great importance and receives much attention in the text classification community. There is a strong demand for research on languages other than English. The HASOC track intends to stimulate the development of Hate Speech identification for Hindi, German and English. Three datasets were developed from Twitter and Facebook and made available. Binary classification and more fine-grained subclasses were offered in three subtasks. Across all subtasks, 321 experiments were submitted. The approaches used most often were LSTM networks processing word embedding input. The best systems for the identification of Hate Speech in English, Hindi, and German achieved Macro-F1 scores of 0.78, 0.81 and 0.61, respectively.
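The scores above are macro-averaged F1 values: F1 is computed separately for each class and the per-class scores are averaged without weighting, so a rare class counts as much as a frequent one. The following is a minimal sketch of that computation, not the track's official scorer; the function name `macro_f1`, the label strings `HOF` and `NOT`, and the toy data are assumptions made for illustration only.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Return the unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    if not labels:
        raise ValueError("macro_f1 needs at least one label")

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class was wrong
            fn[truth] += 1   # true class was missed

    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical binary labels (illustrative only, not taken from the datasets)
y_true = ["HOF", "HOF", "NOT", "NOT", "NOT", "NOT"]
y_pred = ["HOF", "NOT", "NOT", "NOT", "NOT", "HOF"]
print(macro_f1(y_true, y_pred))
```

In this toy example the minority class has an F1 of 0.5 and the majority class 0.75, so the macro average sits midway between them rather than being dominated by the larger class.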