Offensive Language Detection on Social Media Using Three Language Models and Three Datasets
Abstract
Offensive posts are increasingly common on social media. Such posts are harmful and should be treated seriously. The most efficient way to detect them is to fine-tune a large language model (LLM) on an offensive language dataset. In this research, we focus on maximizing the capacity of LLMs for offensive language detection on social media. We select three LLMs with different attributes (DeepMoji, BERT, and HateBERT) and three offensive language datasets (OLID, Curious Cat, and Ask FM), and examine how to configure the models and datasets to achieve the best performance. Experimental results show that simply fine-tuning an LLM on more data does not always yield the best performance. Combining LLMs was effective, especially the combination of DeepMoji and HateBERT.
References
Aditya Gaydhani, Vikrant Doma, Shrikant Kendre, and Laxmi Bhagwat. Detecting hate speech and offensive language on twitter using machine learning: An n-gram and tfidf based approach. arXiv preprint arXiv:1809.08651, 2018.
Fatemah Husain. Arabic offensive language detection using machine learning and ensemble machine learning approaches. arXiv preprint arXiv:2005.08946, 2020.
Ji Ho Park and Pascale Fung. One-step and two-step classification for abusive language detection on Twitter. In Proceedings of the First Workshop on Abusive Language Online, pages 41–45, 2017.
Pinkesh Badjatiya, Shashank Gupta, Manish Gupta, and Vasudeva Varma. Deep learning for hate speech detection in tweets. In Proceedings of the 26th international conference on World Wide Web companion, pages 759–760, 2017.
Konthala Yasaswini, Karthik Puranik, Adeep Hande, Ruba Priyadharshini, Sajeetha Thavareesan, and Bharathi Raja Chakravarthi. IIITT@DravidianLangTech-EACL2021: Transfer Learning for Offensive Language Detection in Dravidian Languages. In Proceedings of the First Workshop on Speech and Language Technologies for Dravidian Languages, pages 187–194, 2021.
Marzieh Mozafari, Reza Farahbakhsh, and Noel Crespi. Cross-Lingual Few-Shot Hate Speech and Offensive Language Detection Using Meta Learning. IEEE Access, 10:14880–14896, 2022.
Pradeep Kumar Roy, Snehaan Bhawal, and Chinnaudayar Navaneethakrishnan Subalalitha. Hate speech and offensive language detection in Dravidian languages using deep ensemble framework. Computer Speech & Language, 75:101386, 2022.
Camilla Casula and Sara Tonelli. Generation-Based Data Augmentation for Offensive Language Detection: Is It Worth It? In Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics, pages 3359–3377, 2023.
Bjarke Felbo, Alan Mislove, Anders Sogaard, Iyad Rahwan, and Sune Lehmann. Using millions of emoji occurrences to learn any-domain representations for detecting sentiment, emotion and sarcasm. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pages 1615–1625, 2017.
Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. Attention is all you need, 2017.
Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 4171–4186, 2019.
Tommaso Caselli, Valerio Basile, Jelena Mitrović, and Michael Granitzer. HateBERT: Retraining BERT for Abusive Language Detection in English. In Proceedings of the 5th Workshop on Online Abuse and Harms (WOAH 2021), pages 17–25, 2021.
Marcos Zampieri, Shervin Malmasi, Preslav Nakov, Sara Rosenthal, Noura Farra, and Ritesh Kumar. SemEval-2019 Task 6: Identifying and Categorizing Offensive Language in Social Media (OffensEval). In Proceedings of the 13th International Workshop on Semantic Evaluation, pages 75–86, 2019.
Niloofar Safi Samghabadi, Afsheen Hatami, Mahsa Shafaei, Sudipta Kar, and Thamar Solorio. Attending the Emotions to Detect Online Abusive Language. In Proceedings of the Fourth Workshop on Online Abuse and Harms, pages 79–88. Association for Computational Linguistics, 2020.
Niloofar Safi Samghabadi, Suraj Maharjan, Alan Sprague, Raquel Diaz-Sprague, and Thamar Solorio. Detecting Nastiness in Social Media. In Proceedings of the First Workshop on Abusive Language Online, pages 63–72, 2017.
Yujia Qin, Yankai Lin, Jing Yi, Jiajie Zhang, Xu Han, Zhengyan Zhang, Yusheng Su, Zhiyuan Liu, Peng Li, Maosong Sun, and Jie Zhou. Knowledge Inheritance for Pre-trained Language Models. In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 3921–3937, 2022.
Nils Reimers and Iryna Gurevych. Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks. arXiv preprint arXiv:1908.10084, 2019.