Cyberbullying Detection in Arabic Text Using Different Deep Learning Approaches
DOI:
https://doi.org/10.31185/wjcms.390
Keywords:
cyberbullying detection, deep learning, text mining
Abstract
The rise of social media has enabled the rapid spread of user-generated content across domains such as advertising, entertainment, politics, and economics. However, this growth has also facilitated an increase in harmful behaviors, notably cyberbullying. Addressing this issue requires advanced emotion and sentiment analysis techniques. In this study, the ArCyC (Arabic Cyberbullying Corpus) was integrated with Twitter data to develop robust models for cyberbullying detection. Several deep learning models were investigated: a Deep Neural Network (DNN), a Convolutional Neural Network (CNN), a Recurrent Neural Network (RNN), a hybrid CNN+RNN, BERT, and AraBERT. Experiments covered both the text and the emojis in the dataset, and model performance was evaluated based on accuracy. Experiments using the ArCyC dataset demonstrated that both textual and symbolic elements contributed significantly to classification accuracy. In contrast, analysis with the ArCyC dataset revealed that textual features had a more dominant influence due to the limited use of emojis. The results underscore the effectiveness of deep learning approaches in detecting cyberbullying within Arabic social media content. AraBERT obtained the highest accuracy on text, 95%, and LSTM obtained the same accuracy for both text and emojis.
License
Copyright (c) 2025 Humera Shaziya

This work is licensed under a Creative Commons Attribution 4.0 International License.