Identification and Classification of Cyberbullying Posts: A Recurrent Neural Network Approach using Under-sampling and Class Weighting

Publisher:
Springer International Publishing
Publication Type:
Conference Proceeding
Citation:
Communications in Computer and Information Science, 2020, 1333, pp. 113-120
Issue Date:
2020-11-18
© 2020, Springer Nature Switzerland AG. As the number of social media and web platform users has grown steadily in recent years, cyberbullying has become a ubiquitous problem on the internet. Manually moderating these platforms for online abuse and cyberbullying has become very challenging. This paper proposes a Recurrent Neural Network (RNN) based approach for the identification and classification of cyberbullying posts. Because the input data is highly imbalanced, a Tomek Links approach is used for under-sampling, which reduces the data imbalance and removes ambiguities in class labelling. The proposed classification model combines max-pooling with a bi-directional Long Short-Term Memory (LSTM) network and attention layers. The model is evaluated on Wikipedia datasets to establish its effectiveness in identifying and classifying cyberbullying posts. Extensive experimental results show that the approach performs well in comparison to competing approaches, achieving precision, recall, and F1 score of 0.89, 0.86, and 0.88, respectively.
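The abstract names Tomek Links under-sampling and the title names class weighting, but neither gives implementation details. The following is a minimal sketch of both ideas in plain Python, not the authors' code. It assumes Euclidean distance over numeric feature vectors, removes only the majority-class member of each Tomek link (a common convention), and uses the "balanced" weighting heuristic n_samples / (n_classes * class_count). The function names and the toy data are hypothetical.

```python
from collections import Counter
from math import dist


def tomek_link_indices(points, labels):
    """Return the set of indices that take part in a Tomek link.

    Two samples form a Tomek link when they are each other's nearest
    neighbour and carry different labels.
    """
    def nearest(i):
        # Index of the closest other sample (Euclidean distance, an assumption).
        return min((j for j in range(len(points)) if j != i),
                   key=lambda j: dist(points[i], points[j]))

    nn = [nearest(i) for i in range(len(points))]
    linked = set()
    for i, j in enumerate(nn):
        if nn[j] == i and labels[i] != labels[j]:
            linked.update((i, j))
    return linked


def undersample_majority(points, labels):
    """Drop majority-class samples that belong to a Tomek link."""
    majority = Counter(labels).most_common(1)[0][0]
    drop = {i for i in tomek_link_indices(points, labels) if labels[i] == majority}
    keep = [i for i in range(len(points)) if i not in drop]
    return [points[i] for i in keep], [labels[i] for i in keep]


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n = len(labels)
    return {c: n / (len(counts) * k) for c, k in counts.items()}


# Toy 2-D features standing in for post embeddings (hypothetical data).
# Label 0 = non-bullying (majority), label 1 = bullying (minority).
X = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (1.1, 1.0),
     (5.0, 5.0), (5.2, 5.0), (3.0, 3.0)]
y = [0, 0, 0, 0, 0, 1, 1]

X_res, y_res = undersample_majority(X, y)
weights = balanced_class_weights(y_res)
print(len(X), "->", len(X_res), "samples; class weights:", weights)
```

In this toy set, the majority sample at (5.0, 5.0) and the minority sample at (5.2, 5.0) are mutual nearest neighbours with different labels, so the majority sample is removed. The resulting weights could then be passed to a classifier's loss so that errors on the minority class count more heavily.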