Utilizing Translation to Enhance NLP Models in Offensive Language and Hate Speech Identification
Sandy Kurniawan; Indra Budi
Jurnal Improsci Vol 1 No 4 (2024): 17 February 2024
Publisher : Ann Publisher

DOI: 10.62885/improsci.v1i4.187

Abstract

The number of social media users in Indonesia has increased in recent years, and this surge has brought more offensive language to these platforms. Offensive language can trigger conflicts between users, so it needs to be identified. This study focuses on identifying offensive language, hate speech, and hate speech targets on Twitter. The data were obtained from previous research on offensive language and hate speech identification. Because the amount of training data strongly influences classification performance, this study adds data through translation. Classical machine learning algorithms (SVM and others) and deep learning algorithms (BiLSTM, CNN, and LSTM) are used as classifiers, with word n-grams and word embeddings as features. Three scenarios were run, differing in the training data used to develop the classification model. The results show that scenario 3, which uses translation for data augmentation, improves the classification model's performance by 5%.
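
The two ideas named in the abstract, translation-based data augmentation and word n-gram features, can be illustrated with a minimal sketch. This is not the authors' pipeline: the abstract does not state the translation system, the translation direction, or the n-gram range, so `toy_translate`, `TOY_LEXICON`, the sample sentences, and the default `n_max=2` are all hypothetical stand-ins chosen only for illustration.

```python
from collections import Counter
from typing import Callable, List, Tuple

Example = Tuple[str, str]  # (text, label)


def word_ngrams(text: str, n_max: int = 2) -> Counter:
    """Count word n-grams of length 1..n_max in a lowercased,
    whitespace-tokenized text."""
    tokens = text.lower().split()
    grams: Counter = Counter()
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            grams[" ".join(tokens[i:i + n])] += 1
    return grams


def augment_with_translation(
    data: List[Example], translate: Callable[[str], str]
) -> List[Example]:
    """Return the original examples followed by a translated copy of each.
    The label is carried over unchanged to the translated text."""
    augmented = list(data)
    for text, label in data:
        augmented.append((translate(text), label))
    return augmented


# Hypothetical word-for-word lexicon standing in for a real
# machine-translation system (an assumption, not from the paper).
TOY_LEXICON = {"kamu": "you", "bodoh": "stupid", "selamat": "good", "pagi": "morning"}


def toy_translate(text: str) -> str:
    """Translate token by token, leaving unknown tokens as-is."""
    return " ".join(TOY_LEXICON.get(tok, tok) for tok in text.lower().split())


# Made-up training examples for illustration only.
train = [("kamu bodoh", "offensive"), ("selamat pagi", "not_offensive")]
augmented = augment_with_translation(train, toy_translate)

# Each example, original or translated, becomes a bag of word n-grams
# that a classifier such as an SVM could consume after vectorization.
features = [(word_ngrams(text), label) for text, label in augmented]
```

In this sketch the augmented set is twice the size of the original, and every translated example inherits the label of its source text. Whether labels survive translation intact is an assumption that a real study would need to check.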