Jurnal Riset Informatika
Vol. 5 No. 3 (2023): June 2023

Clickbait Detection in Indonesia Headline News Using Indobert and Roberta

Muhammad Edo Syahputra (Bina Nusantara University)
Ade Putera Kemala (Bina Nusantara University)
Dimas Ramdhan (Bina Nusantara University)

Article Info

Publish Date
23 Jun 2023

Abstract

This paper explores clickbait detection using Transformer models, specifically IndoBERT and RoBERTa. The objective is to improve the clickbait detection accuracy of these models by applying balancing and augmentation techniques to the dataset. The study used three dataset distribution settings: unbalanced, balanced, and augmented, with 8513, 6632, and 15503 total data counts, respectively. The research demonstrates that balancing improves model performance. Data augmentation also improved the performance of RoBERTa, but it slightly decreased the performance of IndoBERT. These findings underline the importance of considering model selection and dataset characteristics when applying augmentation. Based on the results, IndoBERT with a balanced distribution outperformed the previous study and the other models used in this research. By incorporating balancing and augmentation techniques, the research surpasses previous studies and contributes to the advancement of clickbait detection accuracy, with a 95% F1-score reported with the unbalanced distribution. A limitation is that the augmentation method in this study only improved the RoBERTa model; performance might be boosted further by gathering more varied datasets. This work highlights the value of leveraging pre-trained Transformer models together with specific dataset-handling techniques. The implications include the necessity of dataset balancing for accurate detection and the varying impact of augmentation on different models. These insights help researchers and practitioners make informed decisions for clickbait detection tasks, benefiting content moderation, online user experience, and information reliability.
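
The abstract does not say which balancing technique was used. The balanced set (6632) is smaller than the unbalanced one (8513), which would be consistent with dropping examples from the larger class, but that is an assumption. Under that assumption, the sketch below illustrates random undersampling on toy data. The function name `undersample`, the `label` and `title` keys, and the label strings are hypothetical and do not come from the paper.

```python
import random
from collections import Counter, defaultdict


def undersample(rows, label_key="label", seed=0):
    """Randomly drop rows from larger classes until every class has
    as many rows as the smallest class (assumed balancing method)."""
    by_label = defaultdict(list)
    for row in rows:
        by_label[row[label_key]].append(row)

    # Size of the smallest class sets the per-class quota.
    smallest = min(len(group) for group in by_label.values())

    rng = random.Random(seed)  # fixed seed keeps the sample reproducible
    balanced = []
    for group in by_label.values():
        balanced.extend(rng.sample(group, smallest))
    rng.shuffle(balanced)
    return balanced


# Toy data only: 10 "clickbait" and 20 "non-clickbait" headlines.
headlines = [
    {"title": f"headline {i}",
     "label": "clickbait" if i % 3 == 0 else "non-clickbait"}
    for i in range(30)
]

balanced = undersample(headlines)
counts = Counter(row["label"] for row in balanced)
print(len(balanced), dict(counts))
```

On the toy data this keeps 10 headlines per class. The same idea applies to any labeled headline list before fine-tuning a classifier, at the cost of discarding some majority-class examples.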

Copyrights © 2023

Journal Info

Abbrev

jri

Publisher

Subject

Computer Science & IT

Description

Jurnal Riset Informatika is a journal published by Kresnamedia Publisher. It was originally intended to host scientific papers written by researchers and lecturers from the study programs of Information Systems and Teknik ...