Articles

Found 6 Documents

Sequential Topic Modelling: A Case Study on Indonesian LGBT Conversation on Twitter Arsy Arslina; Muhaza Liebenlito
InPrime: Indonesian Journal of Pure and Applied Mathematics Vol 1, No 1 (2019)
Publisher : Department of Mathematics, Faculty of Sciences and Technology, UIN Syarif Hidayatullah

DOI: 10.15408/inprime.v1i1.12726

Abstract

As the country with the largest Muslim population in the world, Indonesia has long treated Lesbian, Gay, Bisexual, and Transgender (LGBT) issues as a sensitive topic that continues to attract research. Social media such as Twitter is one of the main venues where people discuss LGBT topics. In this paper, we collect 18,552 tweets dated from 2015 to 2018 to analyze the dynamics of the LGBT conversation among Indonesians. We explore the main topics of this conversation using Latent Dirichlet Allocation (LDA), one of the most popular soft clustering methods. The technique identifies latent (hidden) topic information in a large document collection using a bag-of-words approach: each document is treated as a vector of word counts and represented as a probability distribution over topics, while each topic is represented as a probability distribution over words. The results show seven main categories that people discuss regarding LGBT: politics, religion, government, ethics, nationality, culture, and technology. The topic probability distributions computed for each semester are generally homogeneous. The exception is the government election period, when the politics category has a significantly higher probability. In other words, we find a tendency for LGBT issues to be used in Indonesian politics.

Keywords: LGBT; politics; topic modeling; Twitter.
Classification of Tuberculosis and Pneumonia in Human Lung Based on Chest X-Ray Image using Convolutional Neural Network Muhaza Liebenlito; Yanne Irene; Abdul Hamid
InPrime: Indonesian Journal of Pure and Applied Mathematics Vol 2, No 1 (2020)
Publisher : Department of Mathematics, Faculty of Sciences and Technology, UIN Syarif Hidayatullah

DOI: 10.15408/inprime.v2i1.14545

Abstract

In this paper, we use chest X-ray images to diagnose tuberculosis and pneumonia with a convolutional neural network model. The labeled data comprise 4273 pneumonia images, 1989 normal images, and 394 tuberculosis images, divided into 80% for the training set and 20% for the testing set. All images go through three preprocessing steps: resizing, conversion from RGB to grayscale, and Gaussian standardization. On the training set, undersampling and oversampling are applied to balance the classes. The best model is chosen based on the Area under Curve (AUC), i.e. the area under the Receiver Operating Characteristic curve. The best model is obtained when training on the oversampled training set, with an AUC of 0.99 for tuberculosis and 0.98 for pneumonia. This model correctly identifies 86% of tuberculosis cases and 96% of pneumonia cases.

Keywords: chest X-ray images; tuberculosis; pneumonia; convolutional neural network.
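The oversampling step mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say which oversampling variant was used, so the sketch assumes plain random oversampling (duplicating minority-class samples at random). The function name `random_oversample`, the toy class counts, and the use of integer ids in place of images are all hypothetical.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen samples of the smaller classes until
    every class has as many samples as the largest class."""
    rng = random.Random(seed)

    # Group the samples by class label.
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)

    # Every class is grown to the size of the largest one.
    target = max(len(items) for items in by_class.values())

    out_samples, out_labels = [], []
    for label, items in by_class.items():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for sample in items + extra:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels


# Toy stand-in for an imbalanced training set (hypothetical counts;
# integer ids are used here instead of image arrays).
labels = ["pneumonia"] * 8 + ["normal"] * 4 + ["tuberculosis"] * 1
samples = list(range(len(labels)))

balanced_samples, balanced_labels = random_oversample(samples, labels)
print(Counter(balanced_labels))
```

In a setup like the one described, resampling would be applied to the training split only, so that the test set keeps its original class proportions.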
Deteksi Kepribadian MBTI pada Diskusi Agama Islam di Twitter Indonesia 2009-2019 [MBTI Personality Detection in Islamic Religious Discussions on Indonesian Twitter, 2009-2019] Nanira Annisa Fitri; Taufik Edy Sutanto; Muhaza Liebenlito
Indonesian Journal of Computer Science (IJCS) Vol. 12 No. 5 (2023)
Publisher : STMIK Indonesia


Abstract

Personality is a person's behavior and way of thinking that can influence others when interacting in various situations. One popular personality test is the Myers-Briggs Type Indicator (MBTI), which classifies a person's personality along four dimensions: Introvert-Extrovert, Sensing-Intuitive, Thinking-Feeling, and Judging-Perceiving. Most previous research on MBTI prediction has used English-language datasets. This study analyzes MBTI prediction for Twitter users using Indonesian-language training data obtained through translation by an artificial intelligence model. Four classification models, namely Random Forest, Support Vector Machine (SVM), Logistic Regression, and Bernoulli Naive Bayes, are used to analyze the effectiveness of MBTI prediction on this data. The experimental results show that the SVM classifier achieves the highest accuracy: 90% for I-E, 93% for N-S, 88% for T-F, and 84% for J-P. The proposed model has the potential to predict MBTI quickly and accurately on large Indonesian-language datasets.
Active learning on Indonesian Twitter sentiment analysis using uncertainty sampling Muhaza Liebenlito; Nur Inayah; Esti Choerunnisa; Taufik Edy Sutanto; Suma Inna
Journal of Applied Data Sciences Vol 5, No 1: JANUARY 2024
Publisher : Bright Publisher

DOI: 10.47738/jads.v5i1.144

Abstract

Nowadays, sentiment analysis research in social media is rapidly developing. Sentiment analysis typically falls under supervised learning, which requires annotating data. However, the annotation process for sentiment analysis tasks is notoriously time-consuming. Fortunately, an effective strategy to overcome this challenge has emerged, known as active learning. Active learning involves labeling only a small subset of the dataset, leaving the rest for annotation through sampling strategies. This study focuses on comparing two active learning strategies: random sampling and boundary sampling. These strategies are applied to machine learning models such as logistic regression and random forests. In addition, we present an evaluation of the model performance and data savings achieved by implementing these strategies in the context of traditional machine learning for sentiment analysis on Twitter. The dataset considered consists of two labels: positive and negative sentiments. The results of our investigation show that active learning can significantly reduce the amount of training data required, saving up to 65% of the total training data required to achieve peak model accuracy. The most successful model identified uses a random forest with a margin sampling strategy, yielding an accuracy of 81.12% and an F1 score of 88.60%. This research highlights the effectiveness of active learning strategies in sentiment analysis, demonstrating their potential to improve model performance and resource efficiency. The results underscore the viability of employing active learning methods, particularly the combination of random forest models with margin sampling, for more efficient sentiment analysis in social media.
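The margin sampling strategy named in the abstract above can be sketched as follows. This is a generic illustration of the technique, not the authors' code: it assumes a classifier has already produced class probabilities for each unlabeled tweet, and it picks the items whose two most likely classes are closest in probability. The names `margin_scores`, `select_for_labeling`, and `pool_probabilities`, and the probability values themselves, are hypothetical.

```python
def margin_scores(probabilities):
    """Return, for each item, the gap between its two highest class
    probabilities. A small gap means the model is uncertain."""
    scores = []
    for probs in probabilities:
        top_two = sorted(probs, reverse=True)[:2]
        scores.append(top_two[0] - top_two[1])
    return scores


def select_for_labeling(probabilities, batch_size):
    """Return the indices of the batch_size items with the smallest margin."""
    scores = margin_scores(probabilities)
    ranked = sorted(range(len(scores)), key=lambda i: scores[i])
    return ranked[:batch_size]


# Hypothetical predicted probabilities (negative, positive) for four
# unlabeled tweets in the pool.
pool_probabilities = [
    [0.95, 0.05],  # confident
    [0.55, 0.45],  # uncertain
    [0.20, 0.80],  # fairly confident
    [0.51, 0.49],  # most uncertain
]

query_indices = select_for_labeling(pool_probabilities, batch_size=2)
print(query_indices)  # [3, 1]
```

In an active learning loop, the selected items would be sent for annotation, added to the labeled set, and the model retrained before the next selection round.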
Solusi Model Navier Stokes Korteweg dengan Syarat Batas Slip di Half-Space Berdimensi 3 [Solution of the Navier-Stokes-Korteweg Model with Slip Boundary Conditions in the 3-Dimensional Half-Space] Anisa Salsabila; Suma Inna; Muhaza Liebenlito; Rahmi Purnomowati
MAJAMATH: Jurnal Matematika dan Pendidikan Matematika Vol. 7 No. 1 (March 2024)
Publisher : Mathematics Education Study Program, Universitas Islam Majapahit (UNIM), Mojokerto, Indonesia

DOI: 10.36815/majamath.v7i1.3158

Abstract

The Navier-Stokes-Korteweg model describes compressible fluid flow while taking into account the capillarity constant ( ), also known as the capillary constant. This study aims to prove the existence of a solution operator for the resolvent system of the Navier-Stokes-Korteweg model with slip boundary conditions in the 3-dimensional half-space, in particular for the case of the coefficients  and . The solution operator is sought in several steps: the inhomogeneous resolvent system is first reduced, and a partial Fourier transform is then applied to the homogeneous resolvent system to obtain simpler ordinary differential equations. As a result, the solution operator  is obtained.
Deteksi Clickbait pada Judul Berita Online Berbahasa Indonesia Menggunakan FastText [Clickbait Detection in Indonesian-Language Online News Headlines Using FastText] Muhaza Liebenlito; Arlianis Arum Yesinta; Muhamad Irvan Septiar Musti
Journal of Applied Computer Science and Technology Vol 5 No 1 (2024): June 2024: Article in Progress
Publisher : Indonesian Society of Applied Science

DOI: 10.52158/jacost.v5i1.655

Abstract

The growing number of people accessing news portals has created intense competition among online media to attract readers and visitors and maximize revenue, which has driven the rise of clickbait. Clickbait can lower the quality of the news itself, and it can also misinform readers about the news content, a problem also known as fake news. It is therefore necessary to detect news titles that contain clickbait. This study aims to obtain an optimal clickbait news title classification model using FastText. The model is optimized by cleaning the data and tuning the model's hyperparameters. It was trained on 9600 training examples collected from Indonesian online news. The best model obtained in this study achieves an accuracy of 77% and an F1-score of 69%.