Perbandingan Metode Term Weighting terhadap Hasil Klasifikasi Teks pada Dataset Terjemahan Kitab Hadis (Comparison of a term weighting method for the text classification in Indonesian hadith)
Ana Tsalitsatun Ni'mah; Agus Zainal Arifin
Rekayasa Vol 13, No 2: August 2020
Publisher : Universitas Trunojoyo Madura

DOI: 10.21107/rekayasa.v13i2.6412

Abstract

Hadith is the second source of reference in Islam after the Qur'an. Hadith texts are now studied in the technology field so that the values they contain can be captured using technological knowledge. Retrieving information from the Books of Hadith requires representing the text as vectors in order to optimize automatic classification. Hadith classification is needed to group the contents of the Hadith into categories. Some categories in one Book of Hadith also appear in other Books of Hadith, which indicates that certain documents in one book share topics with documents in another. A term weighting method is therefore needed that can decide which words should receive high or low weights across the space of Hadith Books, in order to optimize classification results. This study compares several term weighting methods: Term Frequency Inverse Document Frequency (TF-IDF), Term Frequency Inverse Document Frequency Inverse Class Frequency (TF-IDF-ICF), Term Frequency Inverse Document Frequency Inverse Class Space Density Frequency (TF-IDF-ICSδF), and Term Frequency Inverse Document Frequency Inverse Class Space Density Frequency Inverse Hadith Space Density Frequency (TF-IDF-ICSδF-IHSδF). The weighting methods are compared on a dataset of translations of nine Books of Hadith, using Naive Bayes and SVM classifiers. The nine books are: Sahih Bukhari, Sahih Muslim, Abu Dawud, at-Turmudzi, an-Nasa'i, Ibnu Majah, Ahmad, Malik, and Darimi.
The experimental results show that classification with TF-IDF-ICSδF-IHSδF term weighting outperforms the other term weighting methods, achieving a Precision of 90%, Recall of 93%, F1-Score of 92%, and Accuracy of 83%.
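To make the first two weighting schemes named in the abstract concrete, here is a minimal sketch of TF-IDF-ICF in Python. The abstract does not give formulas, so the smoothed logarithmic forms used here (`1 + log(N / df)` and `1 + log(C / cf)`) are an assumption based on one common formulation, and the function name and toy data are hypothetical. The ICSδF and IHSδF factors are not sketched because their definitions do not appear in the abstract.

```python
import math
from collections import Counter

def tf_idf_icf(docs, labels):
    """Return one {term: weight} dict per document.

    docs   -- list of token lists
    labels -- class label of each document (same length as docs)
    Assumed formulation: tf * (1 + ln(N/df)) * (1 + ln(C/cf)).
    """
    n_docs = len(docs)
    classes = set(labels)
    n_classes = len(classes)

    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))

    # Class frequency: number of classes in which each term occurs.
    terms_by_class = {c: set() for c in classes}
    for doc, label in zip(docs, labels):
        terms_by_class[label].update(doc)
    cf = Counter(t for terms in terms_by_class.values() for t in terms)

    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({
            term: count
            * (1 + math.log(n_docs / df[term]))      # IDF factor
            * (1 + math.log(n_classes / cf[term]))   # ICF factor
            for term, count in tf.items()
        })
    return weights

# Hypothetical toy corpus: three tokenized documents in two classes.
docs = [
    ["prayer", "time", "mosque"],
    ["prayer", "fasting", "month"],
    ["trade", "market", "time"],
]
labels = ["worship", "worship", "commerce"]
weights = tf_idf_icf(docs, labels)
```

In this toy corpus, "time" occurs in both classes, so its ICF factor is 1 and it ends up with a lower weight than "prayer", which occurs in only one class.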
Autonomy Stemmer Algorithm for Legal and Illegal Affix Detection use Finite-State Automata Method
Ana Tsalitsatun Ni'mah; Dwi Ari Suryaningrum; Agus Zainal Arifin
EPI International Journal of Engineering Vol 2 No 1 (2019): Volume 2 Number 1, February 2019 with Special Issue on Composite Materials & Stru
Publisher : Center of Technology (COT), Engineering Faculty, Hasanuddin University

DOI: 10.25042/epi-ije.022019.09

Abstract

Stemming is the process of separating words from their affixes to obtain the base word. It is generally used as a preprocessing step in text-based applications. Research on Indonesian stemming falls into two types: stemming without a dictionary and stemming with a dictionary. Stemming without a dictionary sometimes removes affixes incorrectly, which results in over-stemming or under-stemming. Stemming with a dictionary takes a relatively long time and cannot remove affixes from compound words. This study proposes a new dictionary-free stemming algorithm that detects legal and illegal affixes in Indonesian using the Finite-State Automata method. The technique is a rule-based stemmer built on Indonesian morphology and implemented with regular expressions. The algorithm was tested on 118 news documents containing 15792 words. In the first test, the autonomy stemmer algorithm stemmed 10449 of the processed words correctly, an average accuracy of 66%. In the second test, it achieved an average speed of 0.0051 seconds. The third test showed that it can remove affixes from compound words.
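The idea of a dictionary-free, rule-based stemmer that rejects illegal affix combinations can be illustrated with a small Python sketch. This is not the authors' algorithm: the affix lists are deliberately tiny, the `ILLEGAL` prefix-suffix pairs are an assumption taken from combinations commonly listed as disallowed in the Indonesian stemming literature, and the function name `stem` and the `min_len` guard are hypothetical. Each regular expression corresponds to a small finite-state automaton over the characters of the word.

```python
import re

# Assumed, heavily simplified affix inventories (longer prefixes first).
PREFIX = re.compile(r"^(meng|mem|men|me|ber|ter|di|ke|se)")
SUFFIX = re.compile(r"(kan|an|i)$")
PARTICLE = re.compile(r"(lah|kah|pun|ku|mu|nya)$")

# Assumed illegal prefix-suffix pairs: when one is detected, the
# apparent suffix is treated as part of the base word.
ILLEGAL = {
    ("ber", "i"), ("di", "an"), ("ke", "i"), ("ke", "kan"),
    ("se", "i"), ("se", "kan"), ("ter", "an"),
}

def stem(word, min_len=3):
    """Strip particles and affixes without consulting a dictionary."""
    word = word.lower()

    # Step 1: remove a trailing particle or possessive, if enough remains.
    m = PARTICLE.search(word)
    if m and m.start() >= min_len:
        word = word[:m.start()]

    # Step 2: detect a candidate prefix and suffix.
    p = PREFIX.match(word)
    s = SUFFIX.search(word)
    prefix = p.group() if p else ""
    suffix = s.group() if s else ""

    # Step 3: an illegal combination means the suffix is not a real affix.
    if (prefix, suffix) in ILLEGAL:
        suffix = ""

    core = word[len(prefix): len(word) - len(suffix)]
    # Guard against over-stemming: keep the word if too little is left.
    return core if len(core) >= min_len else word

for w in ["berlari", "dibacakan", "bukunya", "membaca"]:
    print(w, "->", stem(w))
```

For "berlari", the pair ber-/-i is flagged as illegal, so the final "i" is kept and the result is "lari" rather than the over-stemmed "lar". A sketch this small will still mis-stem many words (for example, those whose first letter changes after a nasal prefix), which is the kind of over- and under-stemming the abstract describes.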
Survei Dampak Penggunaan Integrasi Berkelanjutan dalam Perusahaan Pengembangan Perangkat Lunak (Survey of the Impact of Using Continuous Integration in Software Development Companies)
Kharisma Monika Dian Pertiwi; Ana Tsalitsatun Ni’mah; Siti Rochimah
Jurnal Nasional Teknik Elektro dan Teknologi Informasi Vol 8 No 2: May 2019
Publisher : Departemen Teknik Elektro dan Teknologi Informasi, Fakultas Teknik, Universitas Gadjah Mada

Abstract

Continuous Integration (CI) is a software development technique adopted from agile methods. CI is widely used by software development companies, so research is needed to determine its impact on the software development industry. This study analyzes the impact of CI on the software being developed and on software development companies. It applies the Systematic Literature Review (SLR) research method and poses two Research Questions (RQ): (1) “What is the impact of using Continuous Integration in software development?” and (2) “What is the effect of using Continuous Integration on the company?”. The impact of CI was identified through a search of the CI literature published from 2012 to 2018, conducted on IEEE Xplore and Science Direct. The search found a total of 6,514 publications on CI. These were then screened using inclusion criteria, exclusion criteria, and a quality assessment. After screening, 14 publications were selected as meeting the specified criteria and being representative for determining the impact of CI. Of these 14, 13 were able to answer both research questions. The review shows that the use of CI in software development can have both good and bad effects on software and on software development companies.
Term Weighting Based Indexing Class and Indexing Short Document for Indonesian Thesis Title Classification
Ana Tsalitsatun Ni'mah; Fahmi Syuhada
Journal of Computer Science and Informatics Engineering (J-Cosine) Vol 6 No 2 (2022): December 2022
Publisher : Informatics Engineering Dept., Faculty of Engineering, University of Mataram

DOI: 10.29303/jcosine.v6i2.471

Abstract

Document classification has become easier now that recent methods are available to obtain strong results. Document classification using TF-IDF-ICF term weighting has been widely studied, but that research generally uses large documents. If the TF-IDF term weighting method is applied to short text documents such as thesis titles, the classification results will be less than ideal. This is because IDF assigns a low weight to words that appear in many documents, and ICF assigns a low weight to words that appear often within a class, whereas such words should carry a high weight because they form the core of a short text document. This study therefore investigates a term weighting method based on class indexing and short document indexing, namely TF-IDF-ICF-IDSF. The weighting methods are compared using Naïve Bayes and SVM classifiers. The dataset consists of thesis titles of Informatics Education students at Trunojoyo Madura University. The test results show that classification using TF-IDF-ICF-IDSF term weighting outperforms the other term weighting methods, achieving 91% Precision, 93% Recall, 86% F1-Score, and 84% Accuracy on SVM.