Journal : Indonesian Journal on Computing (Indo-JC)

Analysis of the Commutative Method Approach on English Thesaurus for Developing Synonym Sets
Arini Rohmawati; Moch. Arif Bijaksana; Kemas Muslim Lhaksmana
Indonesia Journal on Computing (Indo-JC) Vol. 4 No. 2 (2019): September, 2019
Publisher : School of Computing, Telkom University

DOI: 10.34818/INDOJC.2019.4.2.332

Abstract

WordNet is a lexical database for languages. Unlike general dictionaries, WordNet focuses on synonyms. Its main unit is the synonym set (synset), a set of one or more words that share the same meaning and can replace one another in certain contexts, which makes synsets an essential element in implementing WordNet. This paper analyzes the synonym extraction process using a commutative approach, with test data of 51 word entries taken from the Oxford Paperback Thesaurus. The commutative method shares a key characteristic with synonym sets: members of a synonym set can replace each other in certain contexts. The test data are processed from extraction through to performance evaluation using the F1 score. The system generates synonym sets that match the manual extraction; the F1 score between the program's output and the Princeton synonym sets is 10%.
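
The abstract does not spell out the commutative procedure, so the following is only a minimal sketch of one possible reading: two words are paired when each lists the other as a synonym, and the resulting pairs are scored against a reference set with F1. The toy thesaurus, the gold pairs, and the function names `commutative_pairs` and `f1_score` are all hypothetical and are not taken from the paper.

```python
def commutative_pairs(thesaurus):
    """Return unordered pairs {a, b} where a lists b AND b lists a.

    `thesaurus` maps a headword to the list of synonyms given in its entry.
    This mutual-listing rule is an assumed reading of the commutative idea.
    """
    pairs = set()
    for word, synonyms in thesaurus.items():
        for candidate in synonyms:
            if word != candidate and word in thesaurus.get(candidate, ()):
                pairs.add(frozenset((word, candidate)))
    return pairs


def f1_score(predicted, gold):
    """Harmonic mean of precision and recall over two sets of pairs."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data, not entries from the Oxford Paperback Thesaurus.
toy_thesaurus = {
    "big": ["large", "huge"],
    "large": ["big"],
    "huge": ["enormous"],
    "enormous": ["huge"],
}
predicted = commutative_pairs(toy_thesaurus)

# Hypothetical reference pairs standing in for a gold standard.
gold = {frozenset(("big", "large")), frozenset(("big", "huge"))}
score = f1_score(predicted, gold)
```

In this toy case "big" lists "huge" but "huge" does not list "big", so that pair is rejected. This shows how a strict mutual-listing rule can lower recall against a reference that accepts one-directional entries.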
Entity Recognition for Quran English Version with Supervised Learning Approach
Muhammad Aris Maulana; Moch. Arif Bijaksana; Arief Fatchul Huda
Indonesia Journal on Computing (Indo-JC) Vol. 4 No. 3 (2019): December, 2019
Publisher : School of Computing, Telkom University

DOI: 10.34818/INDOJC.2019.4.3.362

Abstract

The Quran is the Muslim holy book, consisting of 6236 ayat (verses) divided into 114 surahs (chapters). Many entities are scattered across the verses of each chapter. Without a classification process, finding a particular entity is difficult for a reader, which in turn makes the Quran harder to understand. A system can be modeled to extract information about entities in the Quran to address this problem. We therefore propose a method to identify and classify entities using entity recognition. The system uses a support vector machine (SVM), which is given various entities from the Quran as input so that it can identify entities correctly. The dataset, obtained from the website tanzil.net, consists of 19,473 tokens and 720 entities. The classification scenario using a linear kernel with unigram features produces the highest F-measure, 0.75.
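
The abstract names unigram features as the input to a linear-kernel SVM but gives no implementation details. The sketch below illustrates only the feature side, under the assumption that a unigram feature is a one-hot encoding of the lowercased token. The sample tokens, the label names, and the helpers `build_vocabulary` and `unigram_vector` are hypothetical, and the SVM training step is deliberately left out.

```python
def build_vocabulary(tokens):
    """Assign each distinct lowercased token a column index, in first-seen order."""
    vocab = {}
    for token in tokens:
        key = token.lower()
        if key not in vocab:
            vocab[key] = len(vocab)
    return vocab


def unigram_vector(token, vocab):
    """One-hot vector for a single token; all zeros if the token is unseen."""
    vector = [0] * len(vocab)
    index = vocab.get(token.lower())
    if index is not None:
        vector[index] = 1
    return vector


# Hypothetical sample, not drawn from the tanzil.net dataset.
tokens = ["Moses", "said", "to", "his", "people"]
labels = ["PERSON", "O", "O", "O", "O"]  # assumed tag scheme, one label per token

vocab = build_vocabulary(tokens)
features = [unigram_vector(token, vocab) for token in tokens]
```

Each row of `features`, paired with the label at the same position, is the kind of input a linear classifier such as an SVM could be trained on.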