Seminar Nasional Teknologi Informasi Komunikasi dan Industri
2011: SNTIKI 3

Pattern-Based Stemmer Analysis and Implementation on Arabic Text

Ananda Wulandari (Fakultas Informatika Institut Teknologi Telkom, Bandung)
Kemas Rahmat S.W (Fakultas Informatika Institut Teknologi Telkom, Bandung)
Ade Romadhony (Fakultas Informatika Institut Teknologi Telkom, Bandung)

Article Info

Publish Date
12 Oct 2011


The pattern-based stemmer is a search algorithm that finds the stem of an Arabic word by combining morphological analysis with affix removal. In this research, once stemming is complete, the word class is determined in three steps. First, the system matches the input word against the fixed words stored in the system. If the word is not found, word class determination rules based on prefix, suffix, and infix are applied. If the system still cannot determine the word class after this second step, the class is determined from the word's position in the sentence. Testing was carried out to measure how the number of tokens, patterns, and rules in the system affects its performance. The test data are the 37 surahs of the 30th juz of the Al-Qur'an, grouped into three categories by the number of rows in each surah: long, medium, and short. According to the test results, the best performance is obtained by storing more affix-free patterns, storing more word class determination rules, and adding an affix elimination check to the system.

Keywords: Arabic text, stemming, stem, word class.
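
The three-step word class determination described in the abstract can be sketched as a simple fallback cascade. This is a minimal illustration, not the authors' implementation: the function name `word_class`, the tables `PREPOSITIONS`, `FIXED_WORDS`, `PREFIX_RULES`, and `SUFFIX_RULES`, and every rule in them are assumptions chosen for the example, and infix rules are left out for brevity.

```python
from typing import List

# All tables below are hypothetical toy data, not the paper's rule set.
PREPOSITIONS = {"في", "من", "على", "إلى"}

# Step 1 data: fixed words stored with a known class.
FIXED_WORDS = {p: "particle" for p in PREPOSITIONS}

# Step 2 data: rough affix heuristics (assumed for illustration only).
PREFIX_RULES = {"ال": "noun"}              # definite article
SUFFIX_RULES = {"ة": "noun", "وا": "verb"}  # ta marbuta; plural verb ending


def word_class(tokens: List[str], i: int) -> str:
    """Return a word class for tokens[i] using a three-step fallback."""
    word = tokens[i]

    # Step 1: exact match against the stored fixed words.
    if word in FIXED_WORDS:
        return FIXED_WORDS[word]

    # Step 2: affix-based rules (prefix first, then suffix).
    for prefix, cls in PREFIX_RULES.items():
        if word.startswith(prefix) and len(word) > len(prefix):
            return cls
    for suffix, cls in SUFFIX_RULES.items():
        if word.endswith(suffix) and len(word) > len(suffix):
            return cls

    # Step 3: position in the sentence. Assumed rule: a word that
    # directly follows a preposition is treated as a noun.
    if i > 0 and tokens[i - 1] in PREPOSITIONS:
        return "noun"

    return "unknown"


if __name__ == "__main__":
    sentence = ["في", "بيت"]
    for idx, token in enumerate(sentence):
        print(token, word_class(sentence, idx))
```

Ordering the steps from most to least specific means the position rule only fires when neither the lookup nor the affix rules give an answer, which matches the fallback order stated in the abstract.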

Copyrights © 2011

Journal Info





Computer Science & IT; Control & Systems Engineering; Electrical & Electronics Engineering; Industrial & Manufacturing Engineering; Mathematics


SNTIKI is the Seminar Nasional Teknologi Informasi, Komunikasi dan Industri (National Seminar on Information Technology, Communication and Industry), held annually by the Fakultas Sains dan Teknologi (Faculty of Science and Technology) of Universitas Islam Negeri Sultan Syarif Kasim Riau. ISSN 2579 7271 (Print) | ISSN 2579 5406 ...