Indonesian Journal of Electrical Engineering and Computer Science
Vol 24, No 2: November 2021

Framework of diacritic segmentation for Arabic handwritten document

Ahmed Abdalla Shiekh (Universiti Teknikal Malaysia Melaka)
Mohd Sanusi Azmi (Universiti Teknikal Malaysia Melaka)
Maslita Abd Aziz (Universiti Teknikal Malaysia Melaka)
Mohammed Nasser Al-Mhiqani (Universiti Teknikal Malaysia Melaka)
Salem Saleh Bafjaish (Universiti Teknikal Malaysia Melaka)



Article Info

Publish Date
01 Nov 2021

Abstract

In contemporary Standard Arabic and dialectal Arabic texts, diacritics and short vowels are largely absent. Exceptions are made for texts written for beginner learners of Arabic, religious texts, and significant political texts. Text without diacritics is considered ambiguous because many words that differ only in their diacritic marks appear identical. In this paper, we present a framework for segmenting diacritics from Arabic handwritten documents using a region-based segmentation technique. Arabic handwriting and the Mushaf Al-Quran contain many diacritical marks, so the diacritics must be extracted properly from an Arabic handwritten document to avoid losing useful features. The proposed framework is devised specifically to segment diacritics from Arabic handwritten images; feature extraction, feature selection, and classification are therefore not included. We present the methodology used to fulfil the objectives of this paper and explain the pre-processing phases, with particular attention to the diacritic segmentation phase, which is the focus of this article. Lastly, we describe the proposed region-based segmentation technique, which supports our development throughout the experimental process.
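
As an illustration of the region-based idea, the following minimal Python sketch labels connected ink regions in a binarised image and treats regions that are small relative to the largest region as diacritic candidates. It is not the framework's actual implementation: the function names `label_regions` and `split_diacritics`, the choice of 8-connectivity, and the `area_ratio` threshold are all assumptions made here for illustration only.

```python
from collections import deque


def label_regions(image):
    """Return the 8-connected foreground regions of a binary image (1 = ink).

    Each region is a list of (row, col) pixel coordinates.
    """
    rows, cols = len(image), len(image[0])
    labels = [[0] * cols for _ in range(rows)]
    regions = []  # regions[k] holds the pixels that carry label k + 1
    for r in range(rows):
        for c in range(cols):
            if image[r][c] != 1 or labels[r][c] != 0:
                continue
            # Grow a new region from this seed pixel with breadth-first search.
            label = len(regions) + 1
            labels[r][c] = label
            pixels = []
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and image[ny][nx] == 1
                                and labels[ny][nx] == 0):
                            labels[ny][nx] = label
                            queue.append((ny, nx))
            regions.append(pixels)
    return regions


def split_diacritics(image, area_ratio=0.3):
    """Split regions into (body, diacritic candidates) by relative area.

    area_ratio is a hypothetical threshold, not a value from the paper:
    regions smaller than area_ratio * (largest region area) are treated
    as diacritic candidates.
    """
    regions = label_regions(image)
    if not regions:
        return [], []
    cutoff = area_ratio * max(len(p) for p in regions)
    body = [p for p in regions if len(p) >= cutoff]
    diacritics = [p for p in regions if len(p) < cutoff]
    return body, diacritics


# Toy 6x8 image: one long stroke, one dot above it, and two dots below it.
IMAGE = [
    [0, 0, 0, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1, 0],
    [1, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 0, 1, 0, 0, 0],
]

body, diacritics = split_diacritics(IMAGE)
print(len(body), "body region(s),", len(diacritics), "diacritic candidate(s)")
```

On this toy image the long stroke is kept as the body and the three isolated dots are returned as diacritic candidates. A real handwritten page would first need the pre-processing phases, and a size rule alone is a simplification of whatever criteria the framework applies.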

Copyrights © 2021