Indonesian Journal of Electrical Engineering and Computer Science
Vol 15, No 3: September 2019

Pixel-wise classification using support vector machine for binarization of degraded historical document image

Fauziah Kasmin (Universiti Teknikal Malaysia Melaka)
Zuraini Othman (Universiti Teknikal Malaysia Melaka)
Sharifah Sakinah Syed Ahmad (Universiti Teknikal Malaysia Melaka)



Article Info

Publish Date
01 Sep 2019

Abstract

Binarization of historical documents is increasingly important because digital archiving has become the preferred solution for storing and retrieving valuable archives. However, degradation of the documents makes the process more challenging. This paper describes a learning-based method for the binarization of historical documents, with a support vector machine (SVM) as the classifier. A model was trained on a set of images with the help of their ground truth images, and was then applied to test images to classify each pixel as text or non-text. Grey level values and RGB values were each used as the descriptor for a pixel, taken from the intensities of the pixel's local neighbourhood, and the two descriptors were compared. The standard datasets HDIBCO2014, DIBCO2012 and DIBCO2016 were used in the training and testing phases. The experimental results showed that grey level values gave better performance than RGB values.
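
As an illustration of the descriptors mentioned in the abstract, the following minimal Python sketch builds a local-neighbourhood feature vector for each pixel, once from grey levels and once from RGB values, and pairs it with a ground truth label. The abstract does not state the window size, the border handling or the label encoding, so the 3x3 window (`radius=1`), the replicated borders and the convention 1 = text, 0 = non-text are assumptions, and all function names are hypothetical.

```python
from typing import Callable, List, Tuple

Grey = List[List[int]]                    # H x W grey levels
RGB = List[List[Tuple[int, int, int]]]    # H x W (r, g, b) triples


def _clamp(i: int, n: int) -> int:
    """Clamp index i into [0, n - 1], which replicates border pixels (assumed)."""
    return max(0, min(i, n - 1))


def grey_descriptor(img: Grey, row: int, col: int, radius: int = 1) -> List[int]:
    """Grey levels of the (2*radius+1)^2 window centred on (row, col)."""
    h, w = len(img), len(img[0])
    return [
        img[_clamp(row + dr, h)][_clamp(col + dc, w)]
        for dr in range(-radius, radius + 1)
        for dc in range(-radius, radius + 1)
    ]


def rgb_descriptor(img: RGB, row: int, col: int, radius: int = 1) -> List[int]:
    """R, G, B values of the same window, flattened into one vector."""
    h, w = len(img), len(img[0])
    feats: List[int] = []
    for dr in range(-radius, radius + 1):
        for dc in range(-radius, radius + 1):
            feats.extend(img[_clamp(row + dr, h)][_clamp(col + dc, w)])
    return feats


def build_training_set(
    img: list,
    ground_truth: List[List[int]],
    descriptor: Callable[..., List[int]],
    radius: int = 1,
) -> Tuple[List[List[int]], List[int]]:
    """One feature vector and one label (assumed 1 = text, 0 = non-text) per pixel."""
    X: List[List[int]] = []
    y: List[int] = []
    for r in range(len(img)):
        for c in range(len(img[0])):
            X.append(descriptor(img, r, c, radius))
            y.append(ground_truth[r][c])
    return X, y
```

The resulting `X` and `y` could then be passed to any SVM implementation, for example `sklearn.svm.SVC().fit(X, y)`, and the fitted model applied to the descriptors of a test image to label each pixel. The kernel and parameters used in the paper are not given in the abstract, so none are suggested here.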

Copyright © 2019