Article Info
Corner Pixel-Based Method for Selecting Binary Text in Scene
Ednawati Rainarli, Suprapto, Wahyono
dx.doi.org/10.17576/apjitm-2024-1302-05
Abstract
Text detection in natural images is the process of determining whether text appears in an image and where it is located. Complex backgrounds, the similarity of text shapes to non-text objects, and the variability of text shapes and colours make automatic text detection in natural images difficult to achieve with traditional image processing techniques alone. Machine learning methods offer one way to filter out non-text candidates. We used secondary data, such as the ICDAR 2011, ICDAR 2013, and ICDAR 2015 datasets, as additional training data. Because text colours vary widely in these datasets, a single binarisation cannot be applied uniformly to every image. In this study, we therefore propose a method that processes text images and automatically selects between the binary image and the negative binary image by checking the pixels at the four corners of the image. If the number of white corner pixels is greater than or equal to two, the negative binary image is selected; otherwise, the binary image is selected. In this way, suitable images are selected automatically for feature extraction before they are used to build text and non-text classification models. For low-resolution and digitally created text images, as in ICDAR 2011, the accuracy of binary text image selection reached 85.00%. For focused text captured for a specific purpose and appearing horizontally, as in ICDAR 2013, the accuracy reached 92.10%. For incidental text with multi-oriented text positions, the accuracy reached 66.67%. The results indicate that the proposed strategy works best for focused text of various colours, including white or black text, across diverse text sizes and types.
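The corner-pixel rule described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the image is already binarised, that white is encoded as 255 and black as 0, that the corners are read from the binary image (the abstract does not say which image is inspected), and that the image is held as a plain list of rows. The names `count_white_corners`, `negate`, and `select_text_image` are hypothetical.

```python
WHITE = 255  # assumed encoding: 255 = white, 0 = black


def count_white_corners(binary):
    """Count how many of the four corner pixels are white."""
    last_row = len(binary) - 1
    last_col = len(binary[0]) - 1
    corners = [
        binary[0][0],
        binary[0][last_col],
        binary[last_row][0],
        binary[last_row][last_col],
    ]
    return sum(1 for value in corners if value == WHITE)


def negate(binary):
    """Return the negative binary image (black and white swapped)."""
    return [[WHITE - value for value in row] for row in binary]


def select_text_image(binary):
    """Apply the rule from the abstract: two or more white corners
    select the negative binary image, otherwise the binary image."""
    if count_white_corners(binary) >= 2:
        return negate(binary)
    return binary


# Dark text on a light background: all four corners are white,
# so the negative image (white text on black) is selected.
dark_on_light = [
    [255, 255, 255, 255, 255],
    [255,   0,   0,   0, 255],
    [255, 255, 255, 255, 255],
]
selected = select_text_image(dark_on_light)
```

Under these assumptions the selected image always has a mostly black border, so the text pixels end up white regardless of the original text colour. The rule can fail when text touches the corners or when the background is not uniform, which is consistent with the lower accuracy reported for incidental, multi-oriented text.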
Keywords
Binary text; scene text detection; text candidates; text segmentation
Area
Multimedia and Usability