Automatic extraction and recognition of Sinhala text from images with complex backgrounds

dc.contributor.author Dasunika, A.K.P.J.
dc.contributor.author Wijerathna, E.H.M.P.M.
dc.date.accessioned 2023-02-10T08:40:57Z
dc.date.available 2023-02-10T08:40:57Z
dc.date.issued 2023-01-18
dc.identifier.issn 1391-8796
dc.identifier.uri http://ir.lib.ruh.ac.lk/xmlui/handle/iruor/11016
dc.description.abstract Sinhala is a national language spoken primarily in Sri Lanka. Sinhala characters are difficult to recognize in images with complex backgrounds, and some visually impaired people cannot read or write them properly. This study focuses on extracting and recognizing Sinhala text from images with complex backgrounds using a convolutional neural network (CNN). Thirty-three distinct Sinhala characters (10 images per character) are used as the training dataset, and 400 images of bus destination name boards containing Sinhala characters are collected as the testing dataset. The character recognition model is a CNN trained on the collected training dataset; it is trained 10 times, until an accuracy of 81% is achieved. Pre-processing is performed on the bus destination name board images to check whether they contain text. After this check, the non-text regions are removed. Three types of segmentation (line, word, and character) are then performed on the preprocessed images to isolate each character. The segmented characters are used as input to the recognition model, which successfully identifies them as Sinhala characters. Although there is considerable research on Sinhala optical character recognition and Sinhala handwritten character recognition, there is no research on recognizing Sinhala text in images with complex backgrounds. An important benefit of the proposed methodology is that people with visual impairments can easily read Sinhala characters in images with complex backgrounds. en_US
dc.language.iso en en_US
dc.publisher Faculty of Science, University of Ruhuna, Matara, Sri Lanka en_US
dc.subject Convolutional Neural Network en_US
dc.subject Sinhala text en_US
dc.subject Text Extraction en_US
dc.subject Text recognition en_US
dc.subject Complex background en_US
dc.title Automatic extraction and recognition of Sinhala text from images with complex backgrounds en_US
dc.type Article en_US
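
The abstract mentions line, word, and character segmentation but does not say how each is carried out. One common approach is projection-profile segmentation, sketched below purely as an illustration and not as the authors' method. The sketch assumes the image has already been binarized (1 = text pixel, 0 = background) and that non-text regions have been removed; the function names and the `min_gap` threshold are hypothetical.

```python
from typing import List, Tuple

Image = List[List[int]]   # binary image: 1 = text pixel, 0 = background
Span = Tuple[int, int]    # half-open index range (start, end)


def runs(profile: List[int]) -> List[Span]:
    """Return the half-open ranges where the profile is non-zero."""
    spans: List[Span] = []
    start = None
    for i, value in enumerate(profile):
        if value and start is None:
            start = i                      # a run of ink begins
        elif not value and start is not None:
            spans.append((start, i))       # the run ends at a blank position
            start = None
    if start is not None:
        spans.append((start, len(profile)))
    return spans


def segment_lines(img: Image) -> List[Span]:
    """Line segmentation: rows with ink, separated by blank rows."""
    return runs([sum(row) for row in img])


def segment_characters(img: Image, line: Span) -> List[Span]:
    """Character segmentation: columns with ink inside one text line."""
    top, bottom = line
    width = len(img[0]) if img else 0
    profile = [sum(img[r][c] for r in range(top, bottom)) for c in range(width)]
    return runs(profile)


def group_words(chars: List[Span], min_gap: int) -> List[List[Span]]:
    """Word segmentation: start a new word when the gap is at least min_gap columns."""
    words: List[List[Span]] = []
    for span in chars:
        if words and span[0] - words[-1][-1][1] < min_gap:
            words[-1].append(span)         # small gap: same word
        else:
            words.append([span])           # large gap (or first span): new word
    return words
```

Each character span returned by `segment_characters` could then be cropped and passed to a recognition model. Projection profiles assume that characters are separated by fully blank columns, which may not hold for touching glyphs or noisy backgrounds, so a real pipeline would likely need something more robust.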

