Abstract:
The Sinhala language is used by the Sinhalese, the largest ethnic group native to Sri Lanka, who
make up the majority (75%) of the country's population. Most of the textual data gathered
in Sri Lanka is in the Sinhala language. Converting written paper documents in Sinhala
to electronic format requires an enormous amount of human labour. If the conversion could be
automated using handwriting recognition, it would increase efficiency and save significant
cost. In most existing research, the characters are segmented and then recognized. Many
applications require identifying words rather than individual character-modifier combinations.
When the recognized characters are put together to form a word, any character that the
character recognition algorithm identifies incorrectly causes the word as a whole to
be identified incorrectly. This research presents a method that improves recognition rates using a
dictionary together with consonant and vowel information. With this method, the recognition rate improved
from 63% to 79%.
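
The abstract does not say how the dictionary and the consonant and vowel information are combined, so the following is only an illustrative sketch of one plausible approach, not the method used in this research. It assumes that a misrecognized character is more likely to be confused with another character of the same class (consonant with consonant, vowel with vowel), and picks the dictionary word with the smallest class-weighted edit distance. The names `char_class`, `substitution_cost`, `weighted_distance`, `correct_word`, the cost values, and the two-word dictionary are all hypothetical.

```python
# Illustrative sketch only: dictionary lookup with a class-aware edit distance.
# The cost values (0.5 / 1.0) and all names here are assumptions, not taken
# from the paper.

def char_class(ch):
    """Classify a code point using Sinhala Unicode block ranges."""
    cp = ord(ch)
    # Independent vowels and dependent vowel signs.
    if 0x0D85 <= cp <= 0x0D96 or 0x0DCF <= cp <= 0x0DDF or cp in (0x0DF2, 0x0DF3):
        return "vowel"
    # Consonant letters.
    if 0x0D9A <= cp <= 0x0DC6:
        return "consonant"
    return "other"


def substitution_cost(a, b):
    """Same-class confusions are assumed cheaper than cross-class ones."""
    if a == b:
        return 0.0
    cls = char_class(a)
    if cls != "other" and cls == char_class(b):
        return 0.5
    return 1.0


def weighted_distance(s, t):
    """Levenshtein distance with class-weighted substitutions."""
    prev = [float(j) for j in range(len(t) + 1)]
    for i, a in enumerate(s, 1):
        cur = [float(i)]
        for j, b in enumerate(t, 1):
            cur.append(min(prev[j] + 1.0,                       # deletion
                           cur[j - 1] + 1.0,                    # insertion
                           prev[j - 1] + substitution_cost(a, b)))
        prev = cur
    return prev[-1]


def correct_word(recognized, dictionary):
    """Return the recognized word if it is in the dictionary, else the nearest entry."""
    if recognized in dictionary:
        return recognized
    return min(dictionary, key=lambda w: weighted_distance(recognized, w))


# Toy dictionary of two Sinhala words, written as code point escapes.
WORD_A = "\u0d85\u0db8\u0dca\u0db8\u0dcf"
WORD_B = "\u0dad\u0dcf\u0dad\u0dca\u0dad\u0dcf"
DICTIONARY = [WORD_A, WORD_B]

# Simulated recognizer output: the fourth code point of WORD_A is replaced
# by a different consonant.
NOISY = "\u0d85\u0db8\u0dca\u0da7\u0dcf"

corrected = correct_word(NOISY, DICTIONARY)
```

In this toy case the noisy string differs from `WORD_A` by a single consonant-for-consonant substitution, so `WORD_A` is the nearest entry. A practical system would need a full Sinhala lexicon and costs estimated from the recognizer's actual confusion patterns.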