Abstract:
School-age children with speech disorders have difficulty communicating
with others. They use sign language to communicate, but it takes time to
learn, to understand, and to respond appropriately. As a workaround,
teachers combine sign language with lip reading to interact with
speech-impaired students. Discussions with teachers revealed that reading
lips accurately is challenging and tedious. This research was carried out to
understand this communication challenge and to develop a solution that
addresses part of it.
In this study, the researchers developed a framework that uses lip reading
to assist speech-impaired students. It takes as input lip-reading videos of
speech-impaired students pronouncing 60 letters of the Sinhala alphabet,
with 50 videos per letter. Each input video is passed to a trained video
classification model for Sinhala letter recognition, which uses the motion
of the lips, nose, chin, and cheeks. The model then predicts the Sinhala
letter shown in the input. The framework was built using Convolutional
Neural Network and Long Short-Term Memory techniques. The
framework achieves 70% accuracy in recognizing the vowels of the Sinhala
alphabet. However, accuracy drops to as low as 60% for velar and retroflex
letters. Overall, this framework will help teachers and speech-impaired
students communicate effectively and understand each other. Furthermore,
improved outcomes from this research will help close communication gaps and
will serve as a valuable initiative for helping the community connect.
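
As a rough illustration of the pipeline described above, the following
sketch tallies the dataset size implied by the abstract (60 letters with 50
videos each) and traces how a clip's shape might change through a generic
CNN followed by an LSTM. The frame count, frame size, and layer widths are
hypothetical values chosen only for illustration; the abstract does not
state them, and this is not the authors' implementation.

```python
from dataclasses import dataclass

# Figures stated in the abstract.
NUM_LETTERS = 60
CLIPS_PER_LETTER = 50

# Hypothetical values chosen only for illustration; the abstract gives none.
FRAMES_PER_CLIP = 30
FRAME_SHAPE = (64, 64, 3)   # height, width, colour channels
CNN_FEATURES = 128
LSTM_UNITS = 64


@dataclass(frozen=True)
class Stage:
    name: str
    output_shape: tuple


def dataset_size(letters=NUM_LETTERS, clips=CLIPS_PER_LETTER):
    """Total number of labelled clips, one label per letter."""
    return letters * clips


def pipeline_shapes(frames=FRAMES_PER_CLIP, frame_shape=FRAME_SHAPE,
                    cnn_features=CNN_FEATURES, lstm_units=LSTM_UNITS,
                    num_classes=NUM_LETTERS):
    """List each stage of a generic CNN + LSTM classifier with its output shape."""
    return [
        Stage("input clip", (frames, *frame_shape)),
        Stage("CNN applied to every frame", (frames, cnn_features)),
        Stage("LSTM over the frame sequence", (lstm_units,)),
        Stage("softmax over letters", (num_classes,)),
    ]


if __name__ == "__main__":
    print(dataset_size(), "clips in total")
    for stage in pipeline_shapes():
        print(f"{stage.name}: {stage.output_shape}")
```

The point of the sketch is the division of labour: the CNN summarises the
facial region in each frame, the LSTM models how those summaries change over
time, and the final layer scores each of the 60 letters.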