Abstract:
Around the world, hearing-impaired and speech-impaired people use different sign languages to communicate with each other and with others. In Sri Lanka, they use Sinhala Sign Language (SSL). SSL consists of more than 2,000 sign-based words covering the three basic components of sign language: isolated (static) signs, continuous signs, and annotations. Apart from those who use it, most people find SSL difficult to understand, and as a result impaired people face difficulties in day-to-day communication. To address this communication barrier, a prototype model is proposed to translate SSL signs into words in real time by capturing SSL hand gestures with the aid of video processing, MediaPipe, and Long Short-Term Memory (LSTM) techniques. As a starting point, the proposed model was developed to recognize selected static SSL signs. A total of 250 videos of the selected signs, captured with a mobile phone from impaired persons, were used as inputs to the model. Thirty frames extracted from each input video are then used to extract right-hand, left-hand, and face landmarks. Finally, the extracted landmarks are fed into a well-trained Convolutional Neural Network model. This development reached an overall accuracy of over 65% on the selected static SSL gestures. The model will be further developed into a simple and efficient mobile application that converts isolated (static) signs, continuous signs, and annotations made by an impaired person.
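
To make the described pipeline concrete, the following Python sketch shows one plausible way to sample 30 frames per video and extract face, left-hand, and right-hand landmarks with MediaPipe Holistic, followed by a small LSTM sequence classifier. The function names, frame-sampling strategy, and layer sizes are illustrative assumptions, not the exact implementation or architecture used in the study (the abstract mentions both LSTM and CNN models).

    import cv2
    import numpy as np
    import mediapipe as mp
    from tensorflow.keras import layers, models

    mp_holistic = mp.solutions.holistic

    NUM_FRAMES = 30  # frames sampled per input video, as described above
    FEATURE_DIM = (468 + 21 + 21) * 3  # face + left hand + right hand, (x, y, z) each

    def extract_landmarks(video_path):
        """Sample NUM_FRAMES frames and return a (NUM_FRAMES, FEATURE_DIM) landmark array."""
        cap = cv2.VideoCapture(video_path)
        total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
        indices = np.linspace(0, max(total - 1, 0), NUM_FRAMES).astype(int)

        def flatten(landmarks, count):
            # Missing detections are replaced with zeros so every frame has a fixed length.
            if landmarks is None:
                return np.zeros(count * 3, dtype=np.float32)
            return np.array([[p.x, p.y, p.z] for p in landmarks.landmark],
                            dtype=np.float32).flatten()

        sequence = []
        with mp_holistic.Holistic(static_image_mode=False) as holistic:
            for idx in indices:
                cap.set(cv2.CAP_PROP_POS_FRAMES, int(idx))
                ok, frame = cap.read()
                if not ok:
                    sequence.append(np.zeros(FEATURE_DIM, dtype=np.float32))
                    continue
                results = holistic.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
                face = flatten(results.face_landmarks, 468)
                left = flatten(results.left_hand_landmarks, 21)
                right = flatten(results.right_hand_landmarks, 21)
                sequence.append(np.concatenate([face, left, right]))
        cap.release()
        return np.stack(sequence)  # shape: (30, 1530)

    def build_model(num_signs):
        """Hypothetical LSTM classifier over the 30-frame landmark sequences."""
        model = models.Sequential([
            layers.Input(shape=(NUM_FRAMES, FEATURE_DIM)),
            layers.LSTM(64, return_sequences=True),
            layers.LSTM(32),
            layers.Dense(32, activation="relu"),
            layers.Dense(num_signs, activation="softmax"),
        ])
        model.compile(optimizer="adam",
                      loss="categorical_crossentropy",
                      metrics=["accuracy"])
        return model

In this sketch, each video becomes a fixed-size sequence of landmark vectors, so clips of different lengths can be batched together for training and for real-time prediction on newly captured gestures.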