Profanity word detection for Sinhala language using deep learning

Kumara, S.K.H.R.S.; Jayaneththi, J.K.D.B.G.

Conference: Ruhuna International Science and Technology Conference (RISTCON 2023)

URI: http://ir.lib.ruh.ac.lk/xmlui/handle/iruor/11022

Date: 2023-01-18

Abstract:

Content censorship is an important concern in present-day media. For the Sinhala language, several studies have addressed text-based content censorship, but none have addressed audio content. The Sri Lankan government prohibits the use of profanity in public media, so Sri Lankan media companies must check their videos for profanity before broadcast. Until now, this check has been done manually, which is extremely difficult for long videos and audio clips. This study proposes a deep learning model that automatically detects profanity words in Sinhala audio files. Ten profanity words were selected, and audio samples were gathered from 100 people. The data were preprocessed, transformed into spectrogram images, and fed to a convolutional neural network (CNN) to develop the profanity filter model. The model predicts profanity words by converting audio files to spectrograms and applying image processing to extract features from them. This paper describes the procedure, the model's capabilities, and planned updates to the final product.
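
The audio-to-spectrogram step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state which tools, frame sizes, or spectrogram type the authors used, so everything here is an assumption made for illustration: the function name `spectrogram`, the `frame_size` and `hop` values, the Hann window, and the synthetic test tone. It uses only the Python standard library and a naive DFT; a real pipeline would more likely rely on a signal-processing library and then pass the resulting image to a CNN.

```python
import cmath
import math


def spectrogram(samples, frame_size=64, hop=32):
    """Return a list of frames; each frame is a list of log-magnitudes per frequency bin.

    Illustrative only: frame_size and hop are assumed values, not taken from the paper.
    """
    # Hann window tapers each frame to reduce spectral leakage at its edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_size - 1))
              for n in range(frame_size)]
    n_bins = frame_size // 2 + 1  # non-negative frequencies only
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = [samples[start + n] * window[n] for n in range(frame_size)]
        row = []
        for k in range(n_bins):
            # Naive discrete Fourier transform for frequency bin k.
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_size)
                      for n in range(frame_size))
            # log1p compresses the dynamic range, as is common for spectrogram images.
            row.append(math.log1p(abs(acc)))
        frames.append(row)
    return frames


# Synthetic stand-in for a recorded word: a 1 kHz tone sampled at 8 kHz (assumed values).
SAMPLE_RATE = 8000
tone = [math.sin(2 * math.pi * 1000 * t / SAMPLE_RATE) for t in range(512)]
spec = spectrogram(tone)
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
```

With these assumed settings each bin spans 8000 / 64 = 125 Hz, so the 1 kHz tone concentrates its energy in bin 8. The resulting two-dimensional array (time frames by frequency bins) is the kind of image-like input a CNN classifier could be trained on.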
