Gender, race and religion prediction of Sri Lankan personal names using machine learning techniques

dc.contributor.author Chathuranga, P.D.T.
dc.contributor.author Lorensuhewa, S.A.S.
dc.contributor.author Kalyani, M.A.L.
dc.date.accessioned 2023-02-24T04:33:44Z
dc.date.available 2023-02-24T04:33:44Z
dc.date.issued 2020-01-22
dc.identifier.issn 1391-8796
dc.identifier.uri http://ir.lib.ruh.ac.lk/xmlui/handle/iruor/11477
dc.description.abstract Predicting identification details such as a person's gender, race and religion can help natural language processing tasks perform better. It can also speed up the filling of existing digital application forms by providing suggestions. To the best of our knowledge, few studies have addressed gender prediction, and none has addressed race or religion prediction, for Sri Lankan names. We performed gender, race and religion prediction on Sri Lankan personal names written in both Sinhala Unicode and English characters. Feature vectors were constructed from character n-grams, and Multinomial Naïve Bayes and Support Vector Machine classifiers were used for the prediction. The highest accuracies obtained across the three prediction tasks ranged from 89% to 98%. These promising results demonstrate the feasibility of using n-gram models with machine learning techniques to predict the gender, race and religion associated with Sri Lankan names. en_US
dc.language.iso en en_US
dc.publisher Faculty of Science, University of Ruhuna, Matara, Sri Lanka en_US
dc.subject Natural language processing en_US
dc.subject Machine learning en_US
dc.subject Gender prediction and prediction of identification details en_US
dc.title Gender, race and religion prediction of Sri Lankan personal names using machine learning techniques en_US
dc.type Article en_US
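
The abstract describes character n-gram feature vectors fed to a Multinomial Naïve Bayes classifier. A minimal sketch of that idea in plain Python follows. It is not the authors' implementation: the n-gram range (1 to 3), the "^" and "$" boundary markers, the add-one smoothing, and the names char_ngrams and MultinomialNB are all assumptions made for illustration, and the training strings are placeholders rather than real names or data from the paper. The Support Vector Machine variant is not shown.

```python
import math
from collections import Counter, defaultdict


def char_ngrams(name, n_min=1, n_max=3):
    """Return the character n-grams of a name, padded with boundary markers.

    The n-gram range and the "^"/"$" markers are illustrative assumptions.
    """
    padded = "^" + name.lower() + "$"
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(padded) - n + 1):
            grams.append(padded[i:i + n])
    return grams


class MultinomialNB:
    """Multinomial Naive Bayes over n-gram counts with add-one smoothing."""

    def fit(self, names, labels):
        self.class_counts = Counter(labels)       # names seen per class
        self.gram_counts = defaultdict(Counter)   # n-gram counts per class
        self.vocab = set()                        # all n-grams seen in training
        for name, label in zip(names, labels):
            grams = char_ngrams(name)
            self.gram_counts[label].update(grams)
            self.vocab.update(grams)
        return self

    def predict(self, name):
        total = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            # log prior + sum of smoothed log likelihoods of each n-gram
            score = math.log(count / total)
            denom = sum(self.gram_counts[label].values()) + vocab_size
            for gram in char_ngrams(name):
                score += math.log((self.gram_counts[label][gram] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Placeholder strings and labels, NOT real names or data from the paper.
train_names = ["aaa", "aab", "zzz", "zzy"]
train_labels = ["x", "x", "y", "y"]
model = MultinomialNB().fit(train_names, train_labels)
print(model.predict("aa"))
```

In practice the placeholder strings would be replaced by a labelled list of names, with one such classifier trained per target attribute. Because the n-grams are built from Python string characters, the same function applies to names written in Sinhala Unicode or in English characters.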