Abstract:
Speech is a pressure wave generated in air. When air is pushed through the
vocal folds with sufficient pressure, the vocal folds vibrate and produce voice
with the help of the articulators. Inside the vocal tract, the air resonates, and
these resonances are known as formants in acoustic phonetics. The formant is the
most significant parameter used in speech sound analysis. The acoustic
characteristics of speech differ not only from person to person but also from
language to language. In this study, formant analysis of Sinhala vowel sounds
is carried out to investigate the correlation between speaker gender and acoustic
characteristics. Sixty-six randomly selected native Sinhala speakers (31 males
and 35 females) aged between 19 and 36 years took part in the study. Twelve
Sinhala vowels were used as the speech material. Audio files were recorded in a
quiet room (25 dB) using a smartphone with a sampling rate of 44 kHz. The
recorded MPEG-1 Audio Layer III (MP3) files were converted to WAV files and fed
to Praat software. The first three formants, f1, f2 and f3, the most stable in
each vowel, were determined by analyzing the respective spectrograms. R and
SPSS software were used for statistical analysis. The estimated discriminant
equation of f1 and f2 is f2 = -1.36f1 + 2486.36. The statistical measures of the
performance of the binary classification (hit ratio, sensitivity and specificity)
were 90.8%, 96.6% and 86.1%, respectively. This strong separation shows that
speakers group by gender in terms of the f1 and f2 variables. Therefore,
formant analysis can be used to predict the gender of an unknown speaker,
even for telephone conversations.
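
As a minimal illustrative sketch (not the authors' implementation), the reported
discriminant line and the three performance measures could be expressed in
Python as follows. Only the equation f2 = -1.36f1 + 2486.36 comes from the
abstract. The function names, the assignment of the region above the line to
female speakers, and the choice of which gender counts as the "positive" class
are assumptions made for illustration.

```python
# Coefficients of the discriminant line reported in the abstract:
#   f2 = -1.36 * f1 + 2486.36   (frequencies in Hz)
SLOPE = -1.36
INTERCEPT = 2486.36


def boundary_score(f1_hz: float, f2_hz: float) -> float:
    """Signed vertical distance (Hz) of the point (f1, f2) from the line.

    Positive means the point lies above the line in the f1-f2 plane,
    negative means below, zero means exactly on the line.
    """
    return f2_hz - (SLOPE * f1_hz + INTERCEPT)


def predict_gender(f1_hz: float, f2_hz: float,
                   above_label: str = "female",
                   below_label: str = "male") -> str:
    """Label a speaker by the side of the line the (f1, f2) point falls on.

    ASSUMPTION: the abstract does not say which side corresponds to which
    gender. The default mapping (above = female) is a guess and can be
    swapped through the two label arguments.
    """
    return above_label if boundary_score(f1_hz, f2_hz) > 0 else below_label


def classification_measures(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Standard binary-classification measures from confusion-matrix counts.

    ASSUMPTION: which gender is treated as the positive class is not
    stated in the abstract.
    """
    total = tp + fn + tn + fp
    return {
        "hit_ratio": (tp + tn) / total,   # overall proportion correct
        "sensitivity": tp / (tp + fn),    # true positive rate
        "specificity": tn / (tn + fp),    # true negative rate
    }
```

For instance, a vowel token with f1 = 500 Hz and f2 = 2000 Hz gives a
boundary_score of about +193.6 Hz, so it falls above the line. With made-up
counts tp=8, fn=2, tn=9, fp=1 (not data from the study),
classification_measures returns a hit ratio of 0.85, a sensitivity of 0.80 and
a specificity of 0.90.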