Discriminating Nasals and Approximants in English Language Using Zero Time Windowing

RaviShankar Prasad, Sudarsana Reddy Kadiri, Suryakanth V Gangashetty, Bayya Yegnanarayana

Nasal and approximant consonants are often confused with each other. Despite differences in their production mechanisms, the two sound classes exhibit similar low-frequency behavior and lack significant high-frequency content. The present study uses a spectral representation obtained from zero time windowing (ZTW) analysis of speech to distinguish between them. This instantaneous spectral representation has good resolution at resonances, which helps to highlight the differences in the acoustic response of the vocal tract system for these sounds. The ZTW spectra around glottal closure instants are averaged to derive parameters for classifying these sounds in continuous speech. A set of parameters based on the dominant resonances, center of gravity, band energy ratio, and cumulative spectral sum at low frequencies is derived from the average spectrum. The paper proposes two classifiers: a knowledge-based approach and a trained support vector machine. Both are tested on utterances from different English speakers in the TIMIT dataset. The proposed methods achieve an average classification accuracy of 90% between the two classes in continuous speech.
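The four parameter types named in the abstract can be illustrated with a minimal sketch. It assumes an averaged magnitude spectrum is already available as an array spanning 0 Hz to the Nyquist frequency; it does not compute the ZTW spectrum itself. The function name `spectral_parameters`, the 1 kHz low-band cutoff, and the exact formulas are illustrative assumptions, not the definitions used in the paper.

```python
import numpy as np

def spectral_parameters(spectrum, sample_rate, low_cutoff_hz=1000.0):
    """Illustrative spectral parameters from an averaged magnitude spectrum.

    Assumptions (not taken from the paper): the spectrum spans 0 Hz to
    Nyquist on a uniform grid, and the low band ends at low_cutoff_hz.
    """
    spectrum = np.asarray(spectrum, dtype=float)
    freqs = np.linspace(0.0, sample_rate / 2.0, len(spectrum))
    power = spectrum ** 2
    eps = np.finfo(float).eps

    total = max(power.sum(), eps)
    low = freqs <= low_cutoff_hz
    low_energy = power[low].sum()
    high_energy = max(power[~low].sum(), eps)

    return {
        # Frequency of the strongest spectral peak
        "dominant_resonance_hz": float(freqs[np.argmax(power)]),
        # Power-weighted mean frequency
        "center_of_gravity_hz": float((freqs * power).sum() / total),
        # Energy below the cutoff relative to energy above it
        "band_energy_ratio": float(low_energy / high_energy),
        # Fraction of total energy accumulated up to the cutoff
        "cumulative_low_fraction": float(low_energy / total),
    }

# Synthetic spectrum: a strong peak at 250 Hz and a weaker one at 2000 Hz
demo = np.zeros(257)
demo[8] = 1.0    # 8 * 31.25 Hz = 250 Hz
demo[64] = 0.5   # 64 * 31.25 Hz = 2000 Hz
params = spectral_parameters(demo, sample_rate=16000)
```

A feature vector built from values like these could then feed either a set of hand-set thresholds or a trained classifier; the thresholds and features actually used are specified in the paper, not here.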

DOI: 10.21437/Interspeech.2018-1032

Cite as: Prasad, R., Kadiri, S.R., Gangashetty, S.V., Yegnanarayana, B. (2018) Discriminating Nasals and Approximants in English Language Using Zero Time Windowing. Proc. Interspeech 2018, 177-181, DOI: 10.21437/Interspeech.2018-1032.

  author={RaviShankar Prasad and Sudarsana Reddy Kadiri and Suryakanth V Gangashetty and Bayya Yegnanarayana},
  title={Discriminating Nasals and Approximants in English Language Using Zero Time Windowing},
  booktitle={Proc. Interspeech 2018},