Impulse-sequence representation of the excitation source component of normal speech signals has been of considerable interest in speech coding research. If a similar representation can be made for nonverbal (i.e., nonnormal or nonneutral) speech sounds, it would greatly aid their acoustic analysis and a range of applications. This paper proposes a representation of the excitation source characteristics of nonverbal speech sounds in terms of a time-domain sequence of impulses or impulse-like pulses. The nonverbal speech sounds are examined in three categories, namely, emotional speech, paralinguistic sounds and expressive voices. This categorisation is based on the degree of rapid changes in the pitch of these sounds. A modified zero-frequency filtering (modZFF) method is proposed for obtaining an impulse-sequence representation of the excitation source component in the acoustic signal of nonverbal speech sounds. The effectiveness of the proposed representation is validated using an analysis-by-synthesis approach and perceptual evaluation for Noh singing voice signals. In addition to supporting analysis and speech coding of nonverbal sounds, this representation may also yield significant savings in signal storage and processing requirements.
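To make the idea of an impulse-sequence representation concrete, the sketch below implements the standard zero-frequency filtering (ZFF) procedure as commonly described in the literature, not the modZFF method proposed in the paper, whose details the abstract does not give. The function names, the 10 ms trend-removal window, the three trend-removal passes, and the convention of taking negative-to-positive zero crossings as epochs are all assumptions made for illustration.

```python
import numpy as np


def zero_frequency_filter(signal, fs, mean_window_ms=10.0, trend_passes=3):
    """Standard ZFF sketch (assumed parameters; not the paper's modZFF)."""
    x = np.asarray(signal, dtype=np.float64)
    # Difference the signal to remove any DC offset.
    y = np.diff(x, prepend=x[0])
    # A cascade of two zero-frequency resonators is equivalent to
    # four successive cumulative sums (each resonator integrates twice).
    for _ in range(4):
        y = np.cumsum(y)
    # Remove the polynomial trend by repeatedly subtracting a local mean.
    # The window is usually chosen near the average pitch period (assumption).
    half = max(1, int(round(mean_window_ms * 1e-3 * fs / 2)))
    kernel = np.ones(2 * half + 1) / (2 * half + 1)
    for _ in range(trend_passes):
        y = y - np.convolve(y, kernel, mode="same")
    # Samples within about trend_passes * half of either end are unreliable.
    return y


def impulse_sequence(zff):
    """Place one impulse at each negative-to-positive zero crossing.

    The impulse amplitude is the local slope of the ZFF signal, a common
    proxy for the strength of excitation (assumed convention).
    """
    crossing = (zff[:-1] < 0) & (zff[1:] >= 0)
    epochs = np.flatnonzero(crossing) + 1
    impulses = np.zeros_like(zff)
    impulses[epochs] = zff[epochs] - zff[epochs - 1]
    return epochs, impulses


if __name__ == "__main__":
    fs = 8000
    excitation = np.zeros(fs // 2)
    excitation[::80] = 1.0  # synthetic 100 Hz impulse train
    epochs, impulses = impulse_sequence(zero_frequency_filter(excitation, fs))
    interior = epochs[(epochs > 400) & (epochs < len(excitation) - 400)]
    print("median epoch spacing (samples):", np.median(np.diff(interior)))
```

A fixed trend-removal window suits signals whose pitch varies slowly. The rapid pitch changes that the abstract attributes to nonverbal sounds are presumably what motivates a modified method, but this sketch makes no attempt to reproduce that modification.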
Cite as: Mittal, V.K., Yegnanarayana, B. (2016) An Impulse Sequence Representation of the Excitation Source Characteristics of Nonverbal Speech Sounds. Proc. 7th Workshop on Speech and Language Processing for Assistive Technologies (SLPAT 2016), 69-74, doi: 10.21437/SLPAT.2016-12
@inproceedings{mittal16b_slpat,
  author={Vinay Kumar Mittal and B. Yegnanarayana},
  title={{An Impulse Sequence Representation of the Excitation Source Characteristics of Nonverbal Speech Sounds}},
  year=2016,
  booktitle={Proc. 7th Workshop on Speech and Language Processing for Assistive Technologies (SLPAT 2016)},
  pages={69--74},
  doi={10.21437/SLPAT.2016-12}
}