5th International Conference on Spoken Language Processing
In voiced speech, the main excitation of the vocal tract occurs at the end of the glottal closing phase, when the rate of change of the flow reaches its absolute maximum. This study presents a straightforward method that yields a numerical value characterizing the effect of the main excitation on vocal intensity. The method, Energy Ratio by Modified Excitation (ERME), uses the glottal flow and the model of the vocal tract transfer function, both obtained by inverse filtering, to synthesize two signals according to the source-filter theory. The first signal is synthesized from the glottal flow waveform exactly as given by inverse filtering. The second signal is synthesized after removing the main excitation from the differentiated glottal flow. ERME is defined as the ratio between the energy of the first synthesized signal and the energy of the second. It is shown that as the loudness of speech increases, the value of ERME first rises, but for loud voices it starts to decrease. This behavior indicates that secondary excitations of the vocal tract, which occur during glottal opening, become important in the production of loud voices.
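The ERME computation described above can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on several assumptions that the abstract does not confirm: the vocal tract is modeled as an all-pole filter 1/A(z), the differentiated glottal flow is used directly as the excitation, the main excitation is taken to be the negative peak of the differentiated flow within a single glottal cycle, and it is "removed" by zeroing a small window around that peak. All function names and the `half_width` parameter are hypothetical.

```python
def differentiate(flow):
    """First difference of the glottal flow (discrete-time derivative)."""
    return [flow[n] - flow[n - 1] for n in range(1, len(flow))]


def all_pole_filter(excitation, a):
    """Filter through 1 / A(z), where A(z) = 1 + a[0] z^-1 + a[1] z^-2 + ...

    ASSUMPTION: an all-pole vocal tract model; the abstract only says the
    model is given by inverse filtering.
    """
    out = []
    for n, x in enumerate(excitation):
        y = x
        for k, coeff in enumerate(a, start=1):
            if n - k >= 0:
                y -= coeff * out[n - k]
        out.append(y)
    return out


def remove_main_excitation(dflow, half_width=1):
    """Zero the samples around the negative peak of the differentiated flow.

    ASSUMPTION: zeroing a fixed window is only one possible removal rule,
    and the input is assumed to cover a single glottal cycle.
    """
    peak = min(range(len(dflow)), key=lambda n: dflow[n])
    lo = max(0, peak - half_width)
    hi = min(len(dflow), peak + half_width + 1)
    return [0.0 if lo <= n < hi else v for n, v in enumerate(dflow)]


def energy(signal):
    """Sum of squared samples."""
    return sum(v * v for v in signal)


def erme(flow, a, half_width=1):
    """Energy of the original synthesis divided by that of the modified one."""
    dflow = differentiate(flow)
    original = all_pole_filter(dflow, a)
    modified = all_pole_filter(remove_main_excitation(dflow, half_width), a)
    modified_energy = energy(modified)
    if modified_energy == 0:
        raise ValueError("modified signal has zero energy; ERME is undefined")
    return energy(original) / modified_energy


# Toy single-cycle flow: slow opening followed by an abrupt closure.
toy_flow = [0, 1, 2, 3, 4, 5, 6, 2, 0, 0, 0]
toy_tract = [-0.5]  # hypothetical one-pole vocal tract model
print(erme(toy_flow, toy_tract))
```

In this sketch a larger ERME value means that the main excitation accounts for a larger share of the synthesized signal's energy; a value that falls as loudness grows would correspond to the growing contribution of the remaining (secondary) excitations.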
Bibliographic reference. Alku, Paavo / Vintturi, Juha / Vilkman, Erkki (1998): "Analyzing the effect of secondary excitations of the vocal tract on vocal intensity in different loudness conditions", In ICSLP-1998, paper 0067.