Improving quality of speech synthesis in Indian languages

P. K. Lehana, P. C. Pandey

The harmonic plus noise model (HNM), which divides the speech signal into two sub-bands, harmonic and noise, is implemented with the objective of studying its capabilities for improving the quality of speech synthesis in Indian languages. Investigations show that HNM is capable of synthesizing all vowels and syllables with good quality. All the syllables except /aSa/ and /asa/ are intelligible when synthesized using only the harmonic part. This fact can be used to reduce the size of the database. For pitch-synchronous analysis and synthesis, glottal closure instants (GCIs) should be accurately calculated. The quality of synthesized speech improves if these instants are obtained from the glottal signal (the output of an impedance glottograph) rather than by processing the speech signal. Further, the noise part is synthesized pitch-synchronously for voiced frames. A database of HNM parameters for VCV syllables is developed for Indian languages. The number of parameters for each frame is comparable to that of a formant synthesizer, but the quality of synthesized speech is much better.
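
As an illustration of the harmonic/noise decomposition described above, the following is a minimal sketch of synthesizing a single voiced frame as a sum of harmonics of the pitch frequency plus a noise component. It is not the authors' implementation: the function name `synthesize_hnm_frame`, its parameters, the 8 kHz sampling rate, and the first-difference filter used as a crude stand-in for the high-frequency noise band are all assumptions made for illustration. Pitch-synchronous framing from GCIs and the time-varying noise envelope are omitted.

```python
import math
import random


def synthesize_hnm_frame(f0, amplitudes, phases, noise_gain,
                         fs=8000, n_samples=160, seed=0):
    """Return one frame: harmonic sum plus crudely high-passed white noise.

    All names and default values here are hypothetical, chosen for illustration.
    f0          -- pitch frequency in Hz
    amplitudes  -- amplitude of harmonic k+1 at index k
    phases      -- phase (radians) of harmonic k+1 at index k
    noise_gain  -- scale of the noise part (0.0 gives harmonic-only synthesis)
    """
    rng = random.Random(seed)
    prev = rng.gauss(0.0, 1.0)
    frame = []
    for n in range(n_samples):
        t = n / fs
        # Harmonic part: sinusoids at integer multiples of f0.
        harmonic = sum(
            a * math.cos(2.0 * math.pi * (k + 1) * f0 * t + p)
            for k, (a, p) in enumerate(zip(amplitudes, phases))
        )
        # Noise part: first difference of white noise, a crude high-pass
        # filter standing in for the noise band (an assumption).
        cur = rng.gauss(0.0, 1.0)
        noise = noise_gain * (cur - prev)
        prev = cur
        frame.append(harmonic + noise)
    return frame


# Harmonic-only synthesis of a 100 Hz frame with three harmonics.
frame = synthesize_hnm_frame(100.0, [1.0, 0.5, 0.25], [0.0, 0.0, 0.0], 0.0)
```

Setting `noise_gain` to 0.0 corresponds to synthesis from the harmonic part alone, the condition under which the abstract reports that most syllables remain intelligible.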

