INTERSPEECH 2008
9th Annual Conference of the International Speech Communication Association

Brisbane, Australia
September 22-26, 2008

LTS Using Decision Forest of Regression Trees and Neural Networks

Tanuja Sarkar (1), Sachin Joshi (1), Sathish Chandra Pammi (1), Kishore Prahallad (2)

(1) IIIT Hyderabad, India; (2) Carnegie Mellon University, USA

Letter-to-sound (LTS) rules play a vital role in building a speech synthesis system. In this paper, we apply several machine learning approaches to LTS rules: Classification and Regression Trees (CART), Decision Forests, forests of Artificial Neural Networks (ANNs) and Auto-Associative Neural Networks (AANNs). We use these techniques mainly for schwa deletion in Hindi. We show empirically that LTS using a Decision Forest outperforms the previous CART approach and that a forest of ANNs outperforms a single ANN, while the non-discriminative learning technique of AANNs does not capture the LTS rules as efficiently as the discriminative techniques. In addition to the primary contextual features, we explore the use of syllabic features, namely syllabic structure, onset of the syllable, number of syllables and position of the schwa. The results show that using these features leads to good performance: with the newly incorporated features, the Decision Forest and forest of ANNs approaches yield phone accuracies of 92.86% and 93.18%, respectively, for Hindi LTS.
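
The forest idea in the abstract can be illustrated with a minimal sketch: several simple classifiers, each trained on a different subset of the features, vote on whether a schwa is kept or deleted. This is not the authors' implementation. The lookup-table learner below is a stand-in for a regression tree, the random choice of feature subsets is an assumption, and the feature values, labels and all function names are invented toy examples rather than Hindi data.

```python
import random
from collections import Counter, defaultdict

def train_member(rows, labels, feature_idx):
    """Train one ensemble member: a lookup table (stand-in for a tree)
    that stores the majority label for each value tuple of its features."""
    table = defaultdict(Counter)
    for row, label in zip(rows, labels):
        key = tuple(row[i] for i in feature_idx)
        table[key][label] += 1
    # Fallback for unseen feature combinations: the overall majority label.
    default = Counter(labels).most_common(1)[0][0]
    majority = {k: c.most_common(1)[0][0] for k, c in table.items()}
    return feature_idx, majority, default

def predict_member(member, row):
    feature_idx, majority, default = member
    return majority.get(tuple(row[i] for i in feature_idx), default)

def train_forest(rows, labels, n_members=5, subset_size=2, seed=0):
    """Train n_members learners, each on a random subset of the features
    (assumed here; the paper may select features differently)."""
    rng = random.Random(seed)
    n_features = len(rows[0])
    forest = []
    for _ in range(n_members):
        idx = tuple(sorted(rng.sample(range(n_features), subset_size)))
        forest.append(train_member(rows, labels, idx))
    return forest

def predict_forest(forest, row):
    """Combine the members by majority vote."""
    votes = Counter(predict_member(m, row) for m in forest)
    return votes.most_common(1)[0][0]

# Invented toy data. Hypothetical features per schwa:
# (previous letter class, next letter class, schwa position, syllable count)
ROWS = [
    ("C", "C", "medial", 3),
    ("C", "V", "medial", 2),
    ("C", "#", "final", 2),
    ("C", "#", "final", 3),
    ("V", "C", "initial", 2),
]
LABELS = ["keep", "keep", "delete", "delete", "keep"]

forest = train_forest(ROWS, LABELS, n_members=5, subset_size=2, seed=0)
for row in ROWS:
    print(row, "->", predict_forest(forest, row))
```

Voting over members that see different feature subsets is one common way to build such an ensemble; replacing the lookup tables with real trees or small neural networks would leave the voting step unchanged.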


Bibliographic reference.  Sarkar, Tanuja / Joshi, Sachin / Pammi, Sathish Chandra / Prahallad, Kishore (2008): "LTS using decision forest of regression trees and neural networks", In INTERSPEECH-2008, 1885-1888.