Interspeech'2005 - Eurospeech
We address the problem of predicting pauses between the words in a sentence, which is of considerable interest for text-to-speech systems. In doing so, we show that the performance of both a generative classifier (naive Bayes, NB) and a discriminative classifier (maximum entropy, ME) can be significantly enhanced by applying the generalised probabilistic descent (GPD) algorithm. The features used to predict pauses are both local (derived from the neighbourhood of a word juncture) and global (derived from a parse tree of the sentence). We first compare the results of using the NB and ME classifiers on these features, and then develop the theory required for applying GPD to these classifiers. We show that GPD is particularly suitable for application within the maximum entropy framework and very significantly increases the discriminative power of both the NB and ME classifiers. The F-score of 81.2% obtained after applying GPD to an ME classifier is believed to be the best performance achieved on the Boston Radio Corpus.
Bibliographic reference. Cox, Stephen (2005): "A discriminative approach to phrase break modelling", In INTERSPEECH-2005, 3229-3232.