In this paper we propose a new, automatic algorithm for determining optimal baseforms. Given a set of subword hidden Markov models (HMMs) and acoustic tokens of a specific word, we apply the tree-trellis N-best search algorithm to find the optimal baseforms (transcriptions) in the maximum likelihood sense. Several token preselection algorithms were investigated to speed up the search for representative baseforms and to alleviate the problem of representing vastly different pronunciations with a single baseform. The new baseform optimization algorithm was evaluated on the DARPA Resource Management database; using the different token selection algorithms together with the tree-trellis search consistently improved recognition rates.
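The maximum likelihood criterion mentioned in the abstract can be illustrated with a minimal sketch. It does not implement the tree-trellis N-best search or any token preselection; it only shows the final selection step, assuming that a set of candidate baseforms has already been scored against every token and that tokens are treated as independent given the baseform, so their log-likelihoods add. The function name `select_baseform`, the phone symbols, and all scores are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Tuple

Baseform = Tuple[str, ...]  # a transcription as a sequence of subword units


def select_baseform(loglik: Dict[Baseform, List[float]]) -> Baseform:
    """Return the candidate with the highest total log-likelihood over all tokens.

    loglik maps each candidate baseform to its per-token log-likelihoods
    (hypothetical values; in practice these would come from HMM scoring).
    """
    if not loglik:
        raise ValueError("no candidate baseforms")
    # Summing log-likelihoods assumes tokens are independent given the baseform.
    return max(loglik, key=lambda b: sum(loglik[b]))


# Made-up scores for three acoustic tokens of one word.
candidates = {
    ("d", "ey", "t", "ax"): [-120.5, -118.0, -131.2],
    ("d", "ae", "t", "ax"): [-125.1, -117.2, -122.4],
    ("d", "aa", "t", "ax"): [-140.0, -139.5, -138.0],
}
best = select_baseform(candidates)
```

In this toy example the first candidate scores best on the first token, but the second candidate has the highest total over all three tokens, which is why a single baseform chosen jointly can differ from the best transcription of any individual token.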
Bibliographic reference. Svendsen, Torbjorn / Soong, Frank K. / Purnhagen, Heiko (1995): "Optimizing baseforms for HMM-based speech recognition", In EUROSPEECH-1995, 783-787.