September 22-25, 1997
The work described in this paper attempts to automatically generate word baseforms of the kind used in the pronunciation dictionaries of large-vocabulary speech recognition systems. The input to the algorithm consists of several sample utterances per word. No additional information, such as the word's spelling, is used. The task involves determining a suitable inventory of subword units (SWU) as well as determining the baseforms themselves. Experiments show that improvements over a triphone-based dictionary are possible with fewer than ten sample utterances per word if test and training vocabularies are different. A possible application would be a system based on a fixed inventory of HMMs that needs to be adapted to different vocabularies.
Bibliographic reference. Hauenstein, Andreas (1997): "Signal driven generation of word baseforms from few examples", In EUROSPEECH-1997, 1031-1034.