September 22-25, 1997
This paper investigates two methods for automatically deriving multiple phonetic transcriptions of words, given sample utterances of the words and an inventory of context-dependent subword units. Both approaches are based on an analysis of the N-best phonetic decoding of the available utterances. From the set of transcriptions produced by the N-best decoding of all the utterances, the first method selects the K most frequent variants (Frequency Criterion), while the second selects the K most likely ones (Maximum Likelihood Criterion). Experiments on speaker-independent recognition showed that the performance obtained with the Maximum Likelihood Criterion is not much different from that obtained with manual transcriptions. For speaker-dependent speech recognition, estimating the 3 most likely transcription variants of each word yields promising results.
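The two selection criteria can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions made only for illustration: each utterance's N-best list is represented as (transcription, log-likelihood) pairs, the likelihood of a variant is taken to be the sum of its log-likelihoods over all utterances, and a variant absent from an utterance's N-best list receives a hypothetical `floor` penalty. The function names `select_by_frequency` and `select_by_likelihood` are likewise invented here.

```python
from collections import Counter, defaultdict


def select_by_frequency(nbest_lists, k):
    """Frequency Criterion (sketch): keep the k transcriptions that
    appear most often across the N-best lists of all utterances."""
    counts = Counter(t for nbest in nbest_lists for t, _ in nbest)
    # Sort by descending count; break ties alphabetically so the
    # result is deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [t for t, _ in ranked[:k]]


def select_by_likelihood(nbest_lists, k, floor=-1000.0):
    """Maximum Likelihood Criterion (sketch): keep the k transcriptions
    with the highest summed log-likelihood over all utterances.

    Assumption: a transcription missing from an utterance's N-best
    list is scored with `floor`; the abstract does not specify how
    such cases are handled.
    """
    variants = {t for nbest in nbest_lists for t, _ in nbest}
    totals = defaultdict(float)
    for nbest in nbest_lists:
        scores = dict(nbest)  # transcription -> log-likelihood for this utterance
        for t in variants:
            totals[t] += scores.get(t, floor)
    # Sort by descending total score; break ties alphabetically.
    ranked = sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))
    return [t for t, _ in ranked[:k]]
```

Under these assumptions the two criteria can disagree: a variant that appears in few N-best lists but with very high scores may be ranked first by summed likelihood and last by frequency, and the outcome depends on the chosen `floor`.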
Bibliographic reference. Mokbel, Houda / Jouvet, Denis (1997): "Automatic derivation of multiple variants of phonetic transcriptions from acoustic signals", In EUROSPEECH-1997, 1619-1622.