This paper explores the use of continuous speech data to learn stochastic lexicons. Building on previous work in which we augmented graphones with acoustic examples of isolated words, we extend our pronunciation mixture model framework to two domains containing spontaneous speech: a weather information retrieval spoken dialogue system and the academic lectures domain. We find that our learned lexicons outperform expert, hand-crafted lexicons in each domain.
Bibliographic reference. Badr, Ibrahim / McGraw, Ian / Glass, James (2011): "Pronunciation learning from continuous speech", In INTERSPEECH-2011, 549-552.