Obtaining good pronunciations for named entities is a challenge for automatic speech recognition because named entities are diverse in nature and origin, and new entities emerge every day. In this paper, we investigate the feasibility of learning named-entity pronunciations through crowd-sourcing. By collecting audio samples from speakers without linguistic expertise via Mechanical Turk and learning from them, we can quickly derive pronunciations that are more accurate in speech recognition tests than manual pronunciations generated by linguistic experts. Compared to traditional approaches to generating pronunciations, this new approach proves to be cheap, fast, and quite accurate.
Bibliographic reference. Rutherford, Attapol T. / Peng, Fuchun / Beaufays, Françoise (2014): "Pronunciation learning for named-entities through crowd-sourcing", In INTERSPEECH-2014, 1448-1452.