15th Annual Conference of the International Speech Communication Association

September 14-18, 2014

Pronunciation Learning for Named-Entities Through Crowd-Sourcing

Attapol T. Rutherford (1), Fuchun Peng (2), Françoise Beaufays (2)

(1) Brandeis University, USA
(2) Google, USA

Obtaining good pronunciations for named-entities is a challenge for automatic speech recognition because named-entities are diverse in nature and origin, and new entities appear every day. In this paper, we investigate the feasibility of learning named-entity pronunciations through crowd-sourcing. By collecting audio samples on Mechanical Turk from speakers who are not linguistic experts and learning from those samples, we can quickly derive pronunciations that are more accurate in speech recognition tests than manual pronunciations generated by linguistic experts. Compared to traditional approaches to generating pronunciations, this new approach proves to be cheap, fast, and quite accurate.

Bibliographic reference. Rutherford, Attapol T. / Peng, Fuchun / Beaufays, Françoise (2014): "Pronunciation learning for named-entities through crowd-sourcing", in INTERSPEECH-2014, 1448-1452.