International Workshop on Spoken Language Translation (IWSLT) 2004
Keihanna Science City, Kyoto, Japan
In this paper we propose a new method for detecting and translating named entities (NEs) in spoken language, e.g., Chinese broadcast news. The approach detects possible NE regions in less reliably recognized hypotheses using confidence measures. Each possible NE boundary within such a region is compared with candidate NEs from retrieved documents, based on their acoustic similarities and semantic correlations. These candidate NEs are re-ranked by additionally incorporating general and topic-specific language models to measure NE context consistency. Combined with HMM-based NE extraction on confidently recognized words, this approach improves the NE extraction F-score from 66% to 71% and NE translation quality from 69% to 73% over the baseline method. Systematic comparisons of NE translation quality across different levels of speech input quality are also presented.
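The region-detection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each recognized word carries a confidence score in [0, 1], and the function names, the 0.5 threshold, and the maximum span length of 4 words are all hypothetical choices made for illustration. The sketch only finds low-confidence regions and enumerates the possible NE boundaries inside them; the acoustic, semantic, and language-model scoring is not shown.

```python
from typing import List, Tuple


def low_confidence_regions(
    confidences: List[float], threshold: float = 0.5
) -> List[Tuple[int, int]]:
    """Return (start, end) pairs, end exclusive, for each maximal run of
    words whose confidence falls below the (assumed) threshold."""
    regions = []
    start = None
    for i, conf in enumerate(confidences):
        if conf < threshold:
            if start is None:
                start = i  # a new low-confidence run begins here
        elif start is not None:
            regions.append((start, i))  # the run ended at the previous word
            start = None
    if start is not None:
        regions.append((start, len(confidences)))  # run reaches the end
    return regions


def candidate_spans(
    region: Tuple[int, int], max_len: int = 4
) -> List[Tuple[int, int]]:
    """Enumerate every possible NE boundary (start, end) inside a region,
    limited to an assumed maximum of max_len words per span."""
    start, end = region
    return [
        (i, j)
        for i in range(start, end)
        for j in range(i + 1, min(end, i + max_len) + 1)
    ]


# Hypothetical per-word confidences for a five-word hypothesis.
scores = [0.9, 0.3, 0.2, 0.8, 0.4]
regions = low_confidence_regions(scores)
spans = [candidate_spans(r) for r in regions]
```

In a full system, each enumerated span would then be scored against candidate NEs from the retrieved documents and re-ranked, as the abstract describes.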
Bibliographic reference. Huang, Fei / Vogel, Stephan / Waibel, Alex (2004): "Towards named entity extraction and translation in spoken language translation", In IWSLT-2004, 131-137.