Interspeech'2005 - Eurospeech

Lisbon, Portugal
September 4-8, 2005

Structural Metadata Annotation: Moving Beyond English

Stephanie Strassel (1), Jáchym Kolár (2), Zhiyi Song (1), Leila Barclay (1), Meghan Glenn (1)

(1) University of Pennsylvania, USA; (2) University of West Bohemia in Pilsen, Czech Republic

The goal of metadata extraction (MDE) is to enable technology that can take raw speech-to-text output and refine it into forms that are more useful to humans and to downstream automatic processes. Starting in 2003, a structural metadata annotation task was defined for English as part of the DARPA EARS Program. A significant new challenge for MDE is the addition of new languages. This paper reports on work undertaken to apply MDE annotation to data from three very different languages: Mandarin Chinese, Levantine Arabic, and conversational Czech. Details of annotation task modifications are provided for each language, along with a general overview of data and annotation tools for non-English MDE.

Bibliographic reference. Strassel, Stephanie / Kolár, Jáchym / Song, Zhiyi / Barclay, Leila / Glenn, Meghan (2005): "Structural metadata annotation: moving beyond English", in INTERSPEECH-2005, 1545-1548.