8th European Conference on Speech Communication and Technology

Geneva, Switzerland
September 1-4, 2003


Strategies for Automatic Multi-Tier Annotation of Spoken Language Corpora

Steven Greenberg

The Speech Institute, USA

Spoken corpora of the future will be annotated at multiple levels of linguistic organization largely through automatic methods using a combination of sophisticated signal processing, statistical classifiers and expert knowledge. It is important that annotation tools be adaptable to a wide range of languages and speaking styles, as well as readily accessible to the speech research and technology communities around the world. This latter objective is of particular importance for minority languages, which are less likely to foster development of sophisticated speech technology without such universal access.

Bibliographic reference. Greenberg, Steven (2003): "Strategies for automatic multi-tier annotation of spoken language corpora", in EUROSPEECH-2003, 45-48.