11th Annual Conference of the International Speech Communication Association

Makuhari, Chiba, Japan
September 26-30, 2010

Exploiting Variety-Dependent Phones in Portuguese Variety Identification Applied to Broadcast News Transcription

Oscar Koller (1), Alberto Abad (1), Isabel Trancoso (1), Céu Viana (2)

(1) INESC-ID Lisboa, Portugal
(2) CLUL, Portugal

This paper presents a Variety IDentification (VID) approach and its application to broadcast news transcription for Portuguese. The phonotactic VID system, based on Phone Recognition and Language Modelling, focuses on a single tokenizer that combines distinctive knowledge about differences between the target varieties. This knowledge is introduced into a Multi-Layer Perceptron phone recognizer by training mono-phone models for two varieties as contrasting phone-like classes. Significant improvements in terms of identification rate were achieved compared to conventional single and fused phonotactic and acoustic systems. The VID system is used to select data to automatically train variety-specific acoustic models for broadcast news transcription. The impact of the selection is analyzed and variety-specific recognition is shown to improve results by up to 13% compared to a standard variety baseline.
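The Phone Recognition and Language Modelling idea mentioned in the abstract can be illustrated with a minimal sketch: one tokenizer produces a phone sequence, a separate phone n-gram model is trained per variety, and the variety whose model gives the highest likelihood is chosen. Everything below is an assumption made for illustration and is not taken from the paper: the function names, the add-one smoothed bigram model, the variety labels and the toy phone strings are all hypothetical, and the Multi-Layer Perceptron phone recognizer is not modelled at all.

```python
import math
from collections import Counter


def train_bigram_lm(sequences, vocab):
    """Return a scorer for an add-one smoothed bigram model over phone tokens."""
    bigrams, contexts = Counter(), Counter()
    for seq in sequences:
        padded = ["<s>"] + list(seq)
        for prev, cur in zip(padded, padded[1:]):
            bigrams[(prev, cur)] += 1
            contexts[prev] += 1
    size = len(vocab)

    def log_prob(seq):
        # Sum of smoothed bigram log-probabilities of the sequence.
        padded = ["<s>"] + list(seq)
        return sum(
            math.log((bigrams[(p, c)] + 1) / (contexts[p] + size))
            for p, c in zip(padded, padded[1:])
        )

    return log_prob


def identify_variety(phones, models):
    """Score one phone sequence under every variety model and pick the best."""
    scores = {name: lm(phones) for name, lm in models.items()}
    return max(scores, key=scores.get), scores


# Hypothetical tokenizer output: made-up phone strings, not real data.
train = {
    "variety_A": [["a", "S", "t", "u"], ["p", "a", "S", "t"]],
    "variety_B": [["a", "s", "t", "u"], ["p", "a", "s", "t"]],
}
vocab = {p for seqs in train.values() for s in seqs for p in s}
models = {name: train_bigram_lm(seqs, vocab) for name, seqs in train.items()}

best, scores = identify_variety(["a", "S", "t"], models)
print(best, scores)
```

In this toy setup the two varieties differ only in one phone symbol, which loosely mirrors the idea of a shared tokenizer whose phone inventory contains contrasting, variety-dependent classes.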

Bibliographic reference. Koller, Oscar / Abad, Alberto / Trancoso, Isabel / Viana, Céu (2010): "Exploiting variety-dependent phones in Portuguese variety identification applied to broadcast news transcription", in INTERSPEECH-2010, 749-752.