11th Annual Conference of the International Speech Communication Association

Makuhari, Chiba, Japan
September 26-30, 2010

Extending the Punctuation Module for European Portuguese

Fernando Batista (1), Helena Moniz (1), Isabel Trancoso (1), Hugo Meinedo (1), Ana Isabel Mata (2), Nuno Mamede (1)

(1) INESC-ID Lisboa, Portugal
(2) FLUL/CLUL, Portugal

This paper describes our recent work on extending the punctuation module used in the automatic subtitling of Portuguese Broadcast News. The main improvement was achieved through the use of prosodic information, which made it possible to extend the previous module, limited to full stops and commas, to cover question marks as well. The approach combines lexical, acoustic and prosodic information, and our results show that prosodic information is relevant for all types of punctuation. An analysis of the results also shows which types of interrogative are better handled by our method, taking into account the specificities of Portuguese. Results may therefore differ across corpora, depending on which types of interrogatives are most frequent.

Bibliographic reference. Batista, Fernando / Moniz, Helena / Trancoso, Isabel / Meinedo, Hugo / Mata, Ana Isabel / Mamede, Nuno (2010): "Extending the punctuation module for European Portuguese", in INTERSPEECH-2010, 1509-1512.