INTERSPEECH 2004 - ICSLP
This paper introduces a novel speech coder structure for storage applications operating at low bit rates. The coder exploits the inherent segmental nature of speech signals by dividing the input into segments of variable length; quite often, a segment has the same length as a phoneme. The individual segments are coded using adaptive techniques that take into account the relative perceptual importance of different types of speech, e.g. voiced and unvoiced speech. These main features of the proposed approach are possible because many of the design constraints related to real-time conversational speech can be relaxed in storage applications. A practical implementation containing the speech-adaptive segmentation is described, and its performance is verified in a listening test at average bit rates of about 1.0 kbps and 2.4 kbps. The results show that the segmental model significantly improves the coding efficiency.
Bibliographic reference. Ramo, Anssi / Nurminen, Jani / Himanen, Sakari / Heikkinen, Ari (2004): "Segmental speech coding model for storage applications", In INTERSPEECH-2004, 2677-2680.