9th Annual Conference of the International Speech Communication Association

Brisbane, Australia
September 22-26, 2008

Higher Layer Coding of Non-Speech Like Signals Using Factorial Pulse Codebook

Udar Mittal, James P. Ashley, Jonathan Gibbs

Motorola, USA

A transform coding method is proposed for coding the higher layers of a multilayer embedded speech and audio coding system using a factorial pulse codebook. The proposed method applies frequency-selective attenuation to the lower-layer output to reduce the spurious noise generated when a speech-model-based coding method is used in the lower layers to code non-speech signals. The frequency-selective attenuation, together with the factorial pulse codebook, makes the method suitable for coding non-speech-like signals. A classifier that decides whether a signal is speech-like or non-speech-like is also proposed. The proposed method is part of an ITU embedded speech/audio coding standard (ITU-T G.EV-VBR). Formal listening tests confirm the benefits of the proposed method for coding music signals and speech signals with background music.

