A transform coding method is proposed for coding the higher layers of a multilayer embedded speech and audio coding system using a factorial pulse codebook. The method applies frequency-selective attenuation to the lower-layer output to reduce the spurious noise generated when a speech-model-based coding method is used in the lower layers to code non-speech signals. This frequency-selective attenuation, combined with the factorial pulse codebook, makes the method suitable for coding non-speech-like signals. A classifier that decides whether a signal is speech-like or non-speech-like is also proposed. The proposed method is part of an ITU embedded speech/audio coding standard (ITU-T G.EV-VBR). Formal listening tests confirm the benefits of the proposed method for coding music signals and speech signals with background music.
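The abstract does not describe the codebook itself. As a rough illustration only, the sketch below assumes the usual description of a factorial pulse codebook: the set of all integer vectors of length n whose absolute values sum to m, that is, m signed unit pulses placed over n positions. Under that assumption the codebook size follows from a standard combinatorial count. The function names `fpc_codebook_size` and `fpc_bits` are hypothetical and are not taken from the paper or the standard.

```python
from math import comb


def fpc_codebook_size(n: int, m: int) -> int:
    """Count integer vectors of length n whose absolute values sum to m.

    Assumed model of a factorial pulse codebook: m unit pulses spread over
    n positions, each occupied position carrying a sign.
    """
    if m == 0:
        return 1  # only the all-zero vector
    total = 0
    for d in range(1, min(n, m) + 1):
        # d occupied positions: choose them, split m pulses among them
        # (each gets at least one), and pick a sign for each.
        total += comb(n, d) * comb(m - 1, d - 1) * (2 ** d)
    return total


def fpc_bits(n: int, m: int) -> int:
    """Smallest number of bits that can index every codeword."""
    return (fpc_codebook_size(n, m) - 1).bit_length()


if __name__ == "__main__":
    for n, m in [(2, 1), (2, 2), (3, 2)]:
        print(n, m, fpc_codebook_size(n, m), fpc_bits(n, m))
```

Counting the codewords this way shows how the bit cost grows with the vector length and the number of pulses; the actual parameters used in the codec are not given in the abstract.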
Bibliographic reference. Mittal, Udar / Ashley, James P. / Gibbs, Jonathan (2008): "Higher layer coding of non-speech like signals using factorial pulse codebook", In INTERSPEECH-2008, 671-674.