12th Annual Conference of the International Speech Communication Association

Florence, Italy
August 27-31, 2011

Memory-Based Approximation of the Gaussian Mixture Model Framework for Bandwidth Extension of Narrowband Speech

Amr H. Nour-Eldin, Peter Kabal

McGill University, Canada

In this paper, we extend our previous work on exploiting the temporal properties of speech to improve Bandwidth Extension (BWE) of narrowband speech using Gaussian Mixture Models (GMMs). By quantifying temporal properties through information-theoretic measures and using delta features, we have shown that narrowband memory significantly increases certainty about highband parameters. However, because delta features are non-invertible, they cannot be used directly to reconstruct highband frequency content. In the work presented herein, we embed temporal properties indirectly into the GMM structure through a memory-dependent, tree-based approach that extends the representation of the narrowband. In particular, sequences of past frames are progressively used to grow the GMM in a tree-like fashion. This growth approach yields reliable estimates of the GMM parameters, such that Maximum Likelihood estimation is no longer necessary, thus circumventing the complexity that accompanies high-dimensional GMM training.
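
The non-invertibility of delta features mentioned above can be illustrated with a minimal sketch. It assumes the common regression-based delta formula with edge frames repeated at the boundaries; the abstract does not state which delta definition is used, so the function `delta`, the window `width`, and the toy parameter tracks are all hypothetical. Two static tracks that differ only by a constant offset produce identical deltas, so the static values cannot be recovered from the deltas alone.

```python
def delta(frames, width=2):
    """Regression-based delta features for a list of per-frame vectors.

    Assumed formula (not taken from the paper):
        d_t = sum_{n=1..width} n * (c_{t+n} - c_{t-n}) / (2 * sum_{n=1..width} n^2)
    Frames beyond either end are replaced by the first or last frame.
    """
    denom = 2 * sum(n * n for n in range(1, width + 1))
    last = len(frames) - 1
    out = []
    for t in range(len(frames)):
        vec = []
        for k in range(len(frames[t])):
            acc = 0.0
            for n in range(1, width + 1):
                ahead = frames[min(t + n, last)][k]   # clamp at the last frame
                behind = frames[max(t - n, 0)][k]     # clamp at the first frame
                acc += n * (ahead - behind)
            vec.append(acc / denom)
        out.append(vec)
    return out


# Two hypothetical one-dimensional parameter tracks that differ by a constant.
track_a = [[0.0], [1.0], [3.0], [6.0], [10.0]]
track_b = [[value + 5.0 for value in frame] for frame in track_a]

delta_a = delta(track_a)
delta_b = delta(track_b)

# The static tracks differ, yet their delta features coincide.
same_deltas = all(
    abs(x - y) < 1e-12
    for frame_a, frame_b in zip(delta_a, delta_b)
    for x, y in zip(frame_a, frame_b)
)
print("static tracks equal:", track_a == track_b)
print("delta tracks equal:", same_deltas)
```

Because the mapping from static parameters to deltas discards this kind of information, deltas can sharpen a statistical model of the highband but cannot by themselves be converted back into highband spectral content, which is the motivation for embedding memory in the GMM structure instead.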

