Third International Conference on Spoken Language Processing (ICSLP 94)

Yokohama, Japan
September 18-22, 1994

Characteristics of Multi-Layer Perceptron Models in Enhancing Degraded Speech

T. T. Le (1), J. S. Mason (1), T. Kitamura (2)

(1) Dept. of Electrical & Electronic Eng., University College of Swansea, Swansea, UK
(2) Dept. of Electrical & Computer Eng., Nagoya Institute of Technology, Nagoya, Japan

A multi-layer perceptron (MLP) acting directly in the time domain is applied as a speech signal enhancer, and its performance is examined for three common classes of degradation: non-linear system degradation (introduced by a low-bit-rate CELP coder), additive white Gaussian noise, and convolution by a linear system. The investigation focuses on two topics: (i) net topology, comparing single-output and multiple-output structures, and (ii) the influence of non-linearities within the net. Experimental results confirm the importance of matching the enhancer to the class of degradation. In the case of the CELP coder, the standard MLP with its inherently non-linear characteristics is consistently better than any equivalent linear structure. In contrast, when the degradation is additive noise, a linear enhancer is always superior. Interestingly, in both cases nets with multiple outputs give significantly better performance than single-output structures.
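
The two design axes named above (single versus multiple outputs, and non-linear versus linear units) can be illustrated with a minimal forward-pass sketch. It is not the authors' implementation: the names make_net, forward and enhance, the window of 9 input samples, the 6 hidden units, the tanh non-linearity, the untrained random weights, and the choice of advancing the window by the number of outputs are all assumptions made for illustration, since the abstract does not give these details.

```python
import math
import random

def make_net(n_in, n_hidden, n_out, seed=0):
    """Hypothetical one-hidden-layer net with small random weights (untrained)."""
    rng = random.Random(seed)
    w1 = [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_hidden)]
    w2 = [[rng.gauss(0.0, 0.1) for _ in range(n_hidden)] for _ in range(n_out)]
    return {"w1": w1, "b1": [0.0] * n_hidden, "w2": w2, "b2": [0.0] * n_out}

def forward(net, window, linear=False):
    """Map one window of degraded samples to n_out output samples.

    With linear=True the hidden non-linearity is removed, giving the
    'equivalent linear structure' used for comparison.
    """
    hidden = []
    for row, bias in zip(net["w1"], net["b1"]):
        a = sum(w * x for w, x in zip(row, window)) + bias
        hidden.append(a if linear else math.tanh(a))
    return [sum(w * h for w, h in zip(row, hidden)) + bias
            for row, bias in zip(net["w2"], net["b2"])]

def enhance(net, signal, linear=False):
    """Slide the net along a time-domain signal.

    Assumption: the window advances by n_out samples, so a single-output
    net moves one sample at a time and a multiple-output net moves in blocks.
    """
    n_in = len(net["w1"][0])
    n_out = len(net["w2"])
    out = []
    for start in range(0, len(signal) - n_in + 1, n_out):
        out.extend(forward(net, signal[start:start + n_in], linear))
    return out

# Toy degraded signal: a sinusoid plus additive Gaussian white noise.
rng = random.Random(1)
degraded = [math.sin(0.1 * n) + rng.gauss(0.0, 0.3) for n in range(200)]

single_out = make_net(9, 6, 1)   # one output sample per window
multi_out = make_net(9, 6, 3)    # three output samples per window
y_single = enhance(single_out, degraded)
y_multi = enhance(multi_out, degraded, linear=True)
```

In this sketch the weights are random, so the outputs are not enhanced speech; training the weights against clean reference samples is the step that would be needed before comparing the four configurations.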

