15th Annual Conference of the International Speech Communication Association

September 14-18, 2014

Speaker Adaptation Based on Sparse and Low-Rank Eigenphone Matrix Estimation

Wen-Lin Zhang (1), Dan Qu (1), Wei-Qiang Zhang (2), Bi-Cheng Li (1)

(1) Zhengzhou Information Science & Technology Institute, China
(2) Tsinghua University, China

Eigenphone-based speaker adaptation outperforms conventional MLLR and eigenvoice methods when sufficient adaptation data is available, but it suffers from severe over-fitting when adaptation data is limited. In this paper, l1 and nuclear norm regularization are applied simultaneously to obtain a more robust eigenphone estimate, resulting in a sparse and low-rank eigenphone matrix. The sparsity constraint reduces the number of free parameters, while the low-rank constraint limits the dimension of the phone variation subspace; both improve generalization ability. Experimental results show that the proposed method improves adaptation performance substantially, especially when the amount of adaptation data is limited.
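As a rough illustration of how the two regularizers act on a matrix, the sketch below implements the standard proximal operators of the l1 norm (entrywise soft-thresholding) and the nuclear norm (singular value thresholding) in Python with NumPy. The abstract does not state which optimization algorithm the paper uses, so this is not the authors' method: the function names, the threshold `tau`, and the toy matrices are assumptions made only for illustration.

```python
import numpy as np


def soft_threshold(X, tau):
    """Proximal operator of tau * ||X||_1 (assumed helper name).

    Shrinks every entry toward zero by tau; entries with magnitude
    below tau become exactly zero, which is what produces sparsity.
    """
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)


def singular_value_threshold(X, tau):
    """Proximal operator of tau * ||X||_* (assumed helper name).

    Shrinks every singular value toward zero by tau; singular values
    below tau vanish, which lowers the rank of the result.
    """
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)
    # Scale the columns of U by the shrunken singular values, then recombine.
    return (U * s_shrunk) @ Vt


# Toy inputs (hypothetical, not eigenphone matrices from the paper).
dense = np.array([[1.5, -0.2],
                  [0.3, -2.0]])
sparse = soft_threshold(dense, tau=0.5)

full_rank = np.diag([3.0, 1.0, 0.2])
low_rank = singular_value_threshold(full_rank, tau=0.5)
```

In this toy example the two small entries of `dense` are zeroed out, and the smallest singular value of `full_rank` is removed, so the rank drops from 3 to 2. Applying both penalties to the same matrix, as the abstract describes, would encourage both effects at once.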


Bibliographic reference.  Zhang, Wen-Lin / Qu, Dan / Zhang, Wei-Qiang / Li, Bi-Cheng (2014): "Speaker adaptation based on sparse and low-rank eigenphone matrix estimation", In INTERSPEECH-2014, 2972-2976.