ISCA Archive Interspeech 2009

A study on soft margin estimation of linear regression parameters for speaker adaptation

Shigeki Matsuda, Yu Tsao, Jinyu Li, Satoshi Nakamura, Chin-Hui Lee

We formulate a framework for soft margin estimation-based linear regression (SMELR) and apply it to supervised speaker adaptation. Enhanced separation capability and increased discriminative ability are two key properties of margin-based discriminative training. So that the adaptation process can make flexible use of any amount of data, we also propose a novel interpolation scheme that linearly combines the speaker-independent (SI) and speaker-adaptive SMELR (SMELR/SA) models. The two proposed SMELR algorithms were evaluated on a Japanese large-vocabulary continuous speech recognition task. Both the SMELR and the interpolated SI+SMELR/SA techniques showed improved speech adaptation performance in comparison with the well-known maximum likelihood linear regression (MLLR) method. We also found that the interpolation framework is even more effective than SMELR alone when the amount of adaptation data is relatively small.


doi: 10.21437/Interspeech.2009-208

Cite as: Matsuda, S., Tsao, Y., Li, J., Nakamura, S., Lee, C.-H. (2009) A study on soft margin estimation of linear regression parameters for speaker adaptation. Proc. Interspeech 2009, 1603-1606, doi: 10.21437/Interspeech.2009-208

@inproceedings{matsuda09_interspeech,
  author={Shigeki Matsuda and Yu Tsao and Jinyu Li and Satoshi Nakamura and Chin-Hui Lee},
  title={{A study on soft margin estimation of linear regression parameters for speaker adaptation}},
  year=2009,
  booktitle={Proc. Interspeech 2009},
  pages={1603--1606},
  doi={10.21437/Interspeech.2009-208}
}