ISCA Archive SLTU 2012

Incorporating MLP features in the unsupervised training process

Thiago Fraga-Silva, Viet-Bac Le, Lori Lamel, Jean-Luc Gauvain

The combined use of multilayer perceptron (MLP) and perceptual linear prediction (PLP) features has been reported to improve the performance of automatic speech recognition systems for many different languages and domains. However, MLP features have not yet been used in unsupervised acoustic model training. This paper introduces that approach, with encouraging results. In addition, unsupervised language model training was investigated for a Portuguese broadcast speech recognition task, leading to a slight improvement in performance. The joint use of the unsupervised techniques presented here leads to an absolute word error rate (WER) reduction of up to 3.2% over a baseline unsupervised system.
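Because the abstract reports an absolute WER reduction, a minimal sketch of how WER and the absolute versus relative reduction are usually computed may help. It assumes the standard word-level edit-distance definition of WER. The function names and the example values are hypothetical illustrations, not taken from the paper or its scoring setup.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER in percent: (substitutions + deletions + insertions) / reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # Row-by-row Levenshtein distance over words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return 100.0 * prev[-1] / len(ref)


def absolute_reduction(baseline_wer: float, new_wer: float) -> float:
    """Difference in WER percentage points (the kind of figure the abstract reports)."""
    return baseline_wer - new_wer


def relative_reduction(baseline_wer: float, new_wer: float) -> float:
    """Reduction expressed as a percentage of the baseline WER."""
    return 100.0 * (baseline_wer - new_wer) / baseline_wer


if __name__ == "__main__":
    # Hypothetical values for illustration only; they are not results from the paper.
    print(word_error_rate("a b c d", "a x c"))
    print(absolute_reduction(20.0, 18.0), relative_reduction(20.0, 18.0))
```

An absolute reduction is stated in percentage points of WER, so the same absolute gain corresponds to a larger relative gain when the baseline WER is lower.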

Index Terms: Unsupervised Training, MLP features, Acoustic Modeling, Language Modeling


Cite as: Fraga-Silva, T., Le, V.-B., Lamel, L., Gauvain, J.-L. (2012) Incorporating MLP features in the unsupervised training process. Proc. 3rd Workshop on Spoken Language Technologies for Under-Resourced Languages (SLTU 2012), 24-28

@inproceedings{fragasilva12_sltu,
  author={Thiago Fraga-Silva and Viet-Bac Le and Lori Lamel and Jean-Luc Gauvain},
  title={{Incorporating MLP features in the unsupervised training process}},
  year=2012,
  booktitle={Proc. 3rd Workshop on Spoken Language Technologies for Under-Resourced Languages (SLTU 2012)},
  pages={24--28}
}