7th International Conference on Spoken Language Processing

September 16-20, 2002
Denver, Colorado, USA

Tree-Structured Maximum a Posteriori Adaptation for a Segment-Based Speech Recognition System

Irina Illina


This paper addresses the problem of adapting a speech recognition system to a new environment. Recently, Structural Maximum a Posteriori (SMAP) adaptation was developed for frame-based HMM model adaptation. In this method, the acoustic model pdfs are organised in a tree, and the means and variances of the pdfs are adapted using linear transformations estimated under the MAP criterion. In this paper, we extend SMAP adaptation to a segment-based model: the Mixture Stochastic Trajectory Model (MSTM). The SMAP approach is complemented by a tree construction driven by the adaptation data, a Minimum Description Length (MDL) definition of the tree structure, and trajectory and state adaptations. On the Resource Management task, speaker adaptation and noise adaptation experiments show that the proposed SMAP approach gives a significant improvement over the unadapted system.
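
As a rough illustration of the tree-structured MAP idea only, the sketch below propagates a MAP estimate down a tree, with each node using its parent's estimate as the prior mean. It is not the algorithm of this paper: it assumes a scalar mean offset per node instead of linear transformations, ignores variances, the MDL structure selection and the MSTM trajectories, and the names `Node`, `map_mean`, `adapt_tree` and the prior weight `tau` are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    """Tree node; `samples` holds offsets (observation minus unadapted mean)."""
    name: str
    samples: List[float] = field(default_factory=list)
    children: List["Node"] = field(default_factory=list)


def pooled_samples(node: Node) -> List[float]:
    """Collect the adaptation samples of a node and all of its descendants."""
    pooled = list(node.samples)
    for child in node.children:
        pooled.extend(pooled_samples(child))
    return pooled


def map_mean(prior_mean: float, tau: float, data: List[float]) -> float:
    """MAP estimate of a Gaussian mean with a conjugate prior of weight tau."""
    if not data:
        return prior_mean  # no data: fall back on the prior
    return (tau * prior_mean + sum(data)) / (tau + len(data))


def adapt_tree(node: Node, prior_mean: float = 0.0, tau: float = 10.0,
               result: Optional[Dict[str, float]] = None) -> Dict[str, float]:
    """Estimate an offset per node, passing each estimate down as the child prior."""
    if result is None:
        result = {}
    estimate = map_mean(prior_mean, tau, pooled_samples(node))
    result[node.name] = estimate
    for child in node.children:
        adapt_tree(child, estimate, tau, result)
    return result


# Hypothetical toy tree: leaf "a" has adaptation data, leaf "b" has none.
toy = Node("root", children=[Node("a", samples=[1.0, 1.0]), Node("b")])
offsets = adapt_tree(toy, prior_mean=0.0, tau=2.0)
```

In this toy setting, a leaf without adaptation data simply inherits the estimate of its parent, which is the behaviour that makes a tree-structured prior attractive when adaptation data are sparse.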

Bibliographic reference. Illina, Irina (2002): "Tree-structured maximum a posteriori adaptation for a segment-based speech recognition system", in ICSLP-2002, 1405-1408.