INTERSPEECH 2006 - ICSLP
Speech recognition applications are known to require a significant amount of resources (training data, memory, computing power). However, the target context of this work, a speech recognition system embedded in a mobile phone, allows only a few KB of memory, a few MIPS, and usually a small amount of training data.
To fit these resource constraints, this paper proposes an approach based on a semi-continuous HMM system that uses GMM-based, state-independent acoustic modeling. A transformation is computed and applied to the global GMM to obtain each of the HMM state-dependent probability density functions. With this strategy, only the parameters of the transformation function need to be stored for each state, and the computing power needed for the likelihood computation can be reduced.
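The idea of sharing one global GMM across all states can be illustrated with a minimal sketch. The abstract does not specify the form of the transformation, so the code below assumes the simplest case, in which each state stores only its own vector of mixture weights over the shared Gaussians (the classic semi-continuous HMM setup). This is an assumption made for illustration, not the authors' method, and all function and variable names are hypothetical.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a non-empty list."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))


def state_log_likelihoods(x, means, variances, state_weights):
    """Log-likelihood of frame x under every HMM state.

    Assumption: the per-state "transformation" is a reweighting of the
    shared Gaussians, so a state is fully described by its weight vector.
    """
    # The shared Gaussians are evaluated once per frame, for all states.
    comp = [log_gauss_diag(x, m, v) for m, v in zip(means, variances)]
    # Each state then costs only one weighted sum over those values.
    return [
        logsumexp([math.log(w) + c for w, c in zip(weights, comp) if w > 0.0])
        for weights in state_weights
    ]


# Toy global GMM: two 1-D Gaussians shared by two states.
means = [[0.0], [4.0]]
variances = [[1.0], [1.0]]
state_weights = [[0.9, 0.1], [0.1, 0.9]]
scores = state_log_likelihoods([0.0], means, variances, state_weights)
```

Under this assumption, the means and variances are stored once, and each additional state adds only one weight per Gaussian. The Gaussian densities, which dominate the cost, are computed once per frame regardless of the number of states.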
The proposed approach is evaluated on two tasks: a digit recognition task using the French corpus BDSON, where it achieves a Digit Error Rate of 2.5%, and a voice command task using the French corpus VODIS, where the Command Error Rate is around 4.1%.
Bibliographic reference. Lévy, Christophe / Linarès, Georges / Bonastre, Jean-François (2006): "GMM-based acoustic modeling for embedded speech recognition", In INTERSPEECH-2006, paper 1255-Wed2A2O.3.