This paper proposes Universal Background Model (UBM) fusion in the framework of total variability, or i-vector, modeling, with application to language identification (LID). The total variability subspace, which is typically exploited to discriminate between the language classes of different speech recordings, is trained by combining the normalized Baum-Welch statistics of multiple UBMs. When the UBMs model a diverse set of feature representations, the method yields an i-vector representation that is more discriminative between the classes of interest. The approach is particularly useful for short-duration utterances, and it boosts performance at a lower computational cost than system-level fusion. We assess the performance of UBM-fused total variability modeling on the task of robust language identification from short-duration utterances, as part of Phase III of the DARPA RATS (Robust Automatic Transcription of Speech) program.
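The core operation described above — collecting zeroth- and first-order Baum-Welch statistics under each UBM, normalizing them, and stacking them into a single fused representation — can be sketched as follows. This is a minimal illustration with hypothetical toy diagonal-covariance UBMs and random features, not the authors' implementation; the function names and shapes are assumptions for the sketch.

```python
import numpy as np

def baum_welch_stats(X, weights, means, covs):
    """Zeroth- and first-order Baum-Welch statistics of feature frames X
    under a diagonal-covariance GMM (the UBM)."""
    C, D = means.shape
    # log p(x_t | c) + log w_c for each frame t and component c
    log_prob = np.empty((X.shape[0], C))
    for c in range(C):
        diff = X - means[c]
        log_prob[:, c] = (np.log(weights[c])
                          - 0.5 * np.sum(np.log(2 * np.pi * covs[c]))
                          - 0.5 * np.sum(diff ** 2 / covs[c], axis=1))
    # posterior responsibilities gamma[t, c] via a stable softmax
    log_prob -= log_prob.max(axis=1, keepdims=True)
    gamma = np.exp(log_prob)
    gamma /= gamma.sum(axis=1, keepdims=True)
    N = gamma.sum(axis=0)          # zeroth-order stats, shape (C,)
    F = gamma.T @ X                # first-order stats, shape (C, D)
    # centre around the UBM means and whiten by the UBM variances,
    # as is standard before total variability (i-vector) training
    F_norm = (F - N[:, None] * means) / np.sqrt(covs)
    return N, F_norm

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))      # toy feature frames (200 frames, 4 dims)

# two hypothetical UBMs, standing in for models of different feature streams
ubms = []
for seed in (1, 2):
    r = np.random.default_rng(seed)
    C = 3
    ubms.append((np.full(C, 1.0 / C),     # mixture weights
                 r.normal(size=(C, 4)),   # component means
                 np.ones((C, 4))))        # diagonal covariances

# fusion: stack the normalized statistics from all UBMs into one supervector,
# which would then feed a single total variability subspace estimation
stats = [baum_welch_stats(X, w, m, v) for (w, m, v) in ubms]
fused_F = np.concatenate([F.ravel() for (_, F) in stats])
print(fused_F.shape)               # (2 UBMs * 3 comps * 4 dims,) = (24,)
```

In a real system each UBM would be trained on its own feature representation of the same recording, and the fused statistics replace the single-UBM statistics in the usual i-vector extractor training.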
Bibliographic reference. Segbroeck, Maarten van / Travadi, Ruchir / Narayanan, Shrikanth S. (2014): "UBM fused total variability modeling for language identification", In INTERSPEECH-2014, 3027-3031.