We report on a speech recognition system that uses inter-word context-dependent units (CDP) to model pronunciation variations at word boundaries. Word juncture coarticulation phenomena are a major source of acoustic variability in the initial and final parts of a word spoken in fluent speech. In some cases, the alteration of a phone due to neighboring phones is comparatively small, and the actual realization is perceived as a variation of the original phone. To cope with this type of variation, we design a set of inter-word CDP and modify the pronunciation of a word by replacing the word-beginning and word-ending phones with all possible inter-word units. By properly connecting all possible word beginnings and word endings, word boundaries are better represented. By taking word juncture pronunciation changes into account, better models can be obtained in training. Such models have achieved better results in recognition.
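The boundary expansion described above can be illustrated with a minimal sketch. It assumes triphone-style units written as (left, phone, right) triples, which may differ from the unit inventory actually used in the paper; the function names, the toy lexicon, and the `sil` silence symbol are all hypothetical and chosen only for illustration.

```python
from itertools import product

SILENCE = "sil"  # hypothetical symbol for a pause or utterance edge


def boundary_contexts(lexicon):
    """Collect the phones that can appear across a word boundary.

    Left contexts are the final phones of all words (what may precede a
    word); right contexts are the initial phones of all words (what may
    follow a word). Silence is added to both sets.
    """
    left = {phones[-1] for phones in lexicon.values()} | {SILENCE}
    right = {phones[0] for phones in lexicon.values()} | {SILENCE}
    return left, right


def expand_word(phones, left_contexts, right_contexts):
    """Return every boundary-dependent pronunciation of one word.

    Interior phones keep their in-word neighbours. The first phone is
    paired with every possible left context and the last phone with
    every possible right context.
    """
    if not phones:
        return []
    variants = []
    for lc, rc in product(sorted(left_contexts), sorted(right_contexts)):
        padded = [lc] + list(phones) + [rc]
        units = [
            (padded[i - 1], padded[i], padded[i + 1])
            for i in range(1, len(padded) - 1)
        ]
        variants.append(units)
    return variants


# Toy lexicon (hypothetical pronunciations, not taken from the paper).
toy_lexicon = {"two": ["t", "uw"], "six": ["s", "ih", "k", "s"]}
left, right = boundary_contexts(toy_lexicon)
two_variants = expand_word(toy_lexicon["two"], left, right)
print(len(two_variants))  # 3 left contexts x 3 right contexts = 9
```

In this sketch only the first and last units of a word change across variants; the interior units are identical in every variant, which reflects the idea that only the word-beginning and word-ending phones are replaced by inter-word units.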
Bibliographic reference. Giachin, Egidio P. / Lee, Chin-Hui / Rabiner, Lawrence R. / Rosenberg, Aaron E. / Pieraccini, Roberto (1991): "Word juncture modeling using inter-word context-dependent phone-like units", In EUROSPEECH-1991, 1393-1396.