Neural networks have recently been shown to be a very effective approach to the unconstrained segmentation of speech into phoneme-like units. The neural network is trained to indicate when a short local sequence of feature vectors is associated with a segment boundary, and when it is not. Although this approach delivers state-of-the-art performance, it is prone to oversegmentation at ambiguous segment boundaries. To address this, we propose the incorporation of the neural network segmenter into a dynamic programming (DP) framework. We evaluate the DP-based approach on the TIMIT corpus, and show that it leads to improved performance.
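The abstract does not give the DP formulation, so the following is only a minimal sketch of how per-frame boundary scores might be combined in a dynamic programming search to suppress clustered detections. It assumes a hypothetical classifier has already produced a boundary probability for each frame. The function name `dp_segment`, the minimum-gap constraint `min_gap`, and the per-boundary `penalty` are illustrative assumptions, not the authors' method.

```python
import math


def dp_segment(boundary_probs, min_gap=3, penalty=0.0):
    """Select boundary frames that maximise the summed log-odds of the
    (assumed) per-frame boundary probabilities, subject to consecutive
    boundaries being at least `min_gap` frames apart."""
    n = len(boundary_probs)
    eps = 1e-9  # guards against log(0) and division by zero

    # Gain for declaring a boundary at frame t: log-odds minus a fixed penalty.
    gain = [math.log((p + eps) / (1.0 - p + eps)) - penalty
            for p in boundary_probs]

    best = [0.0] * n   # best[t]: best total score with the last boundary at t
    prev = [-1] * n    # back-pointer to the preceding boundary (-1 = none)

    # Running maximum of best[s] over all s <= t - min_gap; 0.0 means
    # "no earlier boundary".
    run_best, run_arg = 0.0, -1
    for t in range(n):
        s = t - min_gap
        if s >= 0 and best[s] > run_best:
            run_best, run_arg = best[s], s
        best[t] = gain[t] + run_best
        prev[t] = run_arg

    # Pick the best final boundary; return nothing if no selection has a
    # positive total score.
    end = max(range(n), key=lambda t: best[t], default=-1)
    if end == -1 or best[end] <= 0.0:
        return []

    # Follow back-pointers to recover the selected boundaries in order.
    boundaries = []
    while end != -1:
        boundaries.append(end)
        end = prev[end]
    return boundaries[::-1]


if __name__ == "__main__":
    probs = [0.1, 0.9, 0.8, 0.1, 0.1, 0.7, 0.95, 0.2]
    print(dp_segment(probs, min_gap=3))  # -> [1, 6]
```

On this toy input, thresholding each frame independently at 0.5 would mark frames 1, 2, 5 and 6, whereas the DP search keeps one boundary from each cluster. This illustrates the kind of oversegmentation at ambiguous boundaries that a global search can reduce.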
Bibliographic reference. Vuuren, Van Zyl van / Bosch, Louis ten / Niesler, Thomas (2013): "A dynamic programming framework for neural network-based automatic speech segmentation", In INTERSPEECH-2013, 2287-2291.