Wide Residual Networks 1D for Automatic Text Punctuation

Jorge Llombart, Antonio Miguel, Alfonso Ortega, Eduardo Lleida


Documentation and analysis of multimedia resources usually require a long pipeline with many stages. Text without punctuation is commonly produced at some point in this pipeline, although later steps, such as those related to natural language processing, may need accurate punctuation. This paper focuses on the task of recovering pause punctuation from text without prosodic or acoustic information. We propose the use of Wide Residual Networks to predict which words should have a comma or stop in a text whose punctuation has been removed. Wide Residual Networks are a well-known technique in image processing, but they are not commonly used in other areas such as speech or natural language processing. We propose them because they show great stability and the ability to work with both long and short contextual dependencies in deep structures. Unlike in image processing, we use 1-dimensional convolutions, because in text processing we focus only on the temporal dimension. Moreover, this architecture allows us to work with both past and future context. This paper compares this architecture with Long Short-Term Memory cells, which are commonly used for this task, and also combines the two architectures to obtain better results than either of them separately.
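The idea of a residual block built from 1-dimensional convolutions over the word sequence can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes plain Python lists instead of a deep learning library, omits batch normalization and the channel widening that gives Wide Residual Networks their name, and the names `conv1d_same`, `relu` and `residual_block` are hypothetical. It only shows how a centered kernel gives each position access to both past and future context, and how the skip connection adds the block input back to its output.

```python
from typing import List

Vector = List[float]          # features of one word
Sequence = List[Vector]       # T words, each with C features
Kernel = List[List[Vector]]   # K taps, each a C_out x C_in weight matrix


def conv1d_same(x: Sequence, kernel: Kernel, bias: Vector) -> Sequence:
    """1D convolution along the time axis with zero padding (output length = input length)."""
    k = len(kernel)           # kernel width, assumed odd so the kernel is centered
    half = k // 2
    out: Sequence = []
    for t in range(len(x)):
        y = list(bias)
        for j in range(k):
            s = t + j - half  # j < half reads past words, j > half reads future words
            if 0 <= s < len(x):  # positions outside the sequence count as zeros
                for o in range(len(bias)):
                    y[o] += sum(w * v for w, v in zip(kernel[j][o], x[s]))
        out.append(y)
    return out


def relu(x: Sequence) -> Sequence:
    """Element-wise rectified linear unit."""
    return [[max(0.0, v) for v in row] for row in x]


def residual_block(x: Sequence, k1: Kernel, b1: Vector,
                   k2: Kernel, b2: Vector) -> Sequence:
    """Two ReLU + convolution stages plus an identity skip connection.

    Assumes the number of channels is unchanged, so x and the branch
    output can be added element-wise.
    """
    h = conv1d_same(relu(x), k1, b1)
    h = conv1d_same(relu(h), k2, b2)
    return [[a + b for a, b in zip(xr, hr)] for xr, hr in zip(x, h)]
```

With a kernel of width 3, each convolution sees one word on either side of the current position; stacking blocks enlarges this context window in both directions.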


DOI: 10.21437/IberSPEECH.2018-62

Cite as: Llombart, J., Miguel, A., Ortega, A., Lleida, E. (2018) Wide Residual Networks 1D for Automatic Text Punctuation. Proc. IberSPEECH 2018, 296-300, DOI: 10.21437/IberSPEECH.2018-62.


@inproceedings{Llombart2018,
  author={Jorge Llombart and Antonio Miguel and Alfonso Ortega and Eduardo Lleida},
  title={{Wide Residual Networks 1D for Automatic Text Punctuation}},
  year=2018,
  booktitle={Proc. IberSPEECH 2018},
  pages={296--300},
  doi={10.21437/IberSPEECH.2018-62},
  url={http://dx.doi.org/10.21437/IberSPEECH.2018-62}
}