Speech and Language Technology in Education (SLaTE 2013)
This paper addresses the manual transcription of non-native speech, with the aim of establishing rule-based strategies for labelling intermediate realizations. It considers the problems of transcribing non-canonical realizations of L2 sounds that share features of both the target language (Spanish) and the source language (Japanese). We introduce a Japanese-accented non-native L2 Spanish corpus and illustrate the use of decision trees in manual transcription as a systematic method for dealing with ambiguous realizations. This approach could help a potential error detection system to detect both canonical and erroneous realizations, contributing to the development of computer-assisted pronunciation training (CAPT) tools.
Index Terms: non-native speech transcription, ASR, CAPT, L2 Spanish, L1 Japanese, non-native spoken corpus
Bibliographic reference. Carranza, Mario (2013): "Intermediate phonetic realizations in a Japanese accented L2 Spanish corpus", In SLaTE-2013, 168-171.