1st Joint SIG-IL/Microsoft Workshop on Speech and Language Technologies for Iberian Languages
Porto Salvo, Portugal
This work briefly discusses the construction, in a relational database system, of the Orthographic and Phonetic Information Databases of the Portuguese Language Spoken in the State of São Paulo (São Paulo City, Campinas, Itu). Computational resources were used to store, process, and analyze authentic spoken language. The databases contain orthographic and phonetic information about Portuguese as spoken in those areas of the state of São Paulo, organized, listed, and stored taking into account linguistic and extralinguistic annotations. The results can serve as an aid, for example, in studies requiring automatic processing of the Portuguese language.
Index Terms: Linguistic Informatics, data processing technologies in Linguistic studies, CorPor project, relational database system, databanks of phonetic and orthographic information about the Portuguese language as spoken in São Paulo, electronic corpora of the Portuguese language as spoken in São Paulo
Bibliographic reference. Zapparoli, Zilda Maria (2009): "CORPOR system: corpora of the Portuguese language as spoken in São Paulo", in SLTECH-2009, 35-38.