*cespla is a linguistic corpus of recorded and transcribed Spanish conversations from the Río de la Plata region (Argentina and Uruguay). It is a semi-public corpus that may be used for a variety of scientific projects.

Data

All of the recorded everyday conversations are characterized by a high level of spontaneity and authenticity. Because the speakers are in casual settings and interacting with people who are (more or less) familiar to them, the fact that they are being recorded influences them only slightly.

The recordings were made in the years 2006 and 2008, mainly in the city of La Plata and partly in Buenos Aires (Capital Federal). The speakers are between 22 and 65 years old. In the future, the corpus will be expanded with recordings from the city of Montevideo and with conversations of younger speakers.

The corpus contains 103 recordings with a total duration of 108 hours. The quality of the recordings ranges from good to excellent, depending on the individual recording situation.

The transcriptions are available in an aligned format. This means that the annotation covers not only the linguistic content of each utterance but also its start and end points in the recording. Alignment provides a direct link between the transcript and the spoken word, and it gives faster access to the recordings when working with the transcripts.
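As a rough illustration of what time alignment makes possible, the following Python sketch models an aligned transcript as a list of segments with start and end times and looks up the utterance at a given point in a recording. It is not the corpus's actual file format: the `Segment` class, the `segment_at` function, and the two sample utterances are hypothetical, and the sketch assumes segments that are sorted by start time and do not overlap (real conversational data usually contains overlapping talk).

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Segment:
    """One aligned utterance (hypothetical structure, for illustration only)."""
    start: float   # start time in seconds
    end: float     # end time in seconds
    speaker: str   # speaker label
    text: str      # transcribed content


def segment_at(segments: List[Segment], t: float) -> Optional[Segment]:
    """Return the segment that covers time t, or None if t falls in a gap.

    Assumes segments are sorted by start time and do not overlap.
    """
    starts = [s.start for s in segments]
    i = bisect_right(starts, t) - 1  # last segment starting at or before t
    if i >= 0 and segments[i].start <= t < segments[i].end:
        return segments[i]
    return None


# Invented sample data, not taken from the corpus.
segments = [
    Segment(0.0, 1.8, "A", "hola"),
    Segment(2.1, 4.0, "B", "qué tal"),
]

hit = segment_at(segments, 2.5)
print(hit.speaker if hit else "no utterance at this time")
```

The same start and end values can be used in the other direction, to jump from a line of the transcript to the matching stretch of audio.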

So far, 10 hours have been transcribed according to the "Gesprächsanalytisches Transkriptionssystem" (GAT) transcription system (Selting et al. 1998).

Technology

The corpus data can be used with all common annotation tools, such as Praat, Elan, Exmaralda, and Transana. The data can also be used within linguistic databases for multimodal corpora, such as [moca] and the Transformer, as well as in other databases such as Microsoft Access.

Research

The corpus is especially suited for the investigation of spontaneous everyday talk. To date, projects have mainly covered the areas of conversation analysis and interactional linguistics. However, the corpus may also be helpful in addressing other kinds of research questions. The main target user group is students working on small and medium-scale research projects (term papers and Bachelor's, Master's, and PhD theses).

Conditions of Use

*cespla is a semi-public corpus. The conditions of use are adjusted to each individual case, depending on the direction of the proposed study. This serves to avoid thematic overlaps with other ongoing studies. In general, users contribute to the corpus by transcribing parts that have not yet been transcribed.

Editor

Oliver Ehmer

Contact

Prof. Dr. Oliver Ehmer
Universität Osnabrück
Institut für Romanistik und Latinistik
Neuer Graben 40
49074 Osnabrück

+49 541 9694340
oliver.ehmer@uni-osnabrueck.de
http://www.oliverehmer.de