Prosodic classification of discourse markers
DOI: https://doi.org/10.26334/2183-9077/rapln2ano2016a4
Keywords: discourse markers, prosody, speech processing, multiclass automatic classification
Abstract
This work describes the discourse markers present in two corpora of European Portuguese from different domains (university lectures and map-task dialogues). We also perform a multiclass automatic classification task based on prosodic features to determine, in both corpora, which words are discourse markers, which are disfluencies, and which are sentence-like units (SUs). Results show that the selection of discourse markers varies across domains and between speakers. In the classification task, discourse markers are classified better in the lecture corpus (87%) than in the dialogue corpus (84%). However, cross-domain experiments showed that training on the dialogue corpus yields better predictions of the events in the lecture corpus, since this domain displays more speakers and therefore more complex patterns. In both corpora, markers are more easily classified as SUs than as disfluencies.
License

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Authors retain copyright and grant the journal the right of first publication. The articles are simultaneously licensed under the Creative Commons Attribution License, which allows sharing of the work with an acknowledgement of authorship and initial publication in this journal.
The authors have permission to make the version of the text published in RAPL available in institutional repositories or other platforms for the distribution of academic papers (e.g., ResearchGate).


