Prosodic classification of discourse markers

Authors

  • Vera Cabarrão (L2F, INESC-ID - FLUL/CLUL)
  • Helena Moniz (L2F, INESC-ID - FLUL/CLUL)
  • Jaime Ferreira (L2F, INESC-ID)
  • Fernando Batista (L2F, INESC-ID - ISCTE-IUL)
  • Isabel Trancoso (L2F, INESC-ID - IST – Universidade de Lisboa)
  • Ana Isabel Mata (FLUL/CLUL)
  • Sérgio Curto (L2F, INESC-ID)

DOI:

https://doi.org/10.26334/2183-9077/rapln2ano2016a4

Keywords:

discourse markers, prosody, speech processing, multiclass automatic classification

Abstract

This work describes the discourse markers found in two European Portuguese corpora from different domains (university lectures and map-task dialogues). We also perform a multiclass automatic classification task based on prosodic features to determine, in both corpora, which words are discourse markers, which are disfluencies, and which are sentence-like units (SUs). Results show that the selection of discourse markers varies across domains and between speakers. In the classification task, discourse markers are classified better in the lecture corpus (87%) than in the dialogue corpus (84%). However, cross-domain experiments showed that models trained on the dialogue corpus predict the events in the lecture corpus better, since this domain displays more speakers and, therefore, more complex patterns. In both corpora, markers are more easily classified as SUs than as disfluencies.

Published

2016-10-31

How to Cite

Cabarrão, V., Moniz, H., Ferreira, J., Batista, F., Trancoso, I., Mata, A. I., & Curto, S. (2016). Prosodic classification of discourse markers. Journal of the Portuguese Linguistics Association, (2), 69–95. https://doi.org/10.26334/2183-9077/rapln2ano2016a4