A semi-supervised learning approach for automatic personality classification based on acoustic-prosodic cues

Authors

  • Rubén Solera-Ureña (L2F, INESC-ID, Lisboa)
  • Helena Moniz (L2F, INESC-ID, Lisboa; Universidade de Lisboa, CLUL; Unbabel Lda, Lisboa)
  • Fernando Batista (L2F, INESC-ID, Lisboa; ISCTE, Instituto Universitário de Lisboa)
  • Vera Cabarrão (L2F, INESC-ID, Lisboa; Universidade de Lisboa, CLUL)
  • Anna Pompili (L2F, INESC-ID, Lisboa; IST, Universidade de Lisboa)
  • Ramón Fernández-Astudillo (L2F, INESC-ID, Lisboa; IBM Research AI, Yorktown Heights, NY)
  • Isabel Trancoso (L2F, INESC-ID, Lisboa; IST, Universidade de Lisboa)

DOI:

https://doi.org/10.26334/2183-9077/rapln5ano2019a23

Keywords:

computational paralinguistic analysis, automatic personality classification, different languages, different age groups, acoustic-prosodic cues

Abstract

Automatic personality analysis has gained considerable attention in recent years as a fundamental dimension of human-machine interaction. However, the development of this technology in some domains, such as the classification of children’s personality, has been hindered by the limited number and size of available speech corpora, owing to ethical concerns about collecting such data. To circumvent the lack of data, we investigated a semi-supervised training approach that makes use of heterogeneous (mismatched in age and language) and partially unlabelled data sets. Specifically, preliminary personality models trained on a small labelled data set of French-speaking adults are iteratively refined using a larger unlabelled set of Portuguese children’s speech, while a labelled corpus of Portuguese children is used for evaluation. We also investigated speech representations based on prior linguistic knowledge of acoustic-prosodic cues for personality classification tasks, and analysed their relevance in the assessment of each personality trait. The results point to the potential of semi-supervised learning with heterogeneous data sets to overcome the lack of labelled data in under-resourced domains, and to the existence of acoustic-prosodic cues shared by speakers of different languages and ages, which allows personality to be classified independently of these variables.
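The iterative refinement described in the abstract can be illustrated with a generic self-training loop. The abstract does not specify the exact algorithm, classifier, or features, so everything below is an assumption made for illustration: a nearest-centroid classifier, a distance-margin confidence rule, synthetic two-dimensional features standing in for acoustic-prosodic measurements, and the hypothetical names `centroids`, `predict`, `self_train`, and `blob`. It is a minimal sketch of the general idea, not the authors' method.

```python
import math
import random


def centroids(X, y):
    """Fit a nearest-centroid model: the mean feature vector of each class."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for i, value in enumerate(x):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {c: [v / counts[c] for v in s] for c, s in sums.items()}


def predict(model, x):
    """Return (label, margin); margin is the distance gap between the two nearest centroids."""
    ranked = sorted((math.dist(x, c), label) for label, c in model.items())
    margin = ranked[1][0] - ranked[0][0] if len(ranked) > 1 else float("inf")
    return ranked[0][1], margin


def self_train(X_lab, y_lab, X_unlab, rounds=5, min_margin=1.0):
    """Refine a model trained on labelled data using confident pseudo-labels."""
    X, y = list(X_lab), list(y_lab)
    pool = list(X_unlab)
    model = centroids(X, y)  # preliminary model from the small labelled set
    for _ in range(rounds):
        confident, rest = [], []
        for x in pool:
            label, margin = predict(model, x)
            (confident if margin >= min_margin else rest).append((x, label))
        if not confident:
            break  # nothing left that the model is confident about
        for x, label in confident:
            X.append(x)
            y.append(label)  # pseudo-label assigned by the current model
        pool = [x for x, _ in rest]
        model = centroids(X, y)  # retrain on labelled + pseudo-labelled data
    return model


def blob(center, n, rng, spread=1.0):
    """Draw n synthetic points around a centre (illustrative data only)."""
    return [[rng.gauss(c, spread) for c in center] for _ in range(n)]


rng = random.Random(0)
# Small labelled source set (stand-in for the labelled adult data).
X_lab = blob([0, 0], 5, rng) + blob([4, 4], 5, rng)
y_lab = [0] * 5 + [1] * 5
# Larger unlabelled target set, shifted to mimic a domain mismatch.
X_unlab = blob([1, 1], 100, rng) + blob([5, 5], 100, rng)
# Labelled target set used only for evaluation.
X_test = blob([1, 1], 50, rng) + blob([5, 5], 50, rng)
y_test = [0] * 50 + [1] * 50

model = self_train(X_lab, y_lab, X_unlab)
accuracy = sum(predict(model, x)[0] == t for x, t in zip(X_test, y_test)) / len(y_test)
print(f"accuracy on the labelled target set: {accuracy:.2f}")
```

The margin threshold `min_margin` controls how cautious the loop is: a higher value adds fewer but more reliable pseudo-labels per round, which matters when the source and target domains differ as much as adult and child speech in different languages.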

Published

2019-11-21

How to Cite

Solera-Ureña, R., Moniz, H., Batista, F., Cabarrão, V., Pompili, A., Fernández-Astudillo, R., & Trancoso, I. (2019). Uma abordagem de aprendizagem semissupervisionada para a classificação automática de personalidade baseada em pistas acústico-prosódicas. Revista Da Associação Portuguesa De Linguística, (5), 348–364. https://doi.org/10.26334/2183-9077/rapln5ano2019a23