What is it about?
An important goal of Affective Computing is to develop machines that possess emotional intelligence, ideally machines able to identify emotions as well as a real human does. This involves training Artificial Intelligence (AI) models with data from a variety of modalities, for instance text or speech. We do not yet know, however, how well machines perform in Speech Emotion Recognition (SER) compared with human beings. In part, this lack of knowledge is due to several fallacies prevalent in traditional Affective Computing, including the fact that promising SER performance typically results from assessing just a few basic emotions produced in clean, in-the-lab environments. In contrast, “real” emotions are often mixed and the environments are noisy. In addition, it is well known that training AI models usually requires a large amount of data, a resource that is considerably limited in the context of SER. The extent to which SER performance is affected by the size of the training data is still unknown. These are examples of variables that play a role in SER performance and are often disregarded.
Why is it important?
This study is important because it contributes to more adequate modelling of the emotions encoded in speech by addressing some of the fallacies prevalent in traditional Affective Computing.
Read the Original
This page is a summary of: Perception and classification of emotions in nonsense speech: Humans versus machines, PLoS ONE, January 2023, PLOS, DOI: 10.1371/journal.pone.0281079.