Invited talk in Workshop: Machine Learning for Audio
A multi-view approach for audio-based speech emotion recognition
Dimitra Emmanouilidou
Abstract:
The area of speech emotion recognition (SER) has seen significant advances with the wider availability of pre-trained models and embeddings and the creation of larger publicly available corpora. In this talk, we will touch on some of the challenges that continue to trouble audio-based SER, such as domain adaptation, data augmentation, and output generalization, and discuss the advantages of a multi-view model approach that jointly learns from both categorical and dimensional affect labels.