Machine Learning Techniques for Speech Emotion Classification

Melo Locumber, Noe; Fabian, Junior


Authors

Melo Locumber, Noe

Fabian, Junior

Publisher

Springer Science and Business Media Deutschland GmbH

Material type

http://purl.org/coar/resource_type/c_2f33

Date

2021

Keywords

Audio processing

Machine learning

Speech emotion classification

Is part of

Communications in Computer and Information Science

Abstract

In this paper we propose and evaluate different models for speech emotion classification using audio signal processing, machine learning and deep learning techniques. For this purpose, we collected a total of 5252 audio samples from two databases (RAVDESS and TESS), covering 8 emotional classes (neutral, calm, happy, sad, angry, fearful, disgust and surprised). We divided our experiments into 3 main stages. In the first stage, we used feature engineering to extract relevant features from the time, spectral and cepstral domains. Features such as ZCR, energy, spectral centroid, chroma and MFCC were used to train an SVM classifier; the best model obtained an accuracy of 91.1%. In the second stage, we considered only 40 MFCC coefficients and trained several deep neural networks (CNN, LSTM and MLP); the best model obtained an accuracy of 89.5% with an MLP architecture. Finally, in the third stage we trained an end-to-end CNN (SampleCNN) at the sample level. This last approach does not require feature engineering and operates directly on the audio signal. In this stage, we achieved a precision of 81.7%. The experiments show that the results are competitive, and some experiments surpassed the accuracy of related works. © 2021, Springer Nature Switzerland AG.
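As a rough illustration of the kind of time-domain and spectral features named in the first stage (ZCR, energy, spectral centroid), the sketch below computes them per frame with NumPy and summarizes each clip as a fixed-length vector. It is not the authors' pipeline: the frame length, hop size, mean/std summary, function names and the synthetic sine-wave inputs are all assumptions made for the example, and chroma, MFCC and the SVM training step are left out.

```python
import numpy as np

# Hypothetical helper names and parameters; chosen for illustration only.

def frame_signal(signal, frame_len=1024, hop=512):
    """Split a 1-D signal into overlapping frames, one frame per row."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return signal[idx]

def zero_crossing_rate(frames):
    """Fraction of adjacent sample pairs whose sign differs, per frame."""
    signs = np.signbit(frames)
    return np.mean(signs[:, 1:] != signs[:, :-1], axis=1)

def short_time_energy(frames):
    """Mean squared amplitude per frame."""
    return np.mean(frames ** 2, axis=1)

def spectral_centroid(frames, sr):
    """Magnitude-weighted mean frequency (Hz) per frame."""
    window = np.hanning(frames.shape[1])
    mag = np.abs(np.fft.rfft(frames * window, axis=1))
    freqs = np.fft.rfftfreq(frames.shape[1], d=1.0 / sr)
    return (mag @ freqs) / (mag.sum(axis=1) + 1e-12)

def clip_features(signal, sr):
    """Summarize a clip as [mean ZCR, mean energy, mean centroid, then their stds]."""
    frames = frame_signal(signal)
    per_frame = np.stack(
        [zero_crossing_rate(frames),
         short_time_energy(frames),
         spectral_centroid(frames, sr)],
        axis=1,
    )
    return np.concatenate([per_frame.mean(axis=0), per_frame.std(axis=0)])

# Synthetic stand-ins for audio clips (assumption: 1 s of audio at 16 kHz).
sr = 16000
t = np.arange(sr) / sr
f_low = clip_features(np.sin(2 * np.pi * 220 * t), sr)
f_high = clip_features(np.sin(2 * np.pi * 3000 * t), sr)
print(f_low.shape, f_high.shape)
```

Vectors of this form, one per audio sample, are the sort of input a classifier such as an SVM would be trained on; the higher-pitched tone yields a larger zero-crossing rate and spectral centroid than the lower one.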

URI

https://cris.esan.edu.pe/handle/20.500.12640/764

DOI identifier

10.1007/978-3-030-76228-5_6

Collections

Publications
