A Comprehensive Review of Speech Emotion Recognition: Advances, Challenges, and Future Directions

Alkahla, Lubna Thanoon; Hussein, Maher Khalaf; Alqassab, Asmaa; Aliyu, Dahiru

doi:10.69513/jncs.v1.i1.a5

Document Type : Review Article

Authors

Lubna Thanoon Alkahla ¹

Maher Khalaf Hussein ²

Asmaa Alqassab ³

Dahiru Aliyu ⁴

¹ Ninevah University

² University of Telafer

³ University of Mosul

⁴ Universiti Teknologi PETRONAS, Malaysia

https://doi.org/10.69513/jncs.v1.i1.a5

Abstract

Speech emotion recognition (SER), the automated detection of human emotion from speech signals, is a relatively new area of artificial intelligence that aims to determine the emotions people express through their speech. Traditionally, SER relied on handcrafted features and classical machine learning models such as support vector machines (SVMs) and hidden Markov models (HMMs). The richness of human emotion, however, made these approaches difficult to apply. The evolution of deep learning, in particular convolutional neural networks (CNNs), recurrent neural networks (RNNs), and Transformer-based architectures, has greatly improved the accuracy and robustness of SER systems. This review examines SER in depth, covering the most relevant recognition methods and feature extraction techniques and introducing the benchmark databases. It also covers augmentation methods, evaluation measures, and the difficulties of real-time processing. Despite these advances, SER continues to face challenges, including dataset scarcity, class imbalance, domain adaptation, and high computational requirements. The review highlights open research questions and analyzes future directions, including multimodal fusion, self-supervised learning, and explainable AI.

Keywords

Speech Emotion Recognition

Deep Learning

Transformer Models

Feature Extraction

Human-Computer Interaction

Multimodal Learning

Subjects

Artificial Intelligence

Al-Noor Journal for Information Technology and Cybersecurity

Volume 2, Issue 1 - Serial Number 1
June 2025
Pages 31-36
