Real-Time Emotion Recognition in Human-Robot Interaction Using Deep AI Models
DOI: https://doi.org/10.31185/wjcms.426

Keywords: Human-Robot Interaction, HRI, CNN

Abstract
Emotion recognition in human-robot interaction (HRI) is essential for developing socially aware and responsive robotic systems. This work introduces a near real-time emotion recognition architecture that combines convolutional neural networks (CNNs) with recurrent architectures (LSTM/GRU) for audio- and visual-based emotion detection. The system is evaluated on the benchmark datasets FER-2013 and RAVDESS. The goal is to provide a robust and scalable approach that serves as a step toward fully integrating emotionally intelligent robots into everyday life, encouraging empathic, adaptive, and rewarding human-robot interaction. The proposed deep AI model was further tested on a dataset of 5,000 multimodal samples of human emotional expressions collected in controlled and real-world HRI scenarios, covering five emotional categories: Happy, Sad, Angry, Neutral, and Surprise. Experimental results demonstrate the effectiveness of the proposed system on the public datasets and its practical use in a simulated HRI scenario. The approach also achieves high accuracy and low inference latency, enabling robotic agents to exhibit effective emotion-adaptive behavior in live-interaction environments.
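To illustrate the kind of pipeline the abstract describes (per-frame CNN features passed through a recurrent layer, then classified into the five emotion categories), the following is a minimal sketch in plain numpy. All dimensions, weights, and the mock feature vectors are illustrative assumptions, not the authors' actual model or parameters; the GRU cell is implemented by hand only to make the recurrence explicit.

```python
import numpy as np

# The five emotion categories used in the paper's dataset.
EMOTIONS = ["Happy", "Sad", "Angry", "Neutral", "Surprise"]

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h, W, U, b):
    """One GRU update: x is a per-frame CNN feature vector, h the previous
    hidden state. W, U, b stack the update/reset/candidate parameters."""
    z = sigmoid(W[0] @ x + U[0] @ h + b[0])              # update gate
    r = sigmoid(W[1] @ x + U[1] @ h + b[1])              # reset gate
    h_tilde = np.tanh(W[2] @ x + U[2] @ (r * h) + b[2])  # candidate state
    return (1 - z) * h + z * h_tilde

def classify(features, W, U, b, W_out):
    """Run the GRU over a sequence of feature vectors and return
    softmax probabilities over the five emotion labels."""
    h = np.zeros(U.shape[1])
    for x in features:
        h = gru_step(x, h, W, U, b)
    logits = W_out @ h
    p = np.exp(logits - logits.max())   # numerically stable softmax
    return p / p.sum()

# Illustrative random parameters and mock CNN features (10 frames).
rng = np.random.default_rng(0)
feat_dim, hid_dim = 8, 6
W = rng.normal(size=(3, hid_dim, feat_dim)) * 0.1
U = rng.normal(size=(3, hid_dim, hid_dim)) * 0.1
b = np.zeros((3, hid_dim))
W_out = rng.normal(size=(len(EMOTIONS), hid_dim)) * 0.1

frames = rng.normal(size=(10, feat_dim))
probs = classify(frames, W, U, b, W_out)
print(EMOTIONS[int(np.argmax(probs))])
```

In a trained system, the mock feature vectors would be replaced by embeddings from a CNN backbone over face crops (visual stream) or spectrogram windows (audio stream), and the weights would be learned rather than random.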
Downloads
License
Copyright (c) 2026 Hiba alaamaidi

This work is licensed under a Creative Commons Attribution 4.0 International License.