Real-Time Emotion-Aware Adaptive Learning System Using Multimodal Facial and Voice Recognition for Affective Personalization in Digital Instruction

Felinda Aprilia Rahma
Master’s Program in Teacher Education, School of Postgraduate Studies, Universitas Pendidikan Indonesia, Bandung, Indonesia
Siti Zayyana Ulfah
Master’s Program in Teacher Education, School of Postgraduate Studies, Universitas Pendidikan Indonesia, Bandung, Indonesia

This study proposes and evaluates a Real-Time Emotion-Aware Adaptive Learning System that integrates facial expression recognition and voice-based affect modeling into an online instructional workflow. The system captured emotional signals from 42 participants during 25-minute learning sessions via streamed webcam and microphone input. The CNN-based facial model reached its highest accuracy on happiness (92%) and neutral affect (88%), but accuracy fell to 75%, 72%, and 69% for sadness, anger, and fear, respectively. The Bi-LSTM voice model achieved precision of 0.89 for happiness and 0.85 for neutrality, while precision for sadness, anger, and fear dropped to 0.68, 0.72, and 0.65, respectively. A multimodal fusion mechanism raised overall recognition accuracy to 88%, a gain of 9–13% over the single-channel models. Adaptive interventions triggered by emotional signals produced measurable behavioral improvements: difficulty reduction during confusion increased task completion by 17%, time extensions during anxiety lowered the error rate by 11%, encouragement prompts during frustration improved retry behavior by 22%, and gamified stimulation during boredom increased engagement duration by 26%. Overall, the results indicate that emotional adaptivity doubled learning effectiveness, reduced the accumulation of negative affect, and delivered real-time personalization without disrupting instructional flow. The study concludes that multimodal affect monitoring is a viable and necessary mechanism for next-generation intelligent tutoring.
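
The abstract does not specify the fusion rule or how recognized emotions are mapped to learner states, so the following Python sketch is illustrative only and is not the authors' implementation. It assumes a weighted late fusion (a weighted average of the per-class probabilities from the facial and voice models) and a plain lookup table for the four state and intervention pairs named in the abstract. All function names, the `face_weight` parameter, and the intervention identifiers are hypothetical.

```python
from typing import Dict, Optional

# Emotion classes reported for both models in the abstract.
EMOTIONS = ("happiness", "neutral", "sadness", "anger", "fear")

# Learner state -> intervention, as paired in the abstract.
# The identifier strings on the right are hypothetical names.
INTERVENTIONS: Dict[str, str] = {
    "confusion": "reduce_difficulty",
    "anxiety": "extend_time",
    "frustration": "show_encouragement",
    "boredom": "add_gamified_element",
}


def fuse_late(
    face: Dict[str, float],
    voice: Dict[str, float],
    face_weight: float = 0.5,
) -> Dict[str, float]:
    """Weighted average of two per-class probability distributions.

    This fusion rule is an assumption; the abstract only states that a
    multimodal fusion mechanism was used.
    """
    if not 0.0 <= face_weight <= 1.0:
        raise ValueError("face_weight must be in [0, 1]")
    fused = {
        e: face_weight * face.get(e, 0.0) + (1.0 - face_weight) * voice.get(e, 0.0)
        for e in EMOTIONS
    }
    total = sum(fused.values())
    if total <= 0.0:
        raise ValueError("inputs carry no probability mass")
    # Renormalize so the fused scores sum to 1.
    return {e: p / total for e, p in fused.items()}


def top_emotion(probs: Dict[str, float]) -> str:
    """Return the class with the highest fused probability."""
    return max(probs, key=probs.get)


def select_intervention(state: str) -> Optional[str]:
    """Look up the intervention for a learner state, or None if no rule applies."""
    return INTERVENTIONS.get(state)


if __name__ == "__main__":
    # Made-up example scores, not data from the study.
    face = {"happiness": 0.1, "neutral": 0.2, "sadness": 0.5, "anger": 0.1, "fear": 0.1}
    voice = {"happiness": 0.1, "neutral": 0.5, "sadness": 0.2, "anger": 0.1, "fear": 0.1}
    fused = fuse_late(face, voice, face_weight=0.6)
    print(top_emotion(fused))
    print(select_intervention("boredom"))
```

The sketch keeps the two steps separate on purpose: the models classify happiness, neutral affect, sadness, anger, and fear, whereas the interventions are keyed on confusion, anxiety, frustration, and boredom, and the abstract does not say how one label set is derived from the other.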

Rahma, F. A., & Ulfah, S. Z. (2025). Real-Time Emotion-Aware Adaptive Learning System Using Multimodal Facial and Voice Recognition for Affective Personalization in Digital Instruction. Adaptive Learning, 1(1), 59–74. Retrieved from https://al.mbicore.com/index.php/al/article/view/22

Article Details

Section: Articles