Electroencephalogram (EEG)-based emotion recognition is an important intelligent technique for health assessment and clinical intervention. However, EEG signals exhibit complex, complementary non-linear correlations across the spatial, temporal, and frequency domains, which makes effective feature modeling difficult and limits downstream emotion recognition performance. To address these challenges, an Emotional Spatio-Temporal-Spectral Cross-attention Network (ESTSCA-Net) is proposed. The model adopts a dual-branch feature fusion architecture. In the spatio-temporal branch, a multi-scale 2D convolutional network sequentially processes spatio-temporal information, adaptively capturing the contextual dependencies of neural activity. In the spatio-spectral branch, a 3D bottleneck residual network with channel-wise and cross-frequency attention mechanisms selectively encodes critical spatio-spectral neural oscillations. Furthermore, a bidirectional multi-head cross-attention interaction strategy is introduced to achieve deep fusion of spatio-temporal-spectral features, forming an effective emotion representation classifier. Experimental results on the public DEAP and MEEG datasets demonstrate that ESTSCA-Net can comprehensively extract spatio-temporal-spectral EEG features across different emotional states and consistently outperforms state-of-the-art baseline models on both the arousal and valence dimensions.
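The bidirectional multi-head cross-attention fusion described above can be illustrated with a minimal sketch in plain Python. This is not the ESTSCA-Net implementation: the abstract does not specify layer sizes, normalization, or how the two attended streams are combined, so the random untrained projection weights, the mean pooling, the final concatenation, and all function names (`cross_attention`, `bidirectional_fusion`) are assumptions made only for illustration.

```python
import math
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def rand_matrix(rows, cols, rng):
    # Hypothetical untrained projection weights (assumption: a real model learns these).
    return [[rng.gauss(0.0, 1.0 / math.sqrt(rows)) for _ in range(cols)] for _ in range(rows)]

def cross_attention(queries, context, num_heads, rng):
    """Multi-head attention: queries from one branch, keys/values from the other."""
    d_model = len(queries[0])
    assert d_model % num_heads == 0, "feature size must divide evenly across heads"
    d_head = d_model // num_heads
    q = matmul(queries, rand_matrix(d_model, d_model, rng))
    k = matmul(context, rand_matrix(d_model, d_model, rng))
    v = matmul(context, rand_matrix(d_model, d_model, rng))
    heads = []
    for h in range(num_heads):
        lo, hi = h * d_head, (h + 1) * d_head
        q_h = [row[lo:hi] for row in q]
        k_h = [row[lo:hi] for row in k]
        v_h = [row[lo:hi] for row in v]
        scores = matmul(q_h, transpose(k_h))
        # Scaled dot-product attention weights, one distribution per query token.
        weights = [softmax([s / math.sqrt(d_head) for s in row]) for row in scores]
        heads.append(matmul(weights, v_h))
    # Concatenate the per-head outputs along the feature axis.
    return [sum((head[i] for head in heads), []) for i in range(len(queries))]

def mean_pool(tokens):
    n = len(tokens)
    return [sum(col) / n for col in zip(*tokens)]

def bidirectional_fusion(st_tokens, ss_tokens, num_heads=2, seed=0):
    """Attend in both directions, then pool and concatenate (fusion rule is an assumption)."""
    rng = random.Random(seed)
    st_attended = cross_attention(st_tokens, ss_tokens, num_heads, rng)  # temporal queries spectral
    ss_attended = cross_attention(ss_tokens, st_tokens, num_heads, rng)  # spectral queries temporal
    return mean_pool(st_attended) + mean_pool(ss_attended)

# Toy inputs standing in for the two branch outputs: 5 and 3 tokens of 8 features each.
data_rng = random.Random(42)
st = [[data_rng.gauss(0, 1) for _ in range(8)] for _ in range(5)]
ss = [[data_rng.gauss(0, 1) for _ in range(8)] for _ in range(3)]
fused = bidirectional_fusion(st, ss)
print(len(fused))  # 16: one pooled 8-dimensional vector per attention direction
```

In this sketch each branch acts as the query source once, so the fused vector carries information from both attention directions. A classifier head for arousal or valence would then consume `fused`; that head is omitted here because the abstract does not describe it.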
Journal of Guangdong University of Technology 2026, 43(1): 10-21
Published: 24 December 2025