Deep learning has found widespread application across diverse domains owing to its exceptional performance. Nevertheless, the opacity of deep learning models' decision-making processes undermines their usability, especially in critical contexts. While researchers have made noteworthy advances in explaining these models, they have frequently overlooked the differences between static and temporal models when generating explanations. In temporal models, features change over time, which poses new challenges for explanation generation. Although extensive research has been devoted to overcoming these hurdles, no survey currently summarizes these contributions. To bridge this gap, this paper summarizes existing methods and their contributions for both static and temporal models, highlighting their differences. Additionally, we propose a novel classification scheme based on the comprehensibility of explanations, showing that explanation methods vary in how understandable they are to users. Finally, to assess the limitations of existing methods' explanatory capabilities, we take knowledge tracing as a case study and analyze how explanation methods have evolved in this temporal-modeling setting.
Open Access | Research Article
Zero-shot learning (ZSL) is an important and rapidly growing area of machine learning that aims to recognize new classes without prior training data. Despite its significance, ZSL has faced overfitting in embedding-based methods and limitations in traditional one-directional attention (ODA) based approaches. To bridge these gaps, this paper proposes bi-directional attention (BDA) to integrate insights from both embedding-based and attention-based approaches. The proposed BDA system consists of a bi-directional attention network (BDAN) and a synthesized visual embedding network (SVEN), which together facilitate visual-semantic interaction for ZSL classification. More specifically, the BDAN employs region self-attention (RSA) to overcome the overfitting issue of embedding methods and enhance transferability, semantic synthesis attention (SSA) to associate visual features with semantic property information, and visual synthesis attention (VSA) to learn locally improved visual features. Extensive testing on the CUB, SUN, and AWA2 datasets confirms the superiority of the proposed method over traditional approaches. Code is available at https://github.com/JunseokLee3/BDA.
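The core idea of bi-directional visual-semantic interaction can be illustrated with a minimal sketch. The code below is not the authors' implementation; it is a generic scaled dot-product cross-attention applied in both directions between visual region features and semantic attribute embeddings, with all shapes and names (`regions`, `attributes`, feature dimension 64) chosen purely for illustration:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, keys, values):
    # Scaled dot-product attention: each query attends over all keys
    # and returns a weighted combination of the values.
    d = queries.shape[-1]
    scores = queries @ keys.T / np.sqrt(d)
    return softmax(scores, axis=-1) @ values

rng = np.random.default_rng(0)
regions = rng.standard_normal((49, 64))      # visual region features (illustrative)
attributes = rng.standard_normal((312, 64))  # semantic attribute embeddings (illustrative)

# Semantic -> visual direction: each attribute gathers its supporting regions.
attr_visual = cross_attention(attributes, regions, regions)        # shape (312, 64)

# Visual -> semantic direction: each region gathers relevant attribute semantics.
region_semantic = cross_attention(regions, attributes, attributes)  # shape (49, 64)
```

In a one-directional scheme only one of these two passes would be computed; running both lets visual and semantic representations refine each other before classification.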