Hyperspectral images (HSIs) are of great significance for target detection in remote sensing because of their rich spectral information. However, HSIs mostly have low spatial resolution, which limits performance on weak target detection. In this work, a weak target detection Mamba based on the fusion of panchromatic and hyperspectral images is proposed. First, a multi-level supervision learning paradigm is introduced to fully fuse the complementary spatial and hyperspectral information of the panchromatic and hyperspectral images, with the fusion training integrated into the downstream target detection task. Second, an improved Mamba network is used to capture global information and extract target semantic features with higher precision. Finally, a new vision embedding method is designed to enhance the network's perception of weak targets. The proposed method was validated on a hyperspectral-panchromatic fusion image target detection dataset, and the results showed that it achieves higher precision for weak target detection.
Open Access
Just Accepted
Open Access
Issue
Single-source Domain Generalization (SDG) is a promising yet challenging technique that aims to transfer knowledge from a single source domain to multiple unseen target domains. Existing SDG methods typically rely on domain expansion to vary the data and broaden the coverage of the training domain. However, because they lack proper constraints on semantic consistency and sample diversity, these methods yield limited gains in generalization performance for most practical applications. In this paper, we propose a Causality-Aware Single-source Domain Generalization (CASDG) method that exploits both semantic consistency and diversity during the data transformation process. First, a causality-aware module is designed to accurately measure the causal effect between latent features and labels. Then, we introduce a causal domain expansion module, which uses the causal effect matrix as a semantic consistency constraint and mutual information as a sample diversity constraint. These two constraints jointly encourage the style transformer to generate new auxiliary samples that do not deviate semantically from the original samples. An image classification model trained with our method achieves the best classification performance on unknown domain data compared with state-of-the-art methods.
Open Access
Issue
Learning domain-invariant feature representations is critical for alleviating the distribution differences between training and testing domains. Existing mainstream domain generalization approaches primarily seek to align cross-domain distributions in order to extract transferable feature representations. However, these representations may be insufficient and unstable. Moreover, these networks may suffer from catastrophic forgetting, because previously learned knowledge is replaced by newly learned knowledge. To cope with these issues, we propose a novel causality-based contrastive incremental learning model for domain generalization, which comprises three main components: (1) intra-domain causal factorization, (2) an inter-domain Mahalanobis similarity metric, and (3) contrastive knowledge distillation. The model extracts intra-domain and inter-domain invariant knowledge to improve generalization. Specifically, we first introduce a causal factorization to extract intra-domain invariant knowledge. Then, we design a Mahalanobis similarity metric to extract common inter-domain invariant knowledge. Finally, we propose a contrastive knowledge distillation scheme with an exponential moving average, which distills model parameters smoothly to preserve previously learned knowledge and mitigate forgetting. Extensive experiments on several domain generalization benchmarks show that our model achieves state-of-the-art results, demonstrating its effectiveness.
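As an illustration of the exponential moving average mentioned above, the following is a minimal sketch of the commonly used EMA parameter update, in which a slowly changing copy of the model tracks the current parameters. The abstract does not give the exact update rule, so the form below, the function name `ema_update`, the `decay` value, and the dictionary representation of parameters are all assumptions made for illustration, not the authors' implementation.

```python
def ema_update(teacher, student, decay=0.99):
    """Return EMA-smoothed parameters (hypothetical helper, not from the paper).

    Each parameter becomes decay * teacher + (1 - decay) * student, so the
    result changes slowly and retains earlier knowledge.
    """
    if teacher.keys() != student.keys():
        raise ValueError("teacher and student must share parameter names")
    return {
        name: decay * teacher[name] + (1.0 - decay) * student[name]
        for name in teacher
    }


# Toy scalar parameters; real models would hold tensors instead.
teacher = {"w": 1.0, "b": 0.0}
student = {"w": 0.0, "b": 1.0}
updated = ema_update(teacher, student, decay=0.9)
```

A `decay` close to 1 makes the smoothed parameters change slowly, which is the usual way such an update limits abrupt overwriting of earlier knowledge.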
Open Access
Issue
Customized keyword spotting needs to adapt quickly to small user samples. Current methods primarily address the problem under moderate noise conditions. Recent work makes keyword detection more difficult by introducing keyword interference. However, the existing solution has been explored only on large models with many parameters, making it unsuitable for deployment on small devices. When the existing solution is applied to lightweight models with minimal training data, performance degrades relative to the baseline model. Therefore, we propose a lightweight multi-task architecture (< 9.0×10^4 parameters) that integrates the triplet attention module into ConvMixer networks and adds a new auxiliary mixed labeling encoding to address this challenge. Our experimental results show that the proposed model outperforms similar lightweight models for keyword spotting, with accuracy gains ranging from 0.73% to 2.95% on a clean set and from 2.01% to 3.37% on a mixed set across different training set sizes. Furthermore, our model is robust across different low-resource language datasets while converging faster.