Underwater object detection technology is essential for maintaining marine ecological health and supporting economic development. However, the underwater environment poses significant challenges, including low contrast, small object sizes, and complex backgrounds, and existing generic object detectors often fail to identify underwater organisms effectively. This paper proposes a Joint Multi-scale channel attention and Multi-perception head Network (JMM-Net), a detection algorithm for underwater organisms. JMM-Net comprises three main components: a Multi-Scale Channel Attention (MSCA)-based backbone network, a Multi-Perception Parallel detection head (MPPhead), and a lightweight GSconv-Path Aggregation Network (GS-PAN). MSCA is embedded into the backbone to enhance feature extraction for blurred and small-sized objects in low-quality environments by integrating local and global channel attention through multi-scale parallel sub-networks and cross-channel learning. MPPhead strengthens the model’s classification and localization capabilities by leveraging scale, spatial, and task perception, thereby improving the detection of marine organisms in complex backgrounds. Replacing the traditional Path Aggregation Network (PAN) structure with GS-PAN significantly reduces the model’s parameters and computational load, making it more suitable for deployment on edge devices. Extensive experiments on three public underwater datasets demonstrate that our method achieves excellent performance on underwater object detection at a lightweight cost.
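To make the idea of combining local and global channel attention concrete, the following is a minimal NumPy sketch of one common way to do it: a per-position (local) bottleneck and a globally pooled (global) bottleneck run in parallel, and their sum is passed through a sigmoid to gate the input features. This is an illustrative assumption, not the MSCA module defined in the paper; the function names, the reduction ratio `r`, the random weights, and the omission of normalization layers are all hypothetical choices made for the sketch.

```python
import numpy as np

def pointwise_bottleneck(x, w_down, w_up):
    """Apply a 1x1-conv-style bottleneck (C -> C/r -> C) at every spatial position.

    x: (C, H, W) feature map; w_down: (C/r, C); w_up: (C, C/r).
    """
    hidden = np.maximum(np.einsum("dc,chw->dhw", w_down, x), 0.0)  # ReLU
    return np.einsum("cd,dhw->chw", w_up, hidden)

def multi_scale_channel_attention(x, local_w, global_w):
    """Hypothetical sketch: gate x with fused local and global channel context."""
    # Local branch: channel mixing at each pixel, keeps spatial detail.
    local_ctx = pointwise_bottleneck(x, *local_w)            # (C, H, W)
    # Global branch: channel mixing on globally averaged features.
    pooled = x.mean(axis=(1, 2), keepdims=True)              # (C, 1, 1)
    global_ctx = pointwise_bottleneck(pooled, *global_w)     # (C, 1, 1)
    # Broadcasting adds the global context to every position before the sigmoid.
    gate = 1.0 / (1.0 + np.exp(-(local_ctx + global_ctx)))   # values in (0, 1)
    return x * gate, gate

# Example with random (untrained) weights; all sizes are assumptions.
rng = np.random.default_rng(0)
C, H, W, r = 8, 4, 4, 4
x = rng.standard_normal((C, H, W))

def make_weights():
    return (0.1 * rng.standard_normal((C // r, C)),
            0.1 * rng.standard_normal((C, C // r)))

local_w, global_w = make_weights(), make_weights()
out, gate = multi_scale_channel_attention(x, local_w, global_w)
```

In this sketch the local branch preserves per-pixel responses, which is the property one would want for small objects, while the global branch supplies image-wide channel statistics; how the actual MSCA balances the two is not specified in the abstract.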
Open Access
Research Article
It has been widely acknowledged that learning-based super-resolution (SR) methods are effective at recovering a high-resolution (HR) image from a single low-resolution (LR) input image. However, learning-based SR methods currently face two main challenges: the quality of training samples and the demand for computation. To address these issues, we propose a novel framework for single-image SR that consists of blind blurring kernel estimation (BKE) and SR recovery with anchored space mapping (ASM). BKE is realized by iteratively minimizing the cross-scale dissimilarity of the image, and SR recovery with ASM is performed based on the iterative least squares dictionary learning algorithm (ILS-DLA). BKE effectively improves the compatibility between training samples and testing samples, and ASM greatly reduces the time consumed during SR recovery. Moreover, a selective patch processing (SPP) strategy measured by average gradient amplitude