Scholar - SciOpen

Accurate medical image segmentation is essential for effective diagnosis and treatment. Previously we proposed PraNet-V1 as a means to enhance polyp segmentation, introducing a reverse attention (RA) module that utilizes background information. However, PraNet-V1 struggles with multi-class segmentation tasks. To address this limitation, we here propose PraNet-V2, which can effectively handle a broader range of tasks, including multi-class segmentation. At the core of PraNet-V2 is our dual-supervised reverse attention (DSRA) module, which incorporates explicit background supervision, independent background modeling, and semantically enriched attention fusion. Our PraNet-V2 framework exhibits strong performance on four polyp segmentation datasets. Moreover, the integration of DSRA into three state-of-the-art semantic segmentation models enables iterative refinement of foreground segmentation, yielding improvements of up to 1.36% in mean Dice score. Jittor code and supplementary materials are available at https://github.com/ai4colonoscopy/PraNet-V2/tree/main/binary_seg/jittor.
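For readers unfamiliar with the metric quoted above, the following is a minimal sketch of how a mean Dice score is commonly computed for binary masks. It is not taken from the PraNet-V2 repository: the function names `dice_score` and `mean_dice`, the flat-list mask representation, and the convention of scoring two empty masks as 1.0 are all assumptions made for illustration.

```python
def dice_score(pred, target):
    """Dice = 2 * |P ∩ T| / (|P| + |T|) for two flat binary masks (lists of 0/1)."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same length")
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    if total == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0
    return 2.0 * intersection / total


def mean_dice(preds, targets):
    """Average the per-mask Dice scores (e.g. one mask per class or per image)."""
    scores = [dice_score(p, t) for p, t in zip(preds, targets)]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # One predicted pixel overlaps the single ground-truth pixel: 2*1 / (2+1).
    print(dice_score([1, 1, 0, 0], [1, 0, 0, 0]))
```

Under this definition, an improvement of 1.36% in mean Dice corresponds to the averaged score rising by 0.0136 on a 0 to 1 scale, assuming the figure is reported in percentage points.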

Open Access Research Article Issue

Full-duplex strategy for video object segmentation

Ge-Peng Ji, Deng-Ping Fan, Keren Fu, Zhe Wu, Jianbing Shen, Ling Shao

Computational Visual Media 2023, 9(1): 155-175

Published: 18 October 2022

Abstract

Downloads: 91

Previous video object segmentation approaches mainly focus on simplex solutions linking appearance and motion, limiting effective feature collaboration between these two cues. In this work, we study a novel and efficient full-duplex strategy network (FSNet) to address this issue, by considering a better mutual restraint scheme linking motion and appearance, allowing exploitation of cross-modal features at the fusion and decoding stages. Specifically, we introduce a relational cross-attention module (RCAM) to achieve bidirectional message propagation across embedding sub-spaces. To improve the model’s robustness and update inconsistent features from the spatiotemporal embeddings, we adopt a bidirectional purification module after the RCAM. Extensive experiments on five popular benchmarks show that our FSNet is robust to various challenging scenarios (e.g., motion blur and occlusion), and compares well to leading methods both for video object segmentation and video salient object detection. The project is publicly available at https://github.com/GewelsJI/FSNet.

Open Access Review Article Issue

Light field salient object detection: A review and benchmark

Keren Fu, Yao Jiang, Ge-Peng Ji, Tao Zhou, Qijun Zhao, Deng-Ping Fan

Computational Visual Media 2022, 8(4): 509-534

Published: 16 May 2022

Abstract

Downloads: 181

Salient object detection (SOD) is a long-standing research topic in computer vision that has attracted increasing interest in the past decade. Since light fields record comprehensive information about natural scenes that benefits SOD in a number of ways, using light field inputs to improve saliency detection over conventional RGB inputs is an emerging trend. This paper provides the first comprehensive review and a benchmark for light field SOD, which has long been lacking in the saliency community. Firstly, we introduce light fields, including theory and data forms, and then review existing studies on light field SOD, covering ten traditional models, seven deep learning-based models, a comparative study, and a brief review. Existing datasets for light field SOD are also summarized. Secondly, we benchmark nine representative light field SOD models together with several cutting-edge RGB-D SOD models on four widely used light field datasets, providing insightful discussions and analyses, including a comparison between light field SOD and RGB-D SOD models. Because current datasets are inconsistent, we further generate complete data and supplement focal stacks, depth maps, and multi-view images for them, making them consistent and uniform. Our supplemental data make a universal benchmark possible. Lastly, light field SOD is a specialised problem, because of its diverse data representations and high dependency on acquisition hardware, so it differs greatly from other saliency detection tasks. We provide nine observations on challenges and future directions, and outline several open issues. All the materials, including models, datasets, benchmarking results, and supplemented light field datasets, are publicly available at https://github.com/kerenfu/LFSOD-Survey.
