Open Access

Efficient and comprehensive visual solution for a smart apple harvesting robot in complex settings via multi-class instance segmentation

College of Mechanical and Electronic Engineering, Northwest A&F University, Yangling 712100, Shaanxi, China
Academy of Agricultural Planning and Engineering, Ministry of Agriculture and Rural Affairs, Beijing 100125, China
Chinese Society of Agricultural Engineering, Beijing 100125, China
School of Design, Xi’an Technological University, Xi’an 710021, China
Laboratory of Bio-Mechatronics, Faculty of Engineering, Kitami Institute of Technology, Hokkaido 090-8507, Japan

Abstract

To enable efficient, low-cost automated apple harvesting, this study presents SCAL (Star-CAA-LADH), a multi-class instance segmentation model that requires only a single RGB sensor for image acquisition. From a single RGB image, the model accurately segments fruits, fruit-bearing branches, and main branches, providing comprehensive visual input for robotic harvesting. A Star-CAA module is proposed that integrates the Star operation with a Context-Anchored Attention (CAA) mechanism, enhancing directional sensitivity and multi-scale feature perception. The Backbone and Neck networks are equipped with hierarchically structured SCA-T/F modules to improve the fusion of high- and low-level features, resulting in more continuous masks and sharper boundaries. In the Head network, a Segment_LADH module optimizes classification, bounding box regression, and mask generation, improving segmentation accuracy for small and adherent targets. To improve robustness under adverse weather and in degraded environments, a Chain-of-Thought Prompted Adaptive Enhancer (CPA) module is integrated. Experimental results show that SCAL achieves 94.9% AP_M and 95.1% mAP_M, outperforming YOLOv11s by 6.6% and 4.6%, respectively. Under multi-weather testing conditions, the CPA-SCAL variant consistently achieves higher accuracy than the comparison models. After INT8 quantization, the model size is reduced to 14.5 MB, and inference runs at 47.2 frames per second (fps) on the NVIDIA Jetson AGX Xavier. Experiments conducted in simulated orchard environments validate the effectiveness and generalization capability of SCAL, demonstrating its suitability as an efficient and comprehensive visual solution for intelligent harvesting in complex agricultural settings.
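
The abstract reports mask-level metrics (AP_M, mAP_M) without defining them on this page. As a rough illustration only, and assuming these follow the common convention in which a predicted instance mask counts as correct when its intersection-over-union (IoU) with a ground-truth mask reaches a threshold, the underlying overlap measure can be sketched as below. The function names, the toy masks, and the 0.5 threshold are hypothetical and are not taken from the paper.

```python
from typing import List

# A binary mask as nested lists: 1 = object pixel, 0 = background.
Mask = List[List[int]]


def mask_iou(pred: Mask, truth: Mask) -> float:
    """Intersection-over-union of two equally sized binary masks."""
    same_rows = len(pred) == len(truth)
    if not same_rows or any(len(p) != len(t) for p, t in zip(pred, truth)):
        raise ValueError("masks must have the same shape")
    inter = 0
    union = 0
    for p_row, t_row in zip(pred, truth):
        for p, t in zip(p_row, t_row):
            if p and t:
                inter += 1  # pixel belongs to both masks
            if p or t:
                union += 1  # pixel belongs to at least one mask
    # Two empty masks have no overlap to measure; report 0.0 by convention.
    return inter / union if union else 0.0


def is_true_positive(pred: Mask, truth: Mask, threshold: float = 0.5) -> bool:
    """Assumed matching rule: a prediction matches if IoU >= threshold."""
    return mask_iou(pred, truth) >= threshold


# Toy example: two 2x2 blobs that overlap in one column.
pred = [[1, 1, 0],
        [1, 1, 0],
        [0, 0, 0]]
truth = [[0, 1, 1],
         [0, 1, 1],
         [0, 0, 0]]
print(mask_iou(pred, truth))          # 2 shared pixels out of 6 covered
print(is_true_positive(pred, truth))  # below the assumed 0.5 threshold
```

Average precision is then typically obtained by ranking predictions by confidence and summarizing the resulting precision-recall curve; the exact thresholds and averaging scheme used in the paper are not stated in the abstract.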

International Journal of Agricultural and Biological Engineering
Pages 200-215

Cite this article:
Wen S, Ge Y, Wang Y, et al. Efficient and comprehensive visual solution for a smart apple harvesting robot in complex settings via multi-class instance segmentation. International Journal of Agricultural and Biological Engineering, 2025, 18(4): 200-215. https://doi.org/10.25165/j.ijabe.20251804.9619

Views: 341; Downloads: 8; Citations: Crossref 1, Web of Science 2, Scopus 2

Received: 10 January 2025
Accepted: 15 July 2025
Published: 31 August 2025
© The Author(s) 2025

We adopt the latest version of license CC BY 4.0, https://creativecommons.org/licenses/by/4.0/