Efficient and comprehensive visual solution for a smart apple harvesting robot in complex settings via multi-class instance segmentation

Shiwei Wen; Yahao Ge; Yingkuan Wang; Naishuo Wei; Jianguo Zhou; Guangrui Hu; Liangliang Yang; Jun Chen

doi:10.25165/j.ijabe.20251804.9619

Open Access

Shiwei Wen¹, Yahao Ge¹, Yingkuan Wang²,³, Naishuo Wei¹, Jianguo Zhou¹, Guangrui Hu⁴, Liangliang Yang⁵, Jun Chen¹

¹ College of Mechanical and Electronic Engineering, Northwest A&F University, Yangling 712100, Shaanxi, China

² Academy of Agricultural Planning and Engineering, Ministry of Agriculture and Rural Affairs, Beijing 100125, China

³ Chinese Society of Agricultural Engineering, Beijing 100125, China

⁴ School of Design, Xi’an Technological University, Xi’an 710021, China

⁵ Laboratory of Bio-Mechatronics, Faculty of Engineering, Kitami Institute of Technology, Hokkaido 090-8507, Japan

Abstract

To enable efficient, low-cost automated apple harvesting, this study presents a multi-class instance segmentation model, SCAL (Star-CAA-LADH), that requires only a single RGB sensor for image acquisition. From a single RGB image, the model accurately segments fruits, fruit-bearing branches, and main branches, providing comprehensive visual input for robotic harvesting. A Star-CAA module is proposed that integrates the Star operation with a Context-Anchored Attention (CAA) mechanism, enhancing directional sensitivity and multi-scale feature perception. The Backbone and Neck networks are equipped with hierarchically structured SCA-T/F modules to improve the fusion of high- and low-level features, resulting in more continuous masks and sharper boundaries. In the Head network, a Segment_LADH module optimizes classification, bounding box regression, and mask generation, improving segmentation accuracy for small and adherent targets. A Chain-of-Thought Prompted Adaptive Enhancer (CPA) module is integrated to improve robustness under adverse weather and in degraded environments. Experimental results show that SCAL achieves 94.9% AP_M and 95.1% mAP_M, outperforming YOLOv11s by 6.6% and 4.6%, respectively. Under multi-weather testing conditions, the CPA-SCAL variant consistently achieves higher accuracy than the comparison models. After INT8 quantization, the model size is reduced to 14.5 MB, with an inference speed of 47.2 frames per second (fps) on the NVIDIA Jetson AGX Xavier. Experiments conducted in simulated orchard environments validate the effectiveness and generalization capability of SCAL, demonstrating its suitability as an efficient and comprehensive visual solution for intelligent harvesting in complex agricultural settings.
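The abstract reports INT8 quantization and a throughput of 47.2 fps but does not describe the quantization scheme or toolchain. As a rough illustration only, the sketch below shows one common scheme, symmetric per-tensor INT8 quantization, applied to a list of randomly generated stand-in weights. The function names, the random weights, and the choice of scheme are all assumptions made for this example and are not taken from the paper. It also derives the per-frame latency implied by the reported frame rate.

```python
import random


def quantize_int8(weights):
    """Symmetric per-tensor quantization: w is approximated by scale * q, q in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer level and clip to the signed 8-bit range.
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Map integer levels back to approximate floating-point weights."""
    return [scale * v for v in q]


# Hypothetical stand-in for one layer's FP32 weights (not the SCAL model).
random.seed(0)
weights = [random.gauss(0.0, 0.05) for _ in range(10_000)]

q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# Worst-case rounding error is bounded by half a quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))

# Storage per weight drops from 4 bytes (FP32) to 1 byte (INT8).
fp32_bytes = 4 * len(weights)
int8_bytes = 1 * len(weights)
compression = fp32_bytes / int8_bytes

# Per-frame latency implied by the reported 47.2 fps.
frame_time_ms = 1000.0 / 47.2

print(f"scale={scale:.6f}, max_error={max_error:.6f}")
print(f"compression={compression:.1f}x, frame_time={frame_time_ms:.1f} ms")
```

Under this scheme each weight costs one byte instead of four, which is why INT8 models are attractive on edge devices such as the Jetson AGX Xavier. The abstract does not give the pre-quantization model size, so the 4x ratio here describes the sketch and not the reported 14.5 MB figure.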

Keywords

apple harvesting; instance segmentation; multi-weather condition; star operation; edge computing device

International Journal of Agricultural and Biological Engineering

Volume 18, Issue 4, August 2025

Pages 200-215

Cite this article:

Wen S, Ge Y, Wang Y, et al. Efficient and comprehensive visual solution for a smart apple harvesting robot in complex settings via multi-class instance segmentation. International Journal of Agricultural and Biological Engineering, 2025, 18(4): 200-215. https://doi.org/10.25165/j.ijabe.20251804.9619

Received: 10 January 2025

Accepted: 15 July 2025

Published: 31 August 2025

This article is published under the CC BY 4.0 license: https://creativecommons.org/licenses/by/4.0/