Open Access

Real-time instance segmentation based on contour learning

Rui GE 1,2, Dengfeng LIU 1,2 (corresponding author), Haojie ZHOU 1,2, Zhilei CHAI 1,2, Qin WU 1,2
1. School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi 214122, China
2. Jiangsu Provincial Engineering Laboratory of Pattern Recognition and Computational Intelligence, Jiangnan University, Wuxi 214122, China

Abstract

Instance segmentation plays an important role in image processing. The Deep Snake algorithm, based on contour iteration, deforms an initial bounding box into an instance contour end to end, which improves instance segmentation performance but suffers from slow segmentation speed and a sub-optimal initial contour. To solve these problems, a real-time instance segmentation algorithm based on contour learning was proposed. Firstly, ShuffleNet V2 was used as the backbone network, and the receptive field of the model was expanded with a 5×5 convolution kernel. Secondly, a lightweight up-sampling module, multi-stage aggregation (MSA), performs residual fusion of multi-layer features, which not only improves segmentation speed but also extracts effective features more comprehensively. Thirdly, a contour initialization method learned by the network was designed, and a global contour feature aggregation mechanism was used to regress a coarse contour, which removes the excessive error between a manually initialized contour and the real contour. Finally, the Snake deformation module was used to iteratively optimize the coarse contour into the final instance contour. Experimental results showed that the proposed method improved instance segmentation accuracy on the semantic boundaries dataset (SBD), Cityscapes and KINS datasets, with the average precision reaching 55.8 on SBD. Compared with Deep Snake, model parameters were reduced by 87.2% and computation by 78.3%, and segmentation speed reached 39.8 frames·s⁻¹ on a 512×512 pixel image on a 2080Ti GPU. The proposed method reduces resource consumption and performs instance segmentation quickly and accurately, and is therefore well suited to embedded platforms with limited resources.
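The core idea the abstract summarizes — starting from a coarse contour and iteratively refining it with a convolution over neighbouring vertices — can be sketched in a few lines. This is a toy illustration, not the authors' implementation: the real method uses learned circular convolutions over deep features, while here the kernel, the update rule, and the function names are illustrative assumptions, with plain NumPy standing in for the network.

```python
# Toy sketch of Deep Snake-style contour iteration. Assumptions: a fixed
# smoothing kernel replaces the learned circular convolution, and vertex
# coordinates replace deep per-vertex features.
import numpy as np

def circular_conv(features, kernel):
    """1D convolution over contour vertices with circular (wrap-around)
    padding, so the first and last contour points are neighbours."""
    k = len(kernel)
    pad = k // 2
    padded = np.concatenate([features[-pad:], features, features[:pad]])
    return np.array([padded[i:i + k] @ kernel for i in range(len(features))])

def deform_contour(contour, iterations=3, kernel=None):
    """Iteratively refine an (N, 2) contour: a circular convolution over
    vertex coordinates predicts a per-vertex offset, which is added back
    as a residual update, mimicking one Snake deformation step."""
    if kernel is None:
        kernel = np.array([0.25, -0.5, 0.25])  # toy kernel (assumption)
    contour = contour.astype(float)
    for _ in range(iterations):
        offsets = np.stack([circular_conv(contour[:, d], kernel)
                            for d in range(2)], axis=1)
        contour = contour + offsets  # residual vertex update
    return contour

# Usage: refine a coarse 8-point circular contour of radius 10.
theta = np.linspace(0, 2 * np.pi, 8, endpoint=False)
coarse = np.stack([np.cos(theta), np.sin(theta)], axis=1) * 10
refined = deform_contour(coarse)
print(refined.shape)  # (8, 2)
```

With this particular kernel each vertex is pulled toward the mean of its two neighbours, so the toy contour contracts smoothly; in the actual method the offsets are learned so the contour converges to the object boundary instead.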

References

[1] HARIHARAN B, ARBELÁEZ P, GIRSHICK R, et al. Simultaneous detection and segmentation//European Conference on Computer Vision, September 5-12, 2014, Zurich, Switzerland. Cham: Springer, 2014: 297-312.
[2] SU L, SUN Y X, YUAN S Z. A survey of instance segmentation based on deep learning. Journal of Intelligent Systems, 2022, 17(1): 16-31.
[3] LI X X, HU X G, QUAN Y C, et al. Research progress of instance segmentation based on deep learning. Computer Engineering and Applications, 2021, 57(9): 60-67.
[4] HAFIZ A M, BHAT G M. A survey on instance segmentation: state of the art. International Journal of Multimedia Information Retrieval, 2020, 9(3): 171-189.
[5] HE K, GKIOXARI G, DOLLÁR P, et al. Mask R-CNN//International Conference on Computer Vision, October 22-29, 2017, Venice, Italy. New York: IEEE, 2017: 2980-2988.
[6] REN S Q, HE K M, GIRSHICK R, et al. Faster R-CNN: towards real-time object detection with region proposal networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6): 1137-1149.
[7] WANG K, LIEW J H, ZOU Y, et al. PANet: few-shot image semantic segmentation with prototype alignment//International Conference on Computer Vision, October 27-November 2, 2019, Seoul, Korea (South). New York: IEEE, 2019: 9196-9205.
[8] HUANG Z, HUANG L, GONG Y, et al. Mask scoring R-CNN//IEEE/CVF Conference on Computer Vision and Pattern Recognition, June 16-20, 2019, Long Beach, CA, USA. New York: IEEE, 2019: 6402-6411.
[9] DAI J, HE K, LI Y, et al. Instance-sensitive fully convolutional networks//European Conference on Computer Vision, October 10-16, 2016, Amsterdam, Netherlands. Berlin: Springer, 2016: 534-549.
[10] WANG X, ZHANG R, KONG T, et al. SOLOv2: dynamic and fast instance segmentation. Advances in Neural Information Processing Systems, 2020, 33: 17721-17732.
[11] BOLYA D, ZHOU C, XIAO F Y, et al. YOLACT: real-time instance segmentation//2019 IEEE/CVF International Conference on Computer Vision, October 27-November 2, 2019, Seoul, Korea (South). New York: IEEE, 2019: 9156-9165.
[12] XIE E Z, SUN P Z, SONG X G, et al. PolarMask: single shot instance segmentation with polar representation//2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, June 14-19, 2020, Seattle, WA, USA. New York: IEEE, 2020: 12190-12199.
[13] TIAN Z, SHEN C H, CHEN H, et al. FCOS: fully convolutional one-stage object detection//2019 IEEE/CVF International Conference on Computer Vision, October 27-November 2, 2019, Seoul, Korea (South). New York: IEEE, 2019: 9626-9635.
[14] PENG S, JIANG W, PI H, et al. Deep Snake for real-time instance segmentation//IEEE/CVF Conference on Computer Vision and Pattern Recognition, June 14-19, 2020, Seattle, WA, USA. New York: IEEE, 2020: 8533-8542.
[15] HOWARD A G, ZHU M L, CHEN B, et al. MobileNets: efficient convolutional neural networks for mobile vision applications. arXiv preprint arXiv:1704.04861, 2017. http://arxiv.org/abs/1704.04861v1.
[16] TAN M X, LE Q V. EfficientNet: rethinking model scaling for convolutional neural networks. arXiv preprint arXiv:1905.11946, 2019. http://arxiv.org/abs/1905.11946v5.
[17] ZHANG X Y, ZHOU X Y, LIN M X, et al. ShuffleNet: an extremely efficient convolutional neural network for mobile devices//2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, June 18-22, 2018, Salt Lake City, UT, USA. New York: IEEE, 2018: 6848-6856.
[18] MA N N, ZHANG X Y, ZHENG H T, et al. ShuffleNet V2: practical guidelines for efficient CNN architecture design//European Conference on Computer Vision, September 8-14, 2018, Munich, Germany. Cham: Springer, 2018: 122-138.
[19] ZHOU X Y, WANG D Q, KRÄHENBÜHL P. Objects as points. arXiv preprint arXiv:1904.07850, 2019. http://arxiv.org/abs/1904.07850v2.
[20] DAI J F, QI H Z, XIONG Y W, et al. Deformable convolutional networks//2017 IEEE International Conference on Computer Vision, October 22-29, 2017, Venice, Italy. New York: IEEE, 2017: 764-773.
[21] HARIHARAN B, ARBELÁEZ P, BOURDEV L, et al. Semantic contours from inverse detectors//2011 International Conference on Computer Vision, November 6-13, 2011, Barcelona, Spain. New York: IEEE, 2011: 991-998.
[22] CORDTS M, OMRAN M, RAMOS S, et al. The Cityscapes dataset for semantic urban scene understanding//2016 IEEE Conference on Computer Vision and Pattern Recognition, June 26-July 1, 2016, Las Vegas, NV, USA. New York: IEEE, 2016: 3213-3223.
[23] QI L, JIANG L, LIU S, et al. Amodal instance segmentation with KINS dataset//2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, June 16-20, 2019, Long Beach, CA, USA. New York: IEEE, 2019: 3009-3018.
[24] LI Y, QI H Z, DAI J F, et al. Fully convolutional instance-aware semantic segmentation//2017 IEEE Conference on Computer Vision and Pattern Recognition, July 21-26, 2017, Honolulu, HI, USA. New York: IEEE, 2017: 4438-4446.
[25] XU W Q, WANG H Y, QI F B, et al. Explicit shape encoding for real-time instance segmentation//2019 IEEE/CVF International Conference on Computer Vision, October 27-November 2, 2019, Seoul, Korea (South). New York: IEEE, 2019: 5167-5176.
Journal of Measurement Science and Instrumentation
Pages 328-337
Cite this article:
GE R, LIU D, ZHOU H, et al. Real-time instance segmentation based on contour learning. Journal of Measurement Science and Instrumentation, 2024, 15(3): 328-337. https://doi.org/10.62756/jmsi.1674-8042.2024034


Received: 03 November 2023
Revised: 14 January 2024
Accepted: 21 January 2024
Published: 30 September 2024
© The Author(s) 2024.

The articles published in this open access journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, distribution and reproduction in any medium, provided the original work is properly cited.
