Volume 9, Issue 4




Sequential interactive image segmentation

Zheng Lin¹, Zhao Zhang¹, Zi-Yue Zhu¹, Deng-Ping Fan² (corresponding author), Xia-Lei Liu¹
1. TKLNDST, College of Computer Science, Nankai University, Tianjin, China
2. Computer Vision Lab, ETH Zurich, Zurich, Switzerland

Abstract

Interactive image segmentation (IIS) is an important technique for obtaining pixel-level annotations. In many cases, target objects share similar semantics. However, existing IIS methods neglect this connection, and in particular the cues provided by representations of previously segmented objects, previous user interactions, and previous prediction masks, all of which can provide useful priors for the current annotation. In this paper, we formulate a sequential interactive image segmentation (SIIS) task that aims to minimize user interaction when segmenting sequences of related images, and we provide a practical approach to this task built on two designs. The first is a novel interaction mode: when annotating a new sample, our method automatically proposes an initial click based on previous annotations, which dramatically reduces the interaction burden on the user. The second is an online optimization strategy that provides semantic information about the specific targets being annotated, optimizing the model with dense supervision from previously labeled samples. Experiments demonstrate the effectiveness of treating SIIS as a distinct task, and of our methods for addressing it.

Keywords: user interaction, object segmentation, interactive segmentation
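The two designs described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: `propose_initial_click` and `online_update` are hypothetical stand-ins, using a confidence-peak heuristic for the automatic click proposal and a logistic-regression update as a minimal proxy for densely supervised fine-tuning on previously labeled samples.

```python
import numpy as np

def propose_initial_click(prob_map, threshold=0.5):
    """Propose an initial foreground click for a new image in the sequence.

    Hypothetical heuristic: place the click at the most confident pixel of
    the model's foreground probability map, which is informed by previously
    segmented objects. Returns None if no pixel is confident enough, in
    which case the user provides the first click as in ordinary IIS.
    """
    if prob_map.max() < threshold:
        return None
    return np.unravel_index(np.argmax(prob_map), prob_map.shape)

def online_update(params, feats, labels, lr=0.1, steps=20):
    """Online optimization stand-in: fit a per-pixel logistic classifier
    with dense (per-pixel) supervision from previously labeled samples.

    feats: (N, D) per-pixel features; labels: (N,) binary masks flattened.
    """
    w, b = params
    for _ in range(steps):
        z = feats @ w + b
        p = 1.0 / (1.0 + np.exp(-z))      # predicted foreground probability
        grad = p - labels                  # gradient of the logistic loss
        w = w - lr * feats.T @ grad / len(labels)
        b = b - lr * grad.mean()
    return w, b
```

In a sequential annotation loop, each newly confirmed mask would both seed `propose_initial_click` for the next image and extend the densely supervised set passed to `online_update`, so later samples in the sequence need progressively fewer corrections.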


Publication history

Received: 11 February 2022
Accepted: 17 June 2022
Published: 05 July 2023
Issue date: December 2023

Copyright

© The Author(s) 2023.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.

The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

