This paper presents a multi-task gradual inference model, MTGINet, for automatic portrait matting. It handles the subtasks of automatic portrait matting, namely portrait–transition–background trimap segmentation and transition region matting, with a single encoder–decoder structure. First, we enrich the highest stage of features from the encoder with portrait shape context via a shape context aggregation (SCA) module for trimap segmentation. Then, we fuse the SCA-enhanced features with detailed clues from the encoder for transition-region-aware alpha matting. The gradual inference model naturally allows sufficient interaction between the subtasks via forward computation and backward propagation during training, and therefore achieves high accuracy while maintaining low complexity. Furthermore, considering the discrepancies in feature requirements across subtasks, we adapt the features from the encoder via a feature rectification module before reusing them. In addition to the MTGINet model, we have constructed a new large-scale dataset, HPM-17K, for half-body portrait matting; it consists of 16,967 images with diverse backgrounds. Comparative experiments with existing deep models on the public P3M-10K dataset and our HPM-17K dataset demonstrate that the proposed model achieves state-of-the-art performance.
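The gradual-inference flow the abstract describes can be illustrated with a deliberately simplified sketch. This is not the authors' implementation: the function names (`encoder`, `sca`, `trimap_head`, `rectify`, `matting_head`) and all internals are hypothetical stand-ins, chosen only to show the data flow — SCA-enhanced high-level features feed the trimap head, and are then fused with rectified low-level detail clues for alpha prediction.

```python
import numpy as np

# Hypothetical sketch of the gradual-inference pipeline described in the
# abstract. All modules are illustrative stand-ins, not the paper's layers.

def encoder(image):
    # Stand-in encoder: one low-level (detail) and one high-level (semantic)
    # feature map; the high-level map is downsampled 4x.
    low = image.mean(axis=-1, keepdims=True)
    h, w, c = image.shape
    high = image.reshape(h // 4, 4, w // 4, 4, c).mean(axis=(1, 3))
    return low, high

def sca(high):
    # Shape context aggregation stand-in: inject a global context vector
    # into the highest-stage features.
    context = high.mean(axis=(0, 1), keepdims=True)
    return high + context

def trimap_head(high_sca):
    # Three-way trimap logits: portrait / transition / background.
    return np.stack([high_sca.mean(-1)] * 3, axis=-1)

def rectify(low):
    # Feature rectification stand-in: adapt encoder features before reuse,
    # since the matting subtask needs different statistics than segmentation.
    return low - low.mean()

def matting_head(high_sca, low_rect):
    # Fuse upsampled semantic features with rectified detail clues and
    # squash to an alpha matte in (0, 1).
    up = np.kron(high_sca.mean(-1), np.ones((4, 4)))  # 4x nearest upsample
    fused = up + low_rect[..., 0]
    return 1.0 / (1.0 + np.exp(-fused))

image = np.random.rand(32, 32, 3)
low, high = encoder(image)
high_sca = sca(high)
trimap = trimap_head(high_sca)                 # (8, 8, 3) trimap logits
alpha = matting_head(high_sca, rectify(low))   # (32, 32) alpha matte
```

Because both heads share one encoder and the matting branch consumes the SCA-enhanced features, gradients from either subtask reach the shared backbone, which is the interaction-through-training the abstract refers to.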
Computational Visual Media 2025, 11(6): 1385-1398
Published: 12 December 2025