Language-guided fashion image editing is challenging: fashion image editing is local and requires high precision, while natural language cannot provide precise visual information for guidance. In this paper, we propose LucIE, a novel unsupervised language-guided local image editing method for fashion images. LucIE adopts and modifies a recent text-to-image synthesis network, DF-GAN, as its backbone. However, the synthesis backbone often changes the global structure of the input image, making local image editing impractical. To increase structural consistency between input and edited images, we propose the Content-Preserving Fusion Module (CPFM). Unlike existing fusion modules, CPFM prevents iterative refinement on visual feature maps and accumulates additive modifications on RGB maps. LucIE achieves local image editing explicitly through language-guided image segmentation and mask-guided image blending, while using only image and text pairs. Results on the DeepFashion dataset show that LucIE achieves state-of-the-art results. Compared with previous methods, images generated by LucIE also exhibit fewer artifacts. We provide visualizations and perform ablation studies to validate LucIE and the CPFM. We also demonstrate and analyze the limitations of LucIE to provide a better understanding of the method.
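As a rough illustration of two ideas named in the abstract, accumulating additive modifications on RGB maps and mask-guided image blending, the following NumPy sketch may help. It is not the authors' implementation: the function names `accumulate_rgb_edits` and `blend_with_mask`, the toy arrays, and the assumption that pixel values lie in [0, 1] are all hypothetical. In the actual method the residuals and the mask would be predicted by networks; here they are hand-written constants.

```python
import numpy as np

def accumulate_rgb_edits(image, residuals):
    """Add each per-stage RGB residual to the input image, then clip to [0, 1].

    Hypothetical helper: stands in for accumulating additive modifications
    directly on RGB maps instead of refining feature maps.
    """
    edited = image.astype(np.float32).copy()
    for residual in residuals:
        edited += residual
    return np.clip(edited, 0.0, 1.0)

def blend_with_mask(image, edited, mask):
    """Keep input pixels where mask is 0 and edited pixels where mask is 1."""
    mask = mask[..., None]  # broadcast the (H, W) mask over the RGB channels
    return mask * edited + (1.0 - mask) * image

# Toy 2x2 RGB image with made-up residuals and a mask selecting one pixel.
image = np.full((2, 2, 3), 0.5, dtype=np.float32)
residuals = [
    np.full((2, 2, 3), 0.1, dtype=np.float32),
    np.full((2, 2, 3), 0.2, dtype=np.float32),
]
mask = np.array([[1.0, 0.0], [0.0, 0.0]], dtype=np.float32)

edited = accumulate_rgb_edits(image, residuals)
result = blend_with_mask(image, edited, mask)
```

In this toy case only the top-left pixel takes the edited value, while the remaining pixels are copied from the input unchanged, which is the sense in which blending keeps the edit local.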
This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
To submit a manuscript, please go to https://jcvm.org.