Research Article | Open Access

WDFSR: Normalizing flow based on the wavelet-domain for super-resolution

Zhejiang Gongshang University, Hangzhou 310018, China
Department of Computer Science, University of Durham, Durham DH1 3LE, UK


Abstract

We propose a normalizing flow based on the wavelet framework for super-resolution (SR), called WDFSR. It learns the conditional distribution mapping between low-resolution images in the RGB domain and high-resolution images in the wavelet domain, allowing it to generate high-resolution images of different styles simultaneously. Some flow-based models are sensitive to the dataset, which causes training fluctuations that reduce the mapping ability of the model and weaken generalization. To address this, we designed a method that combines a T-distribution with a QR-decomposition layer; it alleviates the problem while preserving the model's ability to map different distributions and produce higher-quality images. Because good contextual conditional features promote training and strengthen conditional distribution mapping, we also propose a Refinement layer combined with an attention mechanism that refines and fuses the extracted conditional features to improve image quality. Extensive experiments on several SR datasets demonstrate that WDFSR outperforms most general CNN- and flow-based models in both PSNR and perceptual quality. We also demonstrate that our framework works well for other low-level vision tasks, such as low-light enhancement. The pretrained models and source code, with guidance for reference, are available at https://github.com/Lisbegin/WDFSR.
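Two ideas in the abstract can be illustrated concretely. First, the wavelet transform is exactly invertible, so modeling high-resolution images in the wavelet domain loses no information. Second, a Student's t base distribution has heavier tails than a Gaussian, so outlier samples incur a smaller log-likelihood penalty, which is the usual rationale for "Studentising" a flow to stabilize training. The sketch below uses a single-level 2D Haar transform; the function names, the Haar choice, and the normalization are illustrative assumptions, not the paper's exact construction:

```python
import math
import numpy as np

def haar_dwt2(img):
    """Single-level 2D Haar transform of an even-sized grayscale image.
    Returns four half-resolution sub-bands: approximation (LL) and
    horizontal/vertical/diagonal details (LH, HL, HH)."""
    a = img[0::2, 0::2]  # top-left pixel of each 2x2 block
    b = img[0::2, 1::2]
    c = img[1::2, 0::2]
    d = img[1::2, 1::2]
    ll = (a + b + c + d) / 2.0
    lh = (a + b - c - d) / 2.0
    hl = (a - b + c - d) / 2.0
    hh = (a - b - c + d) / 2.0
    return ll, lh, hl, hh

def haar_idwt2(ll, lh, hl, hh):
    """Inverse of haar_dwt2 — reconstruction is exact, so the
    wavelet-domain representation discards nothing."""
    h, w = ll.shape
    img = np.empty((2 * h, 2 * w))
    img[0::2, 0::2] = (ll + lh + hl + hh) / 2.0
    img[0::2, 1::2] = (ll + lh - hl - hh) / 2.0
    img[1::2, 0::2] = (ll - lh + hl - hh) / 2.0
    img[1::2, 1::2] = (ll - lh - hl + hh) / 2.0
    return img

def t_logpdf(x, nu):
    """Log-density of a standard Student's t with nu degrees of freedom.
    For finite nu the tails are heavier than a Gaussian's; as nu grows
    the density converges to the standard normal."""
    return (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
            - 0.5 * math.log(nu * math.pi)
            - (nu + 1) / 2 * math.log1p(x * x / nu))
```

A flow trained with `t_logpdf` as its base log-likelihood reacts less sharply to rare latent values than one trained under a Gaussian, which matches the abstract's claim of reduced training fluctuation on difficult datasets.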

Computational Visual Media
Pages 381-404
Cite this article:
Song C, Li S, Li FWB, et al. WDFSR: Normalizing flow based on the wavelet-domain for super-resolution. Computational Visual Media, 2025, 11(2): 381-404. https://doi.org/10.26599/CVM.2025.9450374


Received: 12 April 2023
Accepted: 22 August 2023
Published: 08 May 2025
© The Author(s) 2025.

This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.

The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
