Video snow removal has tremendous potential for enhancing video quality and boosting the performance of computer vision tasks. Recently, Transformers have gained attention for their self-attention mechanism. However, the memory consumption of self-attention is considerable, which limits its application in high-resolution video restoration. In this paper, we propose an efficient spatio-temporal Transformer for video desnowing, which uses spatio-temporal sequence attention to capture intra-frame spatial information and inter-frame temporal information in parallel, with much lower memory consumption than standard self-attention. Additionally, we mitigate the impact of snowflake occlusion on video frame alignment by leveraging an atmospheric scattering model. Furthermore, we adopt the concept of Neural Representation for Videos (NeRV) and use a recovery NeRV module to reconstruct compressed videos after multi-resolution feature extraction, thereby further reducing computational cost. Extensive experiments demonstrate that the model achieves superior performance in video snow removal while significantly reducing the computational resources required.
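To give a rough sense of why splitting attention into spatial and temporal sequences saves memory, the sketch below counts attention-score entries per head for a short clip. The abstract does not specify how the spatio-temporal sequence attention is structured, so the factorized layout here (per-frame spatial attention plus per-position temporal attention) is an assumption for illustration, not the authors' design; the function name and the clip size are hypothetical.

```python
def attention_entries(frames: int, height: int, width: int) -> dict:
    """Count attention-score entries (per head) for two token layouts.

    Illustrative only: the factorized layout is an assumed stand-in,
    not the attention scheme described in the paper.
    """
    spatial_tokens = height * width
    all_tokens = frames * spatial_tokens

    # Joint self-attention: every token in the clip attends to every other token.
    joint = all_tokens ** 2

    # Assumed factorization: each frame attends within itself (spatial),
    # and each spatial position attends across frames (temporal).
    spatial = frames * spatial_tokens ** 2
    temporal = spatial_tokens * frames ** 2

    return {"joint": joint, "factorized": spatial + temporal}


# Hypothetical clip: 5 frames of 64 x 64 feature tokens.
counts = attention_entries(frames=5, height=64, width=64)
ratio = counts["joint"] / counts["factorized"]
```

Under these assumptions the joint score matrix is about five times larger than the factorized one. In general the ratio is T·S / (S + T) for T frames and S spatial tokens, which is close to T whenever S is much larger than T, so the saving grows with the number of frames.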
The articles published in this open access journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).