Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Zhihui Wei

Zero-Shot Solving of Imaging Inverse Problems via Noise-Refined Likelihood Guided Diffusion Models

Jun 16, 2025

Zhen Wang, Hongyi Liu, Zhihui Wei

Abstract:Diffusion models have achieved remarkable success in imaging inverse problems owing to their powerful generative capabilities. However, existing approaches typically rely on models trained for specific degradation types, limiting their generalizability to various degradation scenarios. To address this limitation, we propose a zero-shot framework capable of handling various imaging inverse problems without model retraining. We introduce a likelihood-guided noise refinement mechanism that derives a closed-form approximation of the likelihood score, simplifying score estimation and avoiding expensive gradient computations. This estimated score is subsequently utilized to refine the model-predicted noise, thereby better aligning the restoration process with the generative framework of diffusion models. In addition, we integrate the Denoising Diffusion Implicit Models (DDIM) sampling strategy to further improve inference efficiency. The proposed mechanism can be applied to both optimization-based and sampling-based schemes, providing an effective and flexible zero-shot solution for imaging inverse problems. Extensive experiments demonstrate that our method achieves superior performance across multiple inverse problems, particularly in compressive sensing, delivering high-quality reconstructions even at an extremely low sampling rate (5%).

Via

Access Paper or Ask Questions

Self-Learning Hyperspectral and Multispectral Image Fusion via Adaptive Residual Guided Subspace Diffusion Model

May 17, 2025

Jian Zhu, He Wang, Yang Xu, Zebin Wu, Zhihui Wei

Abstract:Hyperspectral and multispectral image (HSI-MSI) fusion involves combining a low-resolution hyperspectral image (LR-HSI) with a high-resolution multispectral image (HR-MSI) to generate a high-resolution hyperspectral image (HR-HSI). Most deep learning-based methods for HSI-MSI fusion rely on large amounts of hyperspectral data for supervised training, which is often scarce in practical applications. In this paper, we propose a self-learning Adaptive Residual Guided Subspace Diffusion Model (ARGS-Diff), which only utilizes the observed images without any extra training data. Specifically, as the LR-HSI contains spectral information and the HR-MSI contains spatial information, we design two lightweight spectral and spatial diffusion models to separately learn the spectral and spatial distributions from them. Then, we use these two models to reconstruct HR-HSI from two low-dimensional components, i.e, the spectral basis and the reduced coefficient, during the reverse diffusion process. Furthermore, we introduce an Adaptive Residual Guided Module (ARGM), which refines the two components through a residual guided function at each sampling step, thereby stabilizing the sampling process. Extensive experimental results demonstrate that ARGS-Diff outperforms existing state-of-the-art methods in terms of both performance and computational efficiency in the field of HSI-MSI fusion. Code is available at https://github.com/Zhu1116/ARGS-Diff.

* cvpr

Via

Access Paper or Ask Questions

Explicit Change Relation Learning for Change Detection in VHR Remote Sensing Images

Nov 14, 2023

Dalong Zheng, Zebin Wu, Jia Liu, Chih-Cheng Hung, Zhihui Wei

Figure 1 for Explicit Change Relation Learning for Change Detection in VHR Remote Sensing Images

Figure 2 for Explicit Change Relation Learning for Change Detection in VHR Remote Sensing Images

Figure 3 for Explicit Change Relation Learning for Change Detection in VHR Remote Sensing Images

Figure 4 for Explicit Change Relation Learning for Change Detection in VHR Remote Sensing Images

Abstract:Change detection has always been a concerned task in the interpretation of remote sensing images. It is essentially a unique binary classification task with two inputs, and there is a change relationship between these two inputs. At present, the mining of change relationship features is usually implicit in the network architectures that contain single-branch or two-branch encoders. However, due to the lack of artificial prior design for change relationship features, these networks cannot learn enough change semantic information and lose more accurate change detection performance. So we propose a network architecture NAME for the explicit mining of change relation features. In our opinion, the change features of change detection should be divided into pre-changed image features, post-changed image features and change relation features. In order to fully mine these three kinds of change features, we propose the triple branch network combining the transformer and convolutional neural network (CNN) to extract and fuse these change features from two perspectives of global information and local information, respectively. In addition, we design the continuous change relation (CCR) branch to further obtain the continuous and detail change relation features to improve the change discrimination capability of the model. The experimental results show that our network performs better, in terms of F1, IoU, and OA, than those of the existing advanced networks for change detection on four public very high-resolution (VHR) remote sensing datasets. Our source code is available at https://github.com/DalongZ/NAME.

Via

Access Paper or Ask Questions

SwinV2DNet: Pyramid and Self-Supervision Compounded Feature Learning for Remote Sensing Images Change Detection

Aug 22, 2023

Dalong Zheng, Zebin Wu, Jia Liu, Zhihui Wei

Abstract:Among the current mainstream change detection networks, transformer is deficient in the ability to capture accurate low-level details, while convolutional neural network (CNN) is wanting in the capacity to understand global information and establish remote spatial relationships. Meanwhile, both of the widely used early fusion and late fusion frameworks are not able to well learn complete change features. Therefore, based on swin transformer V2 (Swin V2) and VGG16, we propose an end-to-end compounded dense network SwinV2DNet to inherit the advantages of both transformer and CNN and overcome the shortcomings of existing networks in feature learning. Firstly, it captures the change relationship features through the densely connected Swin V2 backbone, and provides the low-level pre-changed and post-changed features through a CNN branch. Based on these three change features, we accomplish accurate change detection results. Secondly, combined with transformer and CNN, we propose mixed feature pyramid (MFP) which provides inter-layer interaction information and intra-layer multi-scale information for complete feature learning. MFP is a plug and play module which is experimentally proven to be also effective in other change detection networks. Further more, we impose a self-supervision strategy to guide a new CNN branch, which solves the untrainable problem of the CNN branch and provides the semantic change information for the features of encoder. The state-of-the-art (SOTA) change detection scores and fine-grained change maps were obtained compared with other advanced methods on four commonly used public remote sensing datasets. The code is available at https://github.com/DalongZ/SwinV2DNet.

Via

Access Paper or Ask Questions

ReAFFPN: Rotation-equivariant Attention Feature Fusion Pyramid Networks for Aerial Object Detection

Oct 17, 2022

Chongyu Sun, Yang Xu, Zebin Wu, Zhihui Wei

Figure 1 for ReAFFPN: Rotation-equivariant Attention Feature Fusion Pyramid Networks for Aerial Object Detection

Figure 2 for ReAFFPN: Rotation-equivariant Attention Feature Fusion Pyramid Networks for Aerial Object Detection

Figure 3 for ReAFFPN: Rotation-equivariant Attention Feature Fusion Pyramid Networks for Aerial Object Detection

Figure 4 for ReAFFPN: Rotation-equivariant Attention Feature Fusion Pyramid Networks for Aerial Object Detection

Abstract:This paper proposes a Rotation-equivariant Attention Feature Fusion Pyramid Networks for Aerial Object Detection named ReAFFPN. ReAFFPN aims at improving the effect of rotation-equivariant features fusion between adjacent layers which suffers from the semantic and scale discontinuity. Due to the particularity of rotational equivariant convolution, general methods are unable to achieve their original effect while ensuring rotation equivariance of the network. To solve this problem, we design a new Rotation-equivariant Channel Attention which has the ability to both generate channel attention and keep rotation equivariance. Then we embed a new channel attention function into Iterative Attentional Feature Fusion (iAFF) module to realize Rotation-equivariant Attention Feature Fusion. Experimental results demonstrate that ReAFFPN achieves a better rotation-equivariant feature fusion ability and significantly improve the accuracy of the Rotation-equivariant Convolutional Networks.

* IGARSS, 4 pages, 3 figures

Via

Access Paper or Ask Questions

Pyramidal Dense Attention Networks for Lightweight Image Super-Resolution

Jun 13, 2021

Huapeng Wu, Jie Gui, Jun Zhang, James T. Kwok, Zhihui Wei

Figure 1 for Pyramidal Dense Attention Networks for Lightweight Image Super-Resolution

Figure 2 for Pyramidal Dense Attention Networks for Lightweight Image Super-Resolution

Figure 3 for Pyramidal Dense Attention Networks for Lightweight Image Super-Resolution

Figure 4 for Pyramidal Dense Attention Networks for Lightweight Image Super-Resolution

Abstract:Recently, deep convolutional neural network methods have achieved an excellent performance in image superresolution (SR), but they can not be easily applied to embedded devices due to large memory cost. To solve this problem, we propose a pyramidal dense attention network (PDAN) for lightweight image super-resolution in this paper. In our method, the proposed pyramidal dense learning can gradually increase the width of the densely connected layer inside a pyramidal dense block to extract deep features efficiently. Meanwhile, the adaptive group convolution that the number of groups grows linearly with dense convolutional layers is introduced to relieve the parameter explosion. Besides, we also present a novel joint attention to capture cross-dimension interaction between the spatial dimensions and channel dimension in an efficient way for providing rich discriminative feature representations. Extensive experimental results show that our method achieves superior performance in comparison with the state-of-the-art lightweight SR methods.

Via

Access Paper or Ask Questions

Feedback Pyramid Attention Networks for Single Image Super-Resolution

Jun 13, 2021

Huapeng Wu, Jie Gui, Jun Zhang, James T. Kwok, Zhihui Wei

Figure 1 for Feedback Pyramid Attention Networks for Single Image Super-Resolution

Figure 2 for Feedback Pyramid Attention Networks for Single Image Super-Resolution

Figure 3 for Feedback Pyramid Attention Networks for Single Image Super-Resolution

Figure 4 for Feedback Pyramid Attention Networks for Single Image Super-Resolution

Abstract:Recently, convolutional neural network (CNN) based image super-resolution (SR) methods have achieved significant performance improvement. However, most CNN-based methods mainly focus on feed-forward architecture design and neglect to explore the feedback mechanism, which usually exists in the human visual system. In this paper, we propose feedback pyramid attention networks (FPAN) to fully exploit the mutual dependencies of features. Specifically, a novel feedback connection structure is developed to enhance low-level feature expression with high-level information. In our method, the output of each layer in the first stage is also used as the input of the corresponding layer in the next state to re-update the previous low-level filters. Moreover, we introduce a pyramid non-local structure to model global contextual information in different scales and improve the discriminative representation of the network. Extensive experimental results on various datasets demonstrate the superiority of our FPAN in comparison with the state-of-the-art SR methods.

Via

Access Paper or Ask Questions

Multi-grained Attention Networks for Single Image Super-Resolution

Sep 29, 2019

Huapeng Wu, Zhengxia Zou, Jie Gui, Wen-Jun Zeng, Jieping Ye, Jun Zhang, Hongyi Liu, Zhihui Wei

Figure 1 for Multi-grained Attention Networks for Single Image Super-Resolution

Figure 2 for Multi-grained Attention Networks for Single Image Super-Resolution

Figure 3 for Multi-grained Attention Networks for Single Image Super-Resolution

Figure 4 for Multi-grained Attention Networks for Single Image Super-Resolution

Abstract:Deep Convolutional Neural Networks (CNN) have drawn great attention in image super-resolution (SR). Recently, visual attention mechanism, which exploits both of the feature importance and contextual cues, has been introduced to image SR and proves to be effective to improve CNN-based SR performance. In this paper, we make a thorough investigation on the attention mechanisms in a SR model and shed light on how simple and effective improvements on these ideas improve the state-of-the-arts. We further propose a unified approach called "multi-grained attention networks (MGAN)" which fully exploits the advantages of multi-scale and attention mechanisms in SR tasks. In our method, the importance of each neuron is computed according to its surrounding regions in a multi-grained fashion and then is used to adaptively re-scale the feature responses. More importantly, the "channel attention" and "spatial attention" strategies in previous methods can be essentially considered as two special cases of our method. We also introduce multi-scale dense connections to extract the image features at multiple scales and capture the features of different layers through dense skip connections. Ablation studies on benchmark datasets demonstrate the effectiveness of our method. In comparison with other state-of-the-art SR methods, our method shows the superiority in terms of both accuracy and model size.

Via

Access Paper or Ask Questions