Image & Video Restoration

Orthogonal Decoupling Contrastive Regularization: Toward Uncorrelated Feature Decoupling for Unpaired Image Restoration

Author: Zhongze Wang, Jingchao Peng, Haitao Zhao, Lujian Yao, Kaijie Zhao

Year: 2026

Publication: IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)

Unpaired image restoration (UIR) is a significant task due to the difficulty of acquiring paired degraded/clear images with identical backgrounds. In this paper, we propose a novel UIR method based on the assumption that an image contains both degradation-related features, which affect the level of degradation, and degradation-unrelated features, such as texture and semantic information. Our method aims to ensure that the degradation-related features of the restoration result closely resemble those of the clear image, while the degradation-unrelated features align with the input degraded image. Specifically, we introduce a Feature Orthogonalization Module optimized on Stiefel manifold to decouple image features, ensuring feature uncorrelation. A task-driven Depth-wise Feature Classifier is proposed to assign weights to uncorrelated features based on their relevance to degradation prediction. To avoid the dependence of the training process on the quality of the clear image in a single pair of input data, we propose to maintain several degradation-related proxies describing the degradation level of clear images to enhance the model’s robustness. Finally, a weighted PatchNCE loss is introduced to pull degradation-related features in the output image toward those of clear images, while bringing degradation-unrelated features close to those of the degraded input.
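The weighted PatchNCE loss described above can be sketched as a channel-weighted InfoNCE objective. The function below is a minimal NumPy illustration under assumed conventions (per-channel weights from the depth-wise classifier, a temperature `tau`), not the paper's implementation:

```python
import numpy as np

def weighted_patchnce(query, pos, negs, weights, tau=0.07):
    """Channel-weighted InfoNCE sketch.

    query, pos: (C,) feature vectors; negs: (N, C) negative features;
    weights: (C,) per-channel relevance scores (hypothetical output of the
    depth-wise feature classifier) that rescale each feature dimension
    before similarities are computed.
    """
    w = weights / (weights.sum() + 1e-8)        # normalize channel weights
    q, p, n = query * w, pos * w, negs * w      # emphasize relevant channels
    l_pos = q @ p / tau                          # positive logit (scalar)
    l_neg = (n @ q) / tau                        # negative logits, shape (N,)
    logits = np.concatenate([[l_pos], l_neg])
    # cross-entropy with the positive at index 0 (stable log-sum-exp)
    m = logits.max()
    return -l_pos + np.log(np.exp(logits - m).sum()) + m
```

Pulling degradation-related features toward the clear image then amounts to using clear-image features as the positive; degradation-unrelated channels would instead take the degraded input as positive.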

Paper
Image Super-Resolution

Local Texture Pattern Estimation for Image Detail Super-Resolution

Author: Fan Fan, Yang Zhao, Yuan Chen, Nannan Li, Wei Jia, Ronggang Wang

Year: 2025

Publication: IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)

In the image super-resolution (SR) field, recovering missing high-frequency textures has always been an important goal. However, deep SR networks based on pixel-level constraints tend to focus on stable edge details and cannot effectively restore random high-frequency textures. Only with the emergence of the generative adversarial network (GAN) did SR models achieve realistic texture restoration, and GAN-based models quickly became the mainstream approach to texture SR. However, GAN-based SR models still have some drawbacks, such as relying on a large number of parameters and generating fake textures that are inconsistent with the ground truth. Inspired by traditional texture analysis research, this paper proposes a novel SR network based on local texture pattern estimation (LTPE), which can restore fine high-frequency texture details without a GAN. A differentiable local texture operator is first designed to extract local texture structures, and a texture enhancement branch is used to predict the high-resolution local texture distribution based on the LTPE. Then, the predicted high-resolution texture structure map can be used as a reference for the texture fusion SR branch to obtain high-quality texture reconstruction. Finally, L1 loss and Gram loss are used simultaneously to optimize the network. Experimental results demonstrate that the proposed method can effectively recover high-frequency textures without using GAN structures. In addition, the restored high-frequency details are constrained by the local texture distribution, thereby reducing significant errors in texture generation.
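A differentiable local texture operator in the spirit of a local binary pattern can be sketched by replacing the hard neighbor-versus-center comparison with a sigmoid, so gradients can flow; this soft-LBP form and the sharpness parameter `beta` are illustrative assumptions, not the paper's exact operator:

```python
import numpy as np

def soft_lbp(img, beta=10.0):
    """Soft (differentiable) local-binary-pattern map for a (H, W) image.

    Classic LBP thresholds each 8-neighbour against the centre pixel and
    packs the bits into a code; here the hard sign is relaxed to a sigmoid
    of the neighbour-centre difference, scaled by beta.
    """
    H, W = img.shape
    pad = np.pad(img, 1, mode='edge')
    # 8-neighbour offsets, clockwise from top-left
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = np.zeros((H, W))
    for bit, (dy, dx) in enumerate(offsets):
        nb = pad[1 + dy:1 + dy + H, 1 + dx:1 + dx + W]
        soft_bit = 1.0 / (1.0 + np.exp(-beta * (nb - img)))  # soft comparison
        code += soft_bit * (2 ** bit)                         # weighted binary code
    return code
```

On a constant image every soft comparison equals 0.5, so the code is 127.5 everywhere, which makes the operator easy to sanity-check.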

Paper
Code
Image Super-Resolution

Enhanced Generative Structure Prior for Chinese Text Image Super-Resolution

Author: Xiaoming Li, Wangmeng Zuo, Chen Change Loy

Year: 2025

Publication: IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)

Faithful text image super-resolution (SR) is challenging because each character has a unique structure and usually exhibits diverse font styles and layouts. While existing methods primarily focus on English text, less attention has been paid to more complex scripts like Chinese. In this paper, we introduce a high-quality text image SR framework designed to restore the precise strokes of low-resolution (LR) Chinese characters. Unlike methods that rely on character recognition priors to regularize the SR task, we propose a novel structure prior that offers structure-level guidance to enhance visual quality. Our framework incorporates this structure prior within a StyleGAN model, leveraging its generative capabilities for restoration. To maintain the integrity of character structures while accommodating various font styles and layouts, we implement a codebook-based mechanism that restricts the generative space of StyleGAN. Each code in the codebook represents the structure of a specific character, while the vector w in StyleGAN controls the character’s style, including typeface, orientation, and location. Through the collaborative interaction between the codebook and style, we generate a high-resolution structure prior that aligns with LR characters both spatially and structurally. Experiments demonstrate that this structure prior provides robust, character-specific guidance, enabling the accurate restoration of clear strokes in degraded characters, even for real-world LR Chinese text with irregular layouts.
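The codebook-based restriction of the generative space can be sketched as a nearest-neighbour lookup, as in vector quantization: each code stands for one character's structure, and a feature extracted from the LR input is snapped to the closest code. This is a minimal sketch of the mechanism, not the paper's architecture:

```python
import numpy as np

def lookup_structure_code(feat, codebook):
    """Nearest-neighbour codebook lookup (VQ-style sketch).

    feat: (D,) feature vector from the LR character;
    codebook: (K, D) array, one structure code per character.
    Returns the index of the closest code and the code itself.
    """
    d2 = ((codebook - feat) ** 2).sum(axis=1)  # squared distance to each code
    idx = int(d2.argmin())
    return idx, codebook[idx]
```

In the paper's framework the selected code would then interact with the StyleGAN style vector w, which controls typeface, orientation, and location.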

Paper
Code
Image Super-Resolution

Test-Time Training for Hyperspectral Image Super-Resolution

Author: Ke Li, Luc Van Gool, Dengxin Dai

Year: 2025

Publication: IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)

Progress on hyperspectral image (HSI) super-resolution (SR) still lags behind research on RGB image SR. HSIs usually have a large number of spectral bands, so accurately modeling spectral band interaction for HSI SR is hard. Training data for HSI SR is also hard to obtain, so datasets are usually rather small. In this work, we propose a new test-time training method to tackle this problem. Specifically, a novel self-training framework is developed, in which more accurate pseudo-labels and more accurate LR-HR relationships are generated so that the model can be further trained on them to improve performance. To better support our test-time training method, we also propose a new network architecture that learns HSI SR without modeling spectral band interaction, and a new data augmentation method, Spectral Mixup, to increase the diversity of the training data at test time. We further collect a new HSI dataset with a diverse set of images of interesting objects, ranging from food and vegetation to materials and general scenes. Extensive experiments on multiple datasets show that our method significantly improves the performance of pre-trained models after test-time training and significantly outperforms competing methods for HSI SR.
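A Spectral-Mixup-style augmentation can be sketched as forming new bands from random convex combinations of the original spectral bands; the Dirichlet mixing weights below are an assumed form for illustration, not necessarily the paper's exact scheme:

```python
import numpy as np

def spectral_mixup(hsi, rng, alpha=0.2):
    """Sketch of a spectral mixing augmentation.

    hsi: (B, H, W) hyperspectral cube with B bands.
    Each output band is a convex combination of the input bands, with
    weights drawn from a Dirichlet distribution (rows of M sum to 1).
    """
    B = hsi.shape[0]
    M = rng.dirichlet(np.full(B, alpha), size=B)   # (B, B) mixing matrix
    return np.einsum('ij,jhw->ihw', M, hsi)        # mix along the band axis
```

Because each new band is a convex combination, per-pixel values stay within the range spanned by the original bands, so the augmentation cannot push spectra out of distribution.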

Paper
Image Super-Resolution

Rotation Equivariant Arbitrary-Scale Image Super-Resolution

Author: Qi Xie, Jiahong Fu, Zongben Xu, Deyu Meng

Year: 2025

Publication: IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)

The arbitrary-scale image super-resolution (ASISR), a recent popular topic in computer vision, aims to achieve arbitrary-scale high-resolution recoveries from a low-resolution input image. This task is realized by representing the image as a continuous implicit function through two fundamental modules, a deep-network-based encoder and an implicit neural representation (INR) module. Despite achieving notable progress, a crucial challenge of such a highly ill-posed setting is that many common geometric patterns, such as repetitive textures, edges, or shapes, are seriously warped and deformed in the low-resolution images, naturally leading to unexpected artifacts appearing in their high-resolution recoveries. Embedding rotation equivariance into the ASISR network is thus necessary, as it has been widely demonstrated that this enhancement enables the recovery to faithfully maintain the original orientations and structural integrity of geometric patterns underlying the input image. Motivated by this, we make efforts to construct a rotation equivariant ASISR method in this study. Specifically, we elaborately redesign the basic architectures of INR and encoder modules, incorporating intrinsic rotation equivariance capabilities beyond those of conventional ASISR networks. Through such amelioration, the ASISR network can, for the first time, be implemented with end-to-end rotational equivariance maintained from input to output. We also provide a solid theoretical analysis to evaluate its intrinsic equivariance error, demonstrating its inherent nature of embedding such an equivariance structure. The superiority of the proposed method is substantiated by experiments conducted on both simulated and real datasets. We also validate that the proposed framework can be readily integrated into current ASISR methods in a plug & play manner to further enhance their performance.
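Rotation equivariance of an image operator f means f(rot(x)) = rot(f(x)); the paper's theoretical analysis bounds the intrinsic equivariance error of its network. A simple numerical check of this property over 90-degree rotations (a sketch, restricted to exact grid rotations) looks like:

```python
import numpy as np

def rot90_equivariance_error(f, x):
    """Max |f(rot(x)) - rot(f(x))| over the three non-trivial 90-degree
    rotations. An exactly rotation-equivariant operator yields 0; a generic
    operator yields a positive error."""
    errs = []
    for k in range(1, 4):
        a = f(np.rot90(x, k))          # rotate, then apply the operator
        b = np.rot90(f(x), k)          # apply the operator, then rotate
        errs.append(np.abs(a - b).max())
    return max(errs)
```

For continuous rotation angles the check would additionally need interpolation, which is where the paper's theoretical error analysis becomes relevant.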

Paper
Code
Image Super-Resolution

Towards Lightweight Super-Resolution With Dual Regression Learning

Author: Yong Guo, Mingkui Tan, Zeshuai Deng, Jingdong Wang, Qi Chen, Jiezhang Cao

Year: 2025

Publication: IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)

Deep neural networks have exhibited remarkable performance in image super-resolution (SR) tasks by learning a mapping from low-resolution (LR) images to high-resolution (HR) images. However, the SR problem is typically ill-posed, and existing methods come with several limitations. First, the possible mapping space of SR can be extremely large, since many different HR images can be super-resolved from the same LR image. As a result, it is hard to directly learn a promising SR mapping from such a large space. Second, it is often inevitable to develop very large models with extremely high computational cost to yield promising SR performance. In practice, one can use model compression techniques to obtain compact models by reducing model redundancy. Nevertheless, it is hard for existing model compression methods to accurately identify the redundant components due to the extremely large SR mapping space. To alleviate the first challenge, we propose a dual regression learning scheme to reduce the space of possible SR mappings. Specifically, in addition to the mapping from LR to HR images, we learn an additional dual regression mapping to estimate the downsampling kernel and reconstruct LR images. In this way, the dual mapping acts as a constraint to reduce the space of possible mappings. To address the second challenge, we propose a dual regression compression (DRC) method to reduce model redundancy at both the layer level and the channel level based on channel pruning. Specifically, we first develop a channel number search method that minimizes the dual regression loss to determine the redundancy of each layer. Given the searched channel numbers, we further exploit the dual regression manner to evaluate the importance of channels and prune the redundant ones. Extensive experiments show the effectiveness of our method in obtaining accurate and efficient SR models.
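The dual regression constraint can be sketched as a two-term objective: a primal loss on the LR-to-HR prediction plus a dual loss that maps the prediction back to LR space. The functions and the weighting `lam` below are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

def dual_regression_loss(lr, hr_gt, upsample, downsample, lam=0.1):
    """Primal + dual regression objective (sketch).

    lr: observed LR image; hr_gt: HR ground truth;
    upsample: the primal LR->HR model; downsample: the dual HR->LR model.
    The dual term penalizes SR outputs that are inconsistent with the
    observed LR image, shrinking the space of admissible mappings.
    """
    sr = upsample(lr)                              # primal prediction
    primal = np.abs(sr - hr_gt).mean()             # LR -> HR fidelity
    dual = np.abs(downsample(sr) - lr).mean()      # HR -> LR consistency
    return primal + lam * dual
```

With toy nearest-neighbour upsampling and 2x2 average-pool downsampling, a perfectly consistent pair gives zero loss, which is the sanity check below.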

Paper
Code
Image Super-Resolution

Dual-Level Cross-Modality Neural Architecture Search for Guided Image Super-Resolution

Author: Zhiwei Zhong, Xianming Liu, Junjun Jiang, Debin Zhao, Shiqi Wang

Year: 2025

Publication: IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)

Guided image super-resolution (GISR) aims to reconstruct a high-resolution (HR) target image from its low-resolution (LR) counterpart with the guidance of an HR image from another modality. Existing learning-based methods typically employ symmetric two-stream networks to extract features from both the guidance and target images, and then fuse these features at either an early or late stage through manually designed modules to facilitate joint inference. Despite their significant performance, these methods still face several issues: i) the symmetric architectures treat images from different modalities equally, which may overlook the inherent differences between them; ii) lower-level features contain detailed information while higher-level features capture semantic structures, yet determining which layers should be fused and which fusion operations should be selected remains unresolved; iii) most methods achieve performance gains at the cost of increased computational complexity, so balancing the trade-off between computational complexity and model performance remains a critical issue. To address these issues, we propose a Dual-level Cross-modality Neural Architecture Search (DCNAS) framework to automatically design efficient GISR models. Specifically, we propose a dual-level search space that enables the NAS algorithm to identify effective architectures and optimal fusion strategies. Moreover, we propose a supernet training strategy that employs a performance predictor, trained with a pairwise ranking loss, to guide the supernet training process. To the best of our knowledge, this is the first attempt to introduce a NAS algorithm into GISR tasks. Extensive experiments demonstrate that the discovered model family, DCNAS-Tiny and DCNAS, achieves significant improvements on several GISR tasks, including guided depth map super-resolution, guided saliency map super-resolution, guided thermal image super-resolution, and pan-sharpening. Furthermore, we analyze the architectures searched by our method and provide new insights for future research.
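A pairwise ranking loss for a performance predictor can be sketched as a margin hinge over all ordered pairs: the predictor only needs to rank candidate architectures correctly, not regress their exact accuracy. The margin value and O(n^2) pair enumeration below are illustrative assumptions:

```python
import numpy as np

def pairwise_ranking_loss(pred, target, margin=0.1):
    """Margin-based pairwise ranking loss (sketch).

    pred: predicted scores for n candidate architectures;
    target: their true performances. For every pair with target[i] >
    target[j], pred[i] should exceed pred[j] by at least `margin`.
    """
    loss, pairs = 0.0, 0
    n = len(pred)
    for i in range(n):
        for j in range(n):
            if target[i] > target[j]:
                loss += max(0.0, margin - (pred[i] - pred[j]))  # hinge term
                pairs += 1
    return loss / max(pairs, 1)
```

A predictor that orders candidates correctly with margins wider than `margin` incurs zero loss, even if its absolute scores are far from the true performances.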

Paper
Image Super-Resolution

Self-Supervised Learning for Real-World Super-Resolution From Dual and Multiple Zoomed Observations

Author: Zhilu Zhang, Ruohao Wang, Hongzhi Zhang, Wangmeng Zuo

Year: 2025

Publication: IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)

In this paper, we consider two challenging issues in reference-based super-resolution (RefSR) for smartphones: (i) how to choose a proper reference image, and (ii) how to learn RefSR in a self-supervised manner. In particular, we propose a novel self-supervised learning approach for real-world RefSR from observations at dual and multiple camera zooms. Firstly, considering the popularity of multiple cameras in modern smartphones, the more zoomed (telephoto) image can be naturally leveraged as the reference to guide the super-resolution (SR) of the lesser zoomed (ultra-wide) image, which gives us a chance to learn a deep network that performs SR from the dual zoomed observations (DZSR). Secondly, for self-supervised learning of DZSR, we take the telephoto image instead of an additional high-resolution image as the supervision information, and select a center patch from it as the reference to super-resolve the corresponding ultra-wide image patch. To mitigate the effect of the misalignment between the ultra-wide low-resolution (LR) patch and the telephoto ground-truth (GT) image during training, we propose a two-stage alignment method, including patch-based optical flow alignment and auxiliary-LR guiding alignment. To generate visually pleasing results, we present a local overlapped sliced Wasserstein loss. Furthermore, we take multiple zoomed observations to explore self-supervised RefSR, and present a progressive fusion scheme for the effective utilization of reference images. Experiments show that our methods achieve better quantitative and qualitative performance than state-of-the-art approaches.
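The core of a sliced Wasserstein loss can be sketched in a few lines: project two sets of feature vectors onto random directions, sort the 1-D projections (sorting solves 1-D optimal transport), and average the squared differences. This is a plain sliced Wasserstein sketch; the paper's local overlapped variant would apply it within overlapping spatial patches, which is not reproduced here:

```python
import numpy as np

def sliced_wasserstein(a, b, rng, n_proj=32):
    """Sliced Wasserstein-2 distance (squared) between two point sets.

    a, b: (N, D) arrays of feature vectors (same N).
    Each random unit direction gives a 1-D transport problem solved by
    sorting; the results are averaged over n_proj directions.
    """
    total = 0.0
    for _ in range(n_proj):
        v = rng.standard_normal(a.shape[1])
        v /= np.linalg.norm(v)                     # random unit direction
        pa, pb = np.sort(a @ v), np.sort(b @ v)    # 1-D optimal matching
        total += ((pa - pb) ** 2).mean()
    return total / n_proj
```

Unlike a pixelwise loss, this compares distributions, so it tolerates the small misalignments that remain after the two-stage alignment.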

Paper
Code
Image Super-Resolution

A Generalized Tensor Formulation for Hyperspectral Image Super-Resolution Under General Spatial Blurring

Author: Yinjian Wang, Wei Li, Yuanyuan Gui, Qian Du, James E. Fowler

Year: 2025

Publication: IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)

Hyperspectral super-resolution is commonly accomplished by fusing a hyperspectral image of low spatial resolution with a multispectral image of high spatial resolution, and many tensor-based approaches to this task have been proposed recently. Yet such tensor-based methods assume that the spatial-blurring operation that creates the observed hyperspectral image from the desired super-resolved image is separable into independent horizontal and vertical blurring. Recent work has argued that such separable spatial degradation is ill-equipped to model the operation of real sensors, which may exhibit, for example, anisotropic blurring. To accommodate this fact, a generalized tensor formulation based on a Kronecker decomposition is proposed to handle any general spatial-degradation matrix, including those that are not separable as previously assumed. Analysis of the generalized formulation reveals conditions under which exact recovery of the desired super-resolved image is guaranteed, and a practical algorithm for such recovery, driven by a blockwise-group-sparsity regularization, is proposed. Extensive experimental results demonstrate that the proposed generalized tensor approach outperforms not only traditional matrix-based techniques but also state-of-the-art tensor-based methods; the gains with respect to the latter are especially significant in cases of anisotropic spatial blurring.
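The separability assumption being relaxed can be made concrete: applying independent vertical and horizontal blur operators to an image is equivalent to a Kronecker-structured matrix acting on its vectorization, whereas a general (e.g., anisotropic) degradation matrix admits no such factorization. A minimal NumPy sketch of the separable case:

```python
import numpy as np

def separable_blur(X, Kv, Kh):
    """Separable spatial degradation of an (H, W) image X.

    Kv: (H, H) vertical blur operator, Kh: (W, W) horizontal blur operator.
    On the row-major vectorization of X this equals (Kv kron Kh) @ vec(X) --
    exactly the Kronecker structure that the generalized formulation relaxes
    to handle arbitrary degradation matrices.
    """
    return Kv @ X @ Kh.T
```

The identity vec(Kv X Kh^T) = (Kv ⊗ Kh) vec(X) (row-major vec) is what the test below verifies numerically.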

Paper