Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Pier Luigi Dragotti

Fellow, IEEE

LotteryCodec: Searching the Implicit Representation in a Random Network for Low-Complexity Image Compression

Jul 01, 2025

Haotian Wu, Gongpu Chen, Pier Luigi Dragotti, Deniz Gündüz

Abstract:We introduce and validate the lottery codec hypothesis, which states that untrained subnetworks within randomly initialized networks can serve as synthesis networks for overfitted image compression, achieving rate-distortion (RD) performance comparable to trained networks. This hypothesis leads to a new paradigm for image compression by encoding image statistics into the network substructure. Building on this hypothesis, we propose LotteryCodec, which overfits a binary mask to an individual image, leveraging an over-parameterized and randomly initialized network shared by the encoder and the decoder. To address over-parameterization challenges and streamline subnetwork search, we develop a rewind modulation mechanism that improves the RD performance. LotteryCodec outperforms VTM and sets a new state-of-the-art in single-image compression. LotteryCodec also enables adaptive decoding complexity through adjustable mask ratios, offering flexible compression solutions for diverse device constraints and application requirements.

* International Conference on Machine Learning (2025)

Via

Access Paper or Ask Questions

LatentINDIGO: An INN-Guided Latent Diffusion Algorithm for Image Restoration

May 19, 2025

Di You, Daniel Siromani, Pier Luigi Dragotti

Abstract:There is a growing interest in the use of latent diffusion models (LDMs) for image restoration (IR) tasks due to their ability to model effectively the distribution of natural images. While significant progress has been made, there are still key challenges that need to be addressed. First, many approaches depend on a predefined degradation operator, making them ill-suited for complex or unknown degradations that deviate from standard analytical models. Second, many methods struggle to provide a stable guidance in the latent space and finally most methods convert latent representations back to the pixel domain for guidance at every sampling iteration, which significantly increases computational and memory overhead. To overcome these limitations, we introduce a wavelet-inspired invertible neural network (INN) that simulates degradations through a forward transform and reconstructs lost details via the inverse transform. We further integrate this design into a latent diffusion pipeline through two proposed approaches: LatentINDIGO-PixelINN, which operates in the pixel domain, and LatentINDIGO-LatentINN, which stays fully in the latent space to reduce complexity. Both approaches alternate between updating intermediate latent variables under the guidance of our INN and refining the INN forward model to handle unknown degradations. In addition, a regularization step preserves the proximity of latent variables to the natural image manifold. Experiments demonstrate that our algorithm achieves state-of-the-art performance on synthetic and real-world low-quality images, and can be readily adapted to arbitrary output sizes.

* Submitted to IEEE Transactions on Image Processing (TIP)

Via

Access Paper or Ask Questions

Trustworthy Image Super-Resolution via Generative Pseudoinverse

May 18, 2025

Andreas Floros, Seyed-Mohsen Moosavi-Dezfooli, Pier Luigi Dragotti

Abstract:We consider the problem of trustworthy image restoration, taking the form of a constrained optimization over the prior density. To this end, we develop generative models for the task of image super-resolution that respect the degradation process and that can be made asymptotically consistent with the low-resolution measurements, outperforming existing methods by a large margin in that respect.

Via

Access Paper or Ask Questions

SING: Semantic Image Communications using Null-Space and INN-Guided Diffusion Models

Mar 16, 2025

Jiakang Chen, Selim F. Yilmaz, Di You, Pier Luigi Dragotti, Deniz Gündüz

Abstract:Joint source-channel coding systems based on deep neural networks (DeepJSCC) have recently demonstrated remarkable performance in wireless image transmission. Existing methods primarily focus on minimizing distortion between the transmitted image and the reconstructed version at the receiver, often overlooking perceptual quality. This can lead to severe perceptual degradation when transmitting images under extreme conditions, such as low bandwidth compression ratios (BCRs) and low signal-to-noise ratios (SNRs). In this work, we propose SING, a novel two-stage JSCC framework that formulates the recovery of high-quality source images from corrupted reconstructions as an inverse problem. Depending on the availability of information about the DeepJSCC encoder/decoder and the channel at the receiver, SING can either approximate the stochastic degradation as a linear transformation, or leverage invertible neural networks (INNs) for precise modeling. Both approaches enable the seamless integration of diffusion models into the reconstruction process, enhancing perceptual quality. Experimental results demonstrate that SING outperforms DeepJSCC and other approaches, delivering superior perceptual quality even under extremely challenging conditions, including scenarios with significant distribution mismatches between the training and test data.

Via

Access Paper or Ask Questions

SMILENet: Unleashing Extra-Large Capacity Image Steganography via a Synergistic Mosaic InvertibLE Hiding Network

Mar 07, 2025

Jun-Jie Huang, Zihan Chen, Tianrui Liu, Wentao Zhao, Xin Deng, Xinwang Liu, Meng Wang, Pier Luigi Dragotti

Abstract:Existing image steganography methods face fundamental limitations in hiding capacity (typically $1\sim7$ images) due to severe information interference and uncoordinated capacity-distortion trade-off. We propose SMILENet, a novel synergistic framework that achieves 25 image hiding through three key innovations: (i) A synergistic network architecture coordinates reversible and non-reversible operations to efficiently exploit information redundancy in both secret and cover images. The reversible Invertible Cover-Driven Mosaic (ICDM) module and Invertible Mosaic Secret Embedding (IMSE) module establish cover-guided mosaic transformations and representation embedding with mathematically guaranteed invertibility for distortion-free embedding. The non-reversible Secret Information Selection (SIS) module and Secret Detail Enhancement (SDE) module implement learnable feature modulation for critical information selection and enhancement. (ii) A unified training strategy that coordinates complementary modules to achieve 3.0x higher capacity than existing methods with superior visual quality. (iii) Last but not least, we introduce a new metric to model Capacity-Distortion Trade-off for evaluating the image steganography algorithms that jointly considers hiding capacity and distortion, and provides a unified evaluation approach for accessing results with different number of secret image. Extensive experiments on DIV2K, Paris StreetView and ImageNet1K show that SMILENet outperforms state-of-the-art methods in terms of hiding capacity, recovery quality as well as security against steganalysis methods.

Via

Access Paper or Ask Questions

A Lightweight Deep Exclusion Unfolding Network for Single Image Reflection Removal

Mar 03, 2025

Jun-Jie Huang, Tianrui Liu, Zihan Chen, Xinwang Liu, Meng Wang, Pier Luigi Dragotti

Abstract:Single Image Reflection Removal (SIRR) is a canonical blind source separation problem and refers to the issue of separating a reflection-contaminated image into a transmission and a reflection image. The core challenge lies in minimizing the commonalities among different sources. Existing deep learning approaches either neglect the significance of feature interactions or rely on heuristically designed architectures. In this paper, we propose a novel Deep Exclusion unfolding Network (DExNet), a lightweight, interpretable, and effective network architecture for SIRR. DExNet is principally constructed by unfolding and parameterizing a simple iterative Sparse and Auxiliary Feature Update (i-SAFU) algorithm, which is specifically designed to solve a new model-based SIRR optimization formulation incorporating a general exclusion prior. This general exclusion prior enables the unfolded SAFU module to inherently identify and penalize commonalities between the transmission and reflection features, ensuring more accurate separation. The principled design of DExNet not only enhances its interpretability but also significantly improves its performance. Comprehensive experiments on four benchmark datasets demonstrate that DExNet achieves state-of-the-art visual and quantitative results while utilizing only approximately 8\% of the parameters required by leading methods.

Via

Access Paper or Ask Questions

INDIGO+: A Unified INN-Guided Probabilistic Diffusion Algorithm for Blind and Non-Blind Image Restoration

Jan 23, 2025

Di You, Pier Luigi Dragotti

Abstract:Generative diffusion models are becoming one of the most popular prior in image restoration (IR) tasks due to their remarkable ability to generate realistic natural images. Despite achieving satisfactory results, IR methods based on diffusion models present several limitations. First of all, most non-blind approaches require an analytical expression of the degradation model to guide the sampling process. Secondly, most existing blind approaches rely on families of pre-defined degradation models for training their deep networks. The above issues limit the flexibility of these approaches and so their ability to handle real-world degradation tasks. In this paper, we propose a novel INN-guided probabilistic diffusion algorithm for non-blind and blind image restoration, namely INDIGO and BlindINDIGO, which combines the merits of the perfect reconstruction property of invertible neural networks (INN) with the strong generative capabilities of pre-trained diffusion models. Specifically, we train the forward process of the INN to simulate an arbitrary degradation process and use the inverse to obtain an intermediate image that we use to guide the reverse diffusion sampling process through a gradient step. We also introduce an initialization strategy, to further improve the performance and inference speed of our algorithm. Experiments demonstrate that our algorithm obtains competitive results compared with recently leading methods both quantitatively and visually on synthetic and real-world low-quality images.

* Accepted by IEEE Journal of Selected Topics in Signal Processing (JSTSP)

Via

Access Paper or Ask Questions

Tracing the Roots: Leveraging Temporal Dynamics in Diffusion Trajectories for Origin Attribution

Nov 12, 2024

Andreas Floros, Seyed-Mohsen Moosavi-Dezfooli, Pier Luigi Dragotti

Abstract:Diffusion models have revolutionized image synthesis, garnering significant research interest in recent years. Diffusion is an iterative algorithm in which samples are generated step-by-step, starting from pure noise. This process introduces the notion of diffusion trajectories, i.e., paths from the standard Gaussian distribution to the target image distribution. In this context, we study discriminative algorithms operating on these trajectories. Specifically, given a pre-trained diffusion model, we consider the problem of classifying images as part of the training dataset, generated by the model or originating from an external source. Our approach demonstrates the presence of patterns across steps that can be leveraged for classification. We also conduct ablation studies, which reveal that using higher-order gradient features to characterize the trajectories leads to significant performance gains and more robust algorithms.

Via

Access Paper or Ask Questions

Adversarial Deep-Unfolding Network for MA-XRF Super-Resolution on Old Master Paintings Using Minimal Training Data

Sep 14, 2024

Herman Verinaz-Jadan, Su Yan, Catherine Higgitt, Pier Luigi Dragotti

Abstract:High-quality element distribution maps enable precise analysis of the material composition and condition of Old Master paintings. These maps are typically produced from data acquired through Macro X-ray fluorescence (MA-XRF) scanning, a non-invasive technique that collects spectral information. However, MA-XRF is often limited by a trade-off between acquisition time and resolution. Achieving higher resolution requires longer scanning times, which can be impractical for detailed analysis of large artworks. Super-resolution MA-XRF provides an alternative solution by enhancing the quality of MA-XRF scans while reducing the need for extended scanning sessions. This paper introduces a tailored super-resolution approach to improve MA-XRF analysis of Old Master paintings. Our method proposes a novel adversarial neural network architecture for MA-XRF, inspired by the Learned Iterative Shrinkage-Thresholding Algorithm. It is specifically designed to work in an unsupervised manner, making efficient use of the limited available data. This design avoids the need for extensive datasets or pre-trained networks, allowing it to be trained using just a single high-resolution RGB image alongside low-resolution MA-XRF data. Numerical results demonstrate that our method outperforms existing state-of-the-art super-resolution techniques for MA-XRF scans of Old Master paintings.

Via

Access Paper or Ask Questions

Reconstructing classes of 3D FRI signals from sampled tomographic projections at unknown angles

Apr 15, 2024

Renke Wang, Francien G. Bossema, Thierry Blu, Pier Luigi Dragotti

Figure 1 for Reconstructing classes of 3D FRI signals from sampled tomographic projections at unknown angles

Figure 2 for Reconstructing classes of 3D FRI signals from sampled tomographic projections at unknown angles

Figure 3 for Reconstructing classes of 3D FRI signals from sampled tomographic projections at unknown angles

Figure 4 for Reconstructing classes of 3D FRI signals from sampled tomographic projections at unknown angles

Abstract:Traditional sampling schemes often assume that the sampling locations are known. Motivated by the recent bioimaging technique known as cryogenic electron microscopy (cryoEM), we consider the problem of reconstructing an unknown 3D structure from samples of its 2D tomographic projections at unknown angles. We focus on 3D convex bilevel polyhedra and 3D point sources and show that the exact estimation of these 3D structures and of the projection angles can be achieved up to an orthogonal transformation. Moreover, we are able to show that the minimum number of projections needed to achieve perfect reconstruction is independent of the complexity of the signal model. By using the divergence theorem, we are able to retrieve the projected vertices of the polyhedron from the sampled tomographic projections, and then we show how to retrieve the 3D object and the projection angles from this information. The proof of our theorem is constructive and leads to a robust reconstruction algorithm, which we validate under various conditions. Finally, we apply aspects of the proposed framework to calibration of X-ray computed tomography (CT) data.

Via

Access Paper or Ask Questions