Giovanni Bellitto

TrustErase: Auditable Instant Machine Unlearning with Passport-Embedded Representations

Jun 15, 2026

Rutger Hendrix, Leonardo G. Russo, Concetto Spampinato, Matteo Pennisi, Giovanni Bellitto

Abstract: The demand for privacy-compliant AI has amplified the need for machine unlearning, yet existing retraining- or distillation-based methods remain unverifiable and computationally costly. We introduce TrustErase, a verifiable, data-free unlearning framework that leverages passport-embedded representations for instant, modular, and auditable forgetting. By treating passports as cryptographic keys within parameter-efficient adaptation layers, TrustErase enables the removal of specific classes or datasets through simple deactivation, without retraining, fine-tuning, or access to the original data. A singular-value-based decomposition conceals passports within model weights, ensuring that unlearning actions remain transparent and provably compliant. Evaluations on MNIST, CIFAR10 and CIFAR100 show that TrustErase matches or exceeds state-of-the-art methods such as DELETE, L2UL, and Boundary Shrink, while operating in a strictly data-free regime. Ultimately, TrustErase establishes a new paradigm for trustworthy, accountable, and instantly forgettable AI systems.

Routing without Forgetting

Mar 10, 2026

Alessio Masano, Giovanni Bellitto, Dipam Goswani, Joost Van de Weijer, Concetto Spampinato

Abstract: Continual learning in transformers is commonly addressed through parameter-efficient adaptation: prompts, adapters, or LoRA modules are specialized per task while the backbone remains frozen. Although effective in controlled multi-epoch settings, these approaches rely on gradual gradient-based specialization and struggle in Online Continual Learning (OCL), where data arrive as a non-stationary stream and each sample may be observed only once. We recast continual learning in transformers as a routing problem: under strict online constraints, the model must dynamically select the appropriate representational subspace for each input without explicit task identifiers or repeated optimization. We thus introduce Routing without Forgetting (RwF), a transformer architecture augmented with energy-based associative retrieval layers inspired by Modern Hopfield Networks. Instead of storing or merging task-specific prompts, RwF generates dynamic prompts through single-step associative retrieval over the transformer token embeddings at each layer. Retrieval corresponds to the closed-form minimization of a strictly convex free-energy functional, enabling input-conditioned routing within each forward pass, independently of iterative gradient refinement. Across challenging class-incremental benchmarks, RwF improves over existing prompt-based methods. On Split-ImageNet-R and Split-ImageNet-S, RwF outperforms prior prompt-based approaches by a large margin, even in few-shot learning regimes. These results indicate that embedding energy-based associative routing directly within the transformer backbone provides a principled and effective foundation for OCL.
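
Illustrative note: the single-step associative retrieval mentioned in the abstract can be pictured with the standard Modern Hopfield update, in which a query is replaced by a softmax-weighted average of stored patterns. The sketch below is only that generic update, written in plain Python. It is not the RwF architecture, and the names `hopfield_retrieve`, `patterns`, and `beta` are assumptions introduced here for illustration; the abstract does not specify RwF's energy functional or how prompts are formed from the retrieved vector.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def hopfield_retrieve(patterns, query, beta=1.0):
    """One-step Modern Hopfield retrieval (generic form, not RwF itself).

    Each stored pattern is scored by its scaled dot product with the query;
    the result is the softmax-weighted average of the stored patterns.
    """
    scores = [beta * sum(p_d * q_d for p_d, q_d in zip(p, query)) for p in patterns]
    weights = softmax(scores)
    dim = len(query)
    return [sum(w * p[d] for w, p in zip(weights, patterns)) for d in range(dim)]


# Hypothetical toy memory with two orthogonal patterns.
patterns = [[1.0, 0.0], [0.0, 1.0]]
# A noisy query close to the first pattern; a large beta sharpens the retrieval.
retrieved = hopfield_retrieve(patterns, [0.9, 0.1], beta=20.0)
```

With a large `beta`, the retrieved vector lands very close to the stored pattern most similar to the query; with a small `beta`, it blends the stored patterns. No gradient step is involved, which is the property the abstract relies on for online routing.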

Dream2Learn: Structured Generative Dreaming for Continual Learning

Mar 02, 2026

Salvatore Calcagno, Matteo Pennisi, Federica Proietto Salanitri, Amelia Sorrenti, Simone Palazzo, Concetto Spampinato, Giovanni Bellitto

Abstract: Continual learning requires balancing plasticity and stability while mitigating catastrophic forgetting. Inspired by human dreaming as a mechanism for internal simulation and knowledge restructuring, we introduce Dream2Learn (D2L), a framework in which a model autonomously generates structured synthetic experiences from its own internal representations and uses them for self-improvement. Rather than reconstructing past data as in generative replay, D2L enables a classifier to create novel, semantically distinct dreamed classes that are coherent with its learned knowledge yet do not correspond to previously observed data. These dreamed samples are produced by conditioning a frozen diffusion model through soft prompt optimization driven by the classifier itself. The generated data are not used to replace memory, but to expand and reorganize the representation space, effectively allowing the network to self-train on internally synthesized concepts. By integrating dreamed classes into continual training, D2L proactively structures latent features to support forward knowledge transfer and adaptation to future tasks. This prospective self-training mechanism mirrors the role of sleep in consolidating and reorganizing memory, turning internal simulations into a tool for improved generalization. Experiments on Mini-ImageNet, FG-ImageNet, and ImageNet-R demonstrate that D2L consistently outperforms strong rehearsal-based baselines and achieves positive forward transfer, confirming its ability to enhance adaptability through internally generated training signals.

Zero-Shot Decentralized Federated Learning

Sep 30, 2025

Alessio Masano, Matteo Pennisi, Federica Proietto Salanitri, Concetto Spampinato, Giovanni Bellitto

Abstract: CLIP has revolutionized zero-shot learning by enabling task generalization without fine-tuning. While prompting techniques like CoOp and CoCoOp enhance CLIP's adaptability, their effectiveness in Federated Learning (FL) remains an open challenge. Existing federated prompt learning approaches, such as FedCoOp and FedTPG, improve performance but face generalization issues, high communication costs, and reliance on a central server, limiting scalability and privacy. We propose Zero-shot Decentralized Federated Learning (ZeroDFL), a fully decentralized framework that enables zero-shot adaptation across distributed clients without a central coordinator. ZeroDFL employs an iterative prompt-sharing mechanism, allowing clients to optimize and exchange textual prompts to enhance generalization while drastically reducing communication overhead. We validate ZeroDFL on nine diverse image classification datasets, demonstrating that it consistently outperforms, or remains on par with, state-of-the-art federated prompt learning methods. More importantly, ZeroDFL achieves this performance in a fully decentralized setting while reducing communication overhead by 118x compared to FedTPG. These results highlight that our approach not only enhances generalization in federated zero-shot learning but also improves scalability, efficiency, and privacy preservation, paving the way for decentralized adaptation of large vision-language models in real-world applications.

* Accepted at International Joint Conference on Neural Networks (IJCNN) 2025. Code available at https://github.com/perceivelab/ZeroDFL

FedRewind: Rewinding Continual Model Exchange for Decentralized Federated Learning

Nov 14, 2024

Luca Palazzo, Matteo Pennisi, Federica Proietto Salanitri, Giovanni Bellitto, Simone Palazzo, Concetto Spampinato

Abstract: In this paper, we present FedRewind, a novel approach to decentralized federated learning that leverages model exchange among nodes to address the issue of data distribution shift. Drawing inspiration from continual learning (CL) principles and cognitive neuroscience theories for memory retention, FedRewind implements a decentralized routing mechanism where nodes send/receive models to/from other nodes in the federation to address spatial distribution challenges inherent in federated learning (FL). During local training, federation nodes periodically send their models back (i.e., rewind) to the nodes they received them from for a limited number of iterations. This strategy reduces the distribution shift between nodes' data, leading to enhanced learning and generalization performance. We evaluate our method on multiple benchmarks, demonstrating its superiority over standard decentralized federated learning methods and those enforcing specific routing schemes within the federation. Furthermore, the combination of federated and continual learning concepts enables our method to tackle the more challenging federated continual learning task, with data shifts over both space and time, surpassing existing baselines.

Diffexplainer: Towards Cross-modal Global Explanations with Diffusion Models

Apr 03, 2024

Matteo Pennisi, Giovanni Bellitto, Simone Palazzo, Mubarak Shah, Concetto Spampinato

Abstract: We present DiffExplainer, a novel framework that leverages language-vision models to enable multimodal global explainability. DiffExplainer employs diffusion models conditioned on optimized text prompts, synthesizing images that maximize class outputs and hidden features of a classifier, thus providing a visual tool for explaining decisions. Moreover, the analysis of generated visual descriptions allows for automatic identification of biases and spurious features, as opposed to traditional methods that often rely on manual intervention. The cross-modal transferability of language-vision models also makes it possible to describe decisions in a more human-interpretable way, i.e., through text. We conduct comprehensive experiments, which include an extensive user study, demonstrating the effectiveness of DiffExplainer on 1) the generation of high-quality images explaining model decisions, surpassing existing activation maximization methods, and 2) the automated identification of biases and spurious features.

Selective Attention-based Modulation for Continual Learning

Mar 29, 2024

Giovanni Bellitto, Federica Proietto Salanitri, Matteo Pennisi, Matteo Boschini, Angelo Porrello, Simone Calderara, Simone Palazzo, Concetto Spampinato

Abstract: We present SAM, a biologically-plausible selective attention-driven modulation approach to enhance classification models in a continual learning setting. Inspired by neurophysiological evidence that the primary visual cortex does not contribute to object manifold untangling for categorization and that primordial attention biases are still embedded in the modern brain, we propose to employ auxiliary saliency prediction features as a modulation signal to drive and stabilize the learning of a sequence of non-i.i.d. classification tasks. Experimental results confirm that SAM effectively enhances the performance (in some cases by up to about twenty percentage points) of state-of-the-art continual learning methods, both in class-incremental and task-incremental settings. Moreover, we show that attention-based modulation successfully encourages the learning of features that are more robust to the presence of spurious features and to adversarial attacks than baseline methods. Code is available at: https://github.com/perceivelab/SAM.

A Privacy-Preserving Walk in the Latent Space of Generative Models for Medical Applications

Jul 06, 2023

Matteo Pennisi, Federica Proietto Salanitri, Giovanni Bellitto, Simone Palazzo, Ulas Bagci, Concetto Spampinato

Abstract: Generative Adversarial Networks (GANs) have demonstrated their ability to generate synthetic samples that match a target distribution. However, from a privacy perspective, using GANs as a proxy for data sharing is not a safe solution, as they tend to embed near-duplicates of real samples in the latent space. Recent works, inspired by k-anonymity principles, address this issue through sample aggregation in the latent space, with the drawback of reducing the dataset by a factor of k. Our work aims to mitigate this problem by proposing a latent space navigation strategy able to generate diverse synthetic samples that may support effective training of deep models, while addressing privacy concerns in a principled way. Our approach leverages an auxiliary identity classifier as a guide to non-linearly walk between points in the latent space, minimizing the risk of collision with near-duplicates of real samples. We empirically demonstrate that, given any random pair of points in the latent space, our walking strategy is safer than linear interpolation. We then test our path-finding strategy combined with k-same methods and demonstrate, on two benchmarks for tuberculosis and diabetic retinopathy classification, that training a model using samples generated by our approach mitigates drops in performance while preserving privacy.

* Accepted at MICCAI 2023

On the Effectiveness of Equivariant Regularization for Robust Online Continual Learning

May 05, 2023

Lorenzo Bonicelli, Matteo Boschini, Emanuele Frascaroli, Angelo Porrello, Matteo Pennisi, Giovanni Bellitto, Simone Palazzo, Concetto Spampinato, Simone Calderara

Abstract: Humans can learn incrementally, whereas neural networks forget previously acquired information catastrophically. Continual Learning (CL) approaches seek to bridge this gap by facilitating the transfer of knowledge to both previous tasks (backward transfer) and future ones (forward transfer) during training. Recent research has shown that self-supervision can produce versatile models that can generalize well to diverse downstream tasks. However, contrastive self-supervised learning (CSSL), a popular self-supervision technique, has limited effectiveness in online CL (OCL). OCL only permits a single pass over the input dataset, and CSSL's low sample efficiency hinders its use on the input data-stream. In this work, we propose Continual Learning via Equivariant Regularization (CLER), an OCL approach that leverages equivariant tasks for self-supervision, avoiding CSSL's limitations. Our method represents the first attempt at combining equivariant knowledge with CL and can be easily integrated with existing OCL methods. Extensive ablations shed light on how equivariant pretext tasks affect the network's information flow and its impact on CL dynamics.

* 10 pages, 4 figures

TinyHD: Efficient Video Saliency Prediction with Heterogeneous Decoders using Hierarchical Maps Distillation

Jan 11, 2023

Feiyan Hu, Simone Palazzo, Federica Proietto Salanitri, Giovanni Bellitto, Morteza Moradi, Concetto Spampinato, Kevin McGuinness

Abstract: Video saliency prediction has recently attracted the attention of the research community, as it is an upstream task for several practical applications. However, current solutions are particularly computationally demanding, especially due to the wide usage of spatio-temporal 3D convolutions. We observe that, while different model architectures achieve similar performance on benchmarks, visual variations between predicted saliency maps are still significant. Inspired by this observation, we propose a lightweight model that employs multiple simple heterogeneous decoders and adopts several practical approaches to improve accuracy while keeping computational costs low, such as hierarchical multi-map knowledge distillation, multi-output saliency prediction, unlabeled auxiliary datasets and channel reduction with teacher assistant supervision. Our approach achieves saliency prediction accuracy on par with or better than state-of-the-art methods on the DHF1K, UCF-Sports and Hollywood2 benchmarks, while significantly improving the efficiency of the model. Code is available at https://github.com/feiyanhu/tinyHD

* WACV 2023
