Xuming Ran

REPAIR: Robust Editing via Progressive Adaptive Intervention and Reintegration

Oct 02, 2025

Yisu Wang, Ming Wang, Haoyuan Song, Wenjie Huang, Chaozheng Wang, Yi Xie, Xuming Ran

Abstract: Post-training for large language models (LLMs) is constrained by the high cost of acquiring new knowledge or correcting errors and by the unintended side effects that frequently arise from retraining. To address these issues, we introduce REPAIR (Robust Editing via Progressive Adaptive Intervention and Reintegration), a lifelong editing framework designed to support precise and low-cost model updates while preserving non-target knowledge. REPAIR mitigates the instability and conflicts of large-scale sequential edits through a closed-loop feedback mechanism coupled with dynamic memory management. Furthermore, by incorporating frequent knowledge fusion and enforcing strong locality guards, REPAIR effectively addresses the shortcomings of traditional distribution-agnostic approaches that often overlook unintended ripple effects. Our experiments demonstrate that REPAIR boosts editing accuracy by 10%-30% across multiple model families and significantly reduces knowledge forgetting. This work introduces a robust framework for developing reliable, scalable, and continually evolving LLMs.

Brain-inspired continual pre-trained learner via silent synaptic consolidation

Oct 08, 2024

Xuming Ran, Juntao Yao, Yusong Wang, Mingkun Xu, Dianbo Liu

Abstract: Pre-trained models have demonstrated impressive generalization capabilities, yet they remain vulnerable to catastrophic forgetting when incrementally trained on new tasks. Existing architecture-based strategies encounter two primary challenges: 1) Integrating a pre-trained network with a trainable sub-network complicates the delicate balance between learning plasticity and memory stability across evolving tasks during learning. 2) The absence of robust interconnections between pre-trained networks and various sub-networks limits the effective retrieval of pertinent information during inference. In this study, we introduce the Artsy, inspired by the activation mechanisms of silent synapses via spike-timing-dependent plasticity observed in mature brains, to enhance the continual learning capabilities of pre-trained models. The Artsy integrates two key components: During training, the Artsy mimics mature brain dynamics by maintaining memory stability for previously learned knowledge within the pre-trained network while simultaneously promoting learning plasticity in task-specific sub-networks. During inference, artificial silent and functional synapses are utilized to establish precise connections between the pre-synaptic neurons in the pre-trained network and the post-synaptic neurons in the sub-networks, facilitated through synaptic consolidation, thereby enabling effective extraction of relevant information from test samples. Comprehensive experimental evaluations reveal that our model significantly outperforms conventional methods on class-incremental learning tasks, while also providing enhanced biological interpretability for architecture-based approaches. Moreover, we propose that the Artsy offers a promising avenue for simulating biological synaptic mechanisms, potentially advancing our understanding of neural plasticity in both artificial and biological systems.

AdaViPro: Region-based Adaptive Visual Prompt for Large-Scale Models Adapting

Mar 20, 2024

Mengyu Yang, Ye Tian, Lanshan Zhang, Xiao Liang, Xuming Ran, Wendong Wang

Abstract: Recently, prompt-based methods have emerged as a new alternative 'parameter-efficient fine-tuning' paradigm, which only fine-tunes a small number of additional parameters while keeping the original model frozen. However, despite achieving notable results, existing prompt methods mainly focus on 'what to add', while overlooking the equally important aspect of 'where to add', typically relying on manually crafted placement. To this end, we propose a region-based Adaptive Visual Prompt, named AdaViPro, which integrates the 'where to add' optimization of the prompt into the learning process. Specifically, we reconceptualize the 'where to add' optimization as a problem of regional decision-making. During inference, AdaViPro generates a regionalized mask map for the whole image, which is composed of 0 and 1, to designate whether to apply or discard the prompt in each specific area. Since this discrete mask is not directly differentiable, we employ Gumbel-Softmax sampling to enable AdaViPro's end-to-end learning through standard back-propagation. Extensive experiments demonstrate that our AdaViPro yields new efficiency and accuracy trade-offs for adapting pre-trained models.

* Submitted to ICIP 2024
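
The mask-sampling step described in the AdaViPro abstract can be illustrated with a minimal sketch of Gumbel-Softmax sampling over two choices per region (discard or apply the prompt). This is not the authors' implementation: the function name `gumbel_softmax_mask`, the temperature `tau`, and the per-region logit layout are assumptions made for illustration. The code covers only the forward sampling pass in plain Python; the end-to-end training mentioned in the abstract would additionally need an autograd framework to pass gradients through the soft probabilities.

```python
import math
import random


def gumbel_softmax_mask(logits, tau=1.0, rng=None):
    """Sample a binary region mask with the Gumbel-Softmax trick.

    logits: 2-D grid (list of rows); each cell holds a hypothetical
            (discard_logit, apply_logit) pair for one image region.
    tau:    temperature; lower values push the soft output toward 0 or 1.
    Returns (soft, hard): soft[i][j] is the relaxed probability of applying
    the prompt in region (i, j); hard[i][j] is the discretized 0/1 decision.
    """
    rng = rng or random.Random()
    soft, hard = [], []
    for row in logits:
        soft_row, hard_row = [], []
        for pair in row:
            noisy = []
            for logit in pair:
                # Gumbel(0, 1) noise is -log(-log(U)) with U ~ Uniform(0, 1);
                # clamp U away from 0 and 1 so both logarithms stay finite.
                u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
                noisy.append((logit - math.log(-math.log(u))) / tau)
            # Numerically stable softmax over the two perturbed logits.
            top = max(noisy)
            exps = [math.exp(x - top) for x in noisy]
            p_apply = exps[1] / sum(exps)
            soft_row.append(p_apply)
            hard_row.append(1 if p_apply >= 0.5 else 0)
        soft.append(soft_row)
        hard.append(hard_row)
    return soft, hard


# Example: a 2x2 grid of regions with made-up logits.
region_logits = [[(-50.0, 50.0), (50.0, -50.0)],
                 [(0.0, 0.0), (0.0, 0.0)]]
soft_mask, hard_mask = gumbel_softmax_mask(region_logits, tau=0.5,
                                           rng=random.Random(0))
```

In this sketch, regions with a strongly positive apply logit end up with mask value 1 and regions with a strongly positive discard logit end up with 0, while regions with tied logits are decided by the sampled noise.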

Enhancing Adaptive History Reserving by Spiking Convolutional Block Attention Module in Recurrent Neural Networks

Jan 08, 2024

Qi Xu, Yuyuan Gao, Jiangrong Shen, Yaxin Li, Xuming Ran, Huajin Tang, Gang Pan

Abstract: Spiking neural networks (SNNs) serve as one type of efficient model to process spatio-temporal patterns in time series, such as the Address-Event Representation (AER) data collected from Dynamic Vision Sensors (DVS). Although convolutional SNNs have achieved remarkable performance on these AER datasets, benefiting from the predominant spatial feature extraction ability of the convolutional structure, they ignore temporal features related to sequential time points. In this paper, we develop a recurrent spiking neural network (RSNN) model embedded with an advanced spiking convolutional block attention module (SCBAM) component to combine both spatial and temporal features of spatio-temporal patterns. It invokes the history information in spatial and temporal channels adaptively through SCBAM, which brings the advantages of efficient memory calling and history redundancy elimination. The performance of our model was evaluated on the DVS128-Gesture dataset and other time-series datasets. The experimental results show that the proposed SRNN-SCBAM model makes better use of the history information in spatial and temporal dimensions with less memory space, and achieves higher accuracy compared to other models.

Deep Auto-encoder with Neural Response

Nov 30, 2021

Xuming Ran, Jie Zhang, Ziyuan Ye, Haiyan Wu, Qi Xu, Huihui Zhou, Quanying Liu

Abstract: Artificial intelligence and neuroscience are deeply interactive. Artificial neural networks (ANNs) have been a versatile tool to study the neural representation in the ventral visual stream, and the knowledge in neuroscience in return inspires ANN models to improve performance in the task. However, how to merge these two directions into a unified model has been less studied. Here, we propose a hybrid model, called deep auto-encoder with the neural response (DAE-NR), which incorporates the information from the visual cortex into ANNs to achieve better image reconstruction and higher neural representation similarity between biological and artificial neurons. Specifically, the same visual stimuli (i.e., natural images) are input to both the mouse brain and DAE-NR. The DAE-NR jointly learns to map a specific layer of the encoder network to the biological neural responses in the ventral visual stream by a mapping function and to reconstruct the visual input by the decoder. Our experiments demonstrate that, if and only if joint learning is used, DAE-NR can (i) improve the performance of image reconstruction and (ii) increase the representational similarity between biological neurons and artificial neurons. The DAE-NR offers a new perspective on the integration of computer vision and visual neuroscience.
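
The joint learning described in the DAE-NR abstract combines two objectives: reconstructing the input image and matching recorded neural responses. A minimal sketch of such a combined objective follows. The abstract does not state the loss functions or how they are weighted, so the use of mean squared error for both terms, the weight `lam`, and the names `mse` and `joint_loss` are assumptions made for illustration, not the authors' formulation.

```python
def mse(pred, target):
    """Mean squared error between two equal-length sequences of floats."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def joint_loss(reconstruction, image, predicted_response, neural_response,
               lam=1.0):
    """Hypothetical joint objective: reconstruction error plus a weighted
    neural-prediction error.

    reconstruction / image: flattened decoder output and input image.
    predicted_response / neural_response: output of the mapping function
        applied to an encoder layer, and the recorded biological responses.
    lam: assumed trade-off weight between the two terms.
    Returns (total, reconstruction_term, neural_term).
    """
    recon_term = mse(reconstruction, image)
    neural_term = mse(predicted_response, neural_response)
    return recon_term + lam * neural_term, recon_term, neural_term


# Example with made-up numbers.
total, recon_term, neural_term = joint_loss(
    reconstruction=[0.0, 0.5, 0.5],
    image=[0.0, 1.0, 0.5],
    predicted_response=[1.0, 2.0],
    neural_response=[1.0, 4.0],
    lam=0.5,
)
```

Setting `lam` to zero in this sketch reduces the objective to a plain auto-encoder loss, which is the kind of baseline the abstract contrasts with joint learning.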

Machine Learning Applications on Neuroimaging for Diagnosis and Prognosis of Epilepsy: A Review

Feb 05, 2021

Jie Yuan, Xuming Ran, Keyin Liu, Chen Yao, Yi Yao, Haiyan Wu, Quanying Liu

Abstract: Machine learning is playing an increasingly important role in medical image analysis, spawning new advances in neuroimaging clinical applications. However, previous work and reviews have mainly focused on electrophysiological signals such as EEG or SEEG; the potential of neuroimaging in epilepsy research has been largely overlooked despite its wide use in clinical practice. In this review, we highlight the interactions between neuroimaging and machine learning in the context of epilepsy diagnosis and prognosis. We first outline typical neuroimaging modalities used in epilepsy clinics, e.g., MRI, DTI, fMRI and PET. We then introduce two approaches to applying machine learning methods to neuroimaging data: the two-step compositional approach, which combines feature engineering with a machine learning classifier, and the end-to-end approach, which is usually oriented toward deep learning. Later, a detailed review of the machine learning tasks on epileptic images is presented, such as segmentation, localization and lateralization tasks, as well as the tasks directly related to diagnosis and prognosis. In the end, we discuss current achievements, challenges, and potential future directions in the field, with the hope of paving the way to computer-aided diagnosis and prognosis of epilepsy.

Bigeminal Priors Variational auto-encoder

Oct 05, 2020

Xuming Ran, Mingkun Xu, Qi Xu, Huihui Zhou, Quanying Liu

Abstract: Variational auto-encoders (VAEs) are an influential and widely used class of likelihood-based generative models in unsupervised learning. Likelihood-based generative models have been reported to be highly robust to out-of-distribution (OOD) inputs and can serve as detectors under the assumption that the model assigns higher likelihoods to samples from the in-distribution (ID) dataset than to those from an OOD dataset. However, recent works reported a phenomenon in which a VAE recognizes some OOD samples as ID by assigning a higher likelihood to the OOD inputs than to those from ID. In this work, we introduce a new model, namely Bigeminal Priors Variational auto-encoder (BPVAE), to address this phenomenon. The BPVAE aims to enhance the robustness of VAEs by combining the power of the VAE with two independent priors that belong, respectively, to the training dataset and a simple dataset whose complexity is lower than that of the training dataset. BPVAE learns the features of both datasets, assigning a higher likelihood to the training dataset than to the simple dataset. In this way, we can use BPVAE's density estimate for detecting OOD samples. Quantitative experimental results suggest that our model has better generalization capability and stronger robustness than standard VAEs, proving the effectiveness of the proposed approach of hybrid learning by collaborative priors. Overall, this work paves a new avenue to potentially overcome the OOD problem via multiple latent priors modeling.
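
The idea of using a density estimate as an OOD detector, as stated in the abstract above, can be sketched as a simple threshold on log-likelihood scores. This is a generic illustration and not the BPVAE method itself: the function names `fit_threshold` and `is_ood`, the 5% quantile, and all scores below are hypothetical, and the log-likelihoods are assumed to come from some already-trained density model. The rule only works when the model actually assigns lower likelihood to OOD inputs, which is the property the abstract says plain VAEs can lack.

```python
def fit_threshold(id_log_likelihoods, quantile=0.05):
    """Pick a log-likelihood cut-off from held-out in-distribution scores.

    Roughly `quantile` of the ID samples fall below the returned value,
    so that fraction of ID data would be flagged as OOD (false positives).
    """
    if not id_log_likelihoods:
        raise ValueError("need at least one in-distribution score")
    ordered = sorted(id_log_likelihoods)
    index = min(len(ordered) - 1, int(quantile * len(ordered)))
    return ordered[index]


def is_ood(log_likelihood, threshold):
    """Flag a sample as OOD when its log-likelihood is below the cut-off."""
    return log_likelihood < threshold


# Example with made-up held-out ID scores ranging from -100 down to -199.
held_out_scores = [-100.0 - i for i in range(100)]
threshold = fit_threshold(held_out_scores, quantile=0.05)
flag_low = is_ood(-250.0, threshold)    # far below typical ID scores
flag_typical = is_ood(-120.0, threshold)  # within the typical ID range
```

The quantile trades false positives on ID data against missed OOD samples; a smaller value flags fewer ID samples but lets more borderline OOD inputs through.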

Detecting Out-of-distribution Samples via Variational Auto-encoder with Reliable Uncertainty Estimation

Jul 16, 2020

Xuming Ran, Mingkun Xu, Lingrui Mei, Qi Xu, Quanying Liu

Abstract: In unsupervised learning, variational auto-encoders (VAEs) are an influential class of deep generative models with the rich representational power of neural networks and Bayesian methods. However, VAEs suffer from assigning higher likelihood to out-of-distribution (OOD) inputs than to in-distribution (ID) inputs. Recent studies suggest that deep generative models with reliable uncertainty estimation are critical to a deep understanding of OOD inputs. Meanwhile, noise contrastive prior (NCP) is a promising emerging method for obtaining uncertainty, with the advantages of being easy to scale, trainable, and compatible with a wide range of models. Inspired by these ideas, we propose an improved noise contrastive prior (INCP) to acquire reliable uncertainty estimates for standard VAEs. By combining INCP with the encoder of the VAE, patterns between OOD and ID inputs can be well captured and distinguished. Our method outperforms standard VAEs on the FashionMNIST and CIFAR10 datasets. We also demonstrate the robustness of our model through extensive experiments on anomaly detection tasks.
