Abstract:Artificial neural networks face the stability-plasticity dilemma in continual learning, while the brain can maintain memories and remain adaptable. However, the biological strategies for continual learning and their potential to inspire learning algorithms in neural networks are poorly understood. This study presents a minimal model of the fly olfactory circuit to investigate the biological strategies that support continual odor learning. We introduce the fly olfactory circuit as a plug-and-play component, termed the Fly Model, which can integrate with modern machine learning methods to address this dilemma. Our findings demonstrate that the Fly Model enhances both memory stability and learning plasticity, overcoming the limitations of current continual learning strategies. We validated its effectiveness across various challenging continual learning scenarios using commonly used datasets. The fly olfactory system serves as an elegant biological circuit for lifelong learning, offering a module that enhances continual learning with minimal additional computational cost for machine learning.
Abstract:The conventional evaluation protocols on machine learning models rely heavily on a labeled, i.i.d-assumed testing dataset, which is not often present in real world applications. The Automated Model Evaluation (AutoEval) shows an alternative to this traditional workflow, by forming a proximal prediction pipeline of the testing performance without the presence of ground-truth labels. Despite its recent successes, the AutoEval frameworks still suffer from an overconfidence issue, substantial storage and computational cost. In that regard, we propose a novel measure -- Meta-Distribution Energy (MDE) -- that allows the AutoEval framework to be both more efficient and effective. The core of the MDE is to establish a meta-distribution statistic, on the information (energy) associated with individual samples, then offer a smoother representation enabled by energy-based learning. We further provide our theoretical insights by connecting the MDE with the classification loss. We provide extensive experiments across modalities, datasets and different architectural backbones to validate MDE's validity, together with its superiority compared with prior approaches. We also prove MDE's versatility by showing its seamless integration with large-scale models, and easy adaption to learning scenarios with noisy- or imbalanced- labels. Code and data are available: https://github.com/pengr/Energy_AutoEval