Joaquin Vanschoren

TU/e

AutoML Benchmark with shorter time constraints and early stopping

Apr 01, 2025

Israel Campero Jurado, Pieter Gijsbers, Joaquin Vanschoren

Abstract:Automated Machine Learning (AutoML) automatically builds machine learning (ML) models on data. The de facto standard for evaluating new AutoML frameworks for tabular data is the AutoML Benchmark (AMLB). AMLB proposed to evaluate AutoML frameworks using 1- and 4-hour time budgets across 104 tasks. We argue that shorter time constraints should be considered for the benchmark because of their practical value, such as when models need to be retrained with high frequency, and to make AMLB more accessible. This work considers two ways in which to reduce the overall computation used in the benchmark: smaller time constraints and the use of early stopping. We conduct evaluations of 11 AutoML frameworks on 104 tasks with different time constraints and find the relative ranking of AutoML frameworks is fairly consistent across time constraints, but that using early stopping leads to a greater variety in model performance.

* Workshop on the Future of Machine Learning Data Practices and Repositories, ICLR 2025
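
Early stopping, as mentioned in the abstract above, saves computation at the risk of missing late improvements. The following is a minimal, generic sketch of patience-based early stopping. It is not the mechanism used by AMLB or by any particular AutoML framework; the function name, the `patience` and `min_delta` parameters, and the score history are all hypothetical.

```python
def run_with_early_stopping(scores, patience=3, min_delta=0.0):
    """Consume validation scores until no improvement is seen for
    `patience` consecutive steps. Returns (best_score, steps_used).

    Generic illustration only; names and defaults are assumptions.
    """
    best = float("-inf")
    stale = 0   # consecutive steps without improvement
    steps = 0   # total steps actually evaluated
    for score in scores:
        steps += 1
        if score > best + min_delta:
            best = score
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                break
    return best, steps


# Hypothetical validation-score history of a search process.
history = [0.70, 0.74, 0.75, 0.75, 0.74, 0.75, 0.80]
best, steps = run_with_early_stopping(history, patience=3)
print(best, steps)
```

In this made-up history the search stops after six steps with a best score of 0.75 and never sees the later 0.80, which illustrates how early stopping can save time while also changing the final model quality.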

Sculpting [CLS] Features for Pre-Trained Model-Based Class-Incremental Learning

Feb 20, 2025

Murat Onur Yildirim, Elif Ceren Gok Yildirim, Joaquin Vanschoren

Abstract:Class-incremental learning requires models to continually acquire knowledge of new classes without forgetting old ones. Although pre-trained models have demonstrated strong performance in class-incremental learning, they remain susceptible to catastrophic forgetting when learning new concepts. Excessive plasticity in the models breaks generalizability and causes forgetting, while strong stability results in insufficient adaptation to new classes. This necessitates effective adaptation with minimal modifications to preserve the general knowledge of pre-trained models. To address this challenge, we first introduce a new parameter-efficient fine-tuning module 'Learn and Calibrate', or LuCA, designed to acquire knowledge through an adapter-calibrator couple, enabling effective adaptation with well-refined feature representations. Second, for each learning session, we deploy a sparse LuCA module on top of the last token just before the classifier, which we refer to as 'Token-level Sparse Calibration and Adaptation', or TOSCA. This strategic design improves the orthogonality between the modules and significantly reduces both training and inference complexity. By leaving the generalization capabilities of the pre-trained models intact and adapting exclusively via the last token, our approach achieves a harmonious balance between stability and plasticity. Extensive experiments demonstrate TOSCA's state-of-the-art performance while introducing ~8 times fewer parameters compared to prior methods.

Occam's model: Selecting simpler representations for better transferability estimation

Feb 10, 2025

Prabhant Singh, Sibylle Hess, Joaquin Vanschoren

Abstract:Fine-tuning models that have been pre-trained on large datasets has become a cornerstone of modern machine learning workflows. With the widespread availability of online model repositories, such as Hugging Face, it is now easier than ever to fine-tune pre-trained models for specific tasks. This raises a critical question: which pre-trained model is most suitable for a given task? This problem is called transferability estimation. In this work, we introduce two novel and effective metrics for estimating the transferability of pre-trained models. Our approach is grounded in viewing transferability as a measure of how easily a pre-trained model's representations can be trained to separate target classes, providing a unique perspective on transferability estimation. We rigorously evaluate the proposed metrics against state-of-the-art alternatives across diverse problem settings, demonstrating their robustness and practical utility. Additionally, we present theoretical insights that explain our metrics' efficacy and adaptability to various scenarios. We experimentally show that our metrics increase Kendall's Tau by up to 32% compared to the state-of-the-art baselines.
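
Transferability metrics of this kind are typically judged by how well their ranking of candidate models agrees with the ranking by actual fine-tuned accuracy, which is what Kendall's Tau measures. Below is a minimal sketch of Kendall's tau-a (the variant that does not correct for ties) in plain Python. The score and accuracy values are invented for illustration and are not taken from the paper, and the paper may use a different variant of the statistic.

```python
from itertools import combinations


def kendall_tau(scores, accuracies):
    """Kendall's tau-a: (concordant - discordant) / number of pairs.

    Tied pairs count as neither concordant nor discordant.
    """
    if len(scores) != len(accuracies):
        raise ValueError("inputs must have the same length")
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two items to compare")
    concordant = discordant = 0
    for (s_i, a_i), (s_j, a_j) in combinations(zip(scores, accuracies), 2):
        direction = (s_i - s_j) * (a_i - a_j)
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
    return (concordant - discordant) / (n * (n - 1) // 2)


# Hypothetical values: one transferability score and one fine-tuned
# accuracy per candidate pre-trained model.
transferability_scores = [0.2, 0.5, 0.9, 0.4]
fine_tuned_accuracy = [0.61, 0.70, 0.88, 0.74]
tau = kendall_tau(transferability_scores, fine_tuned_accuracy)
print(round(tau, 3))
```

A value of 1 means the metric orders the models exactly as fine-tuning does, and -1 means the order is fully reversed.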

On dataset transferability in medical image classification

Dec 28, 2024

Dovile Juodelyte, Enzo Ferrante, Yucheng Lu, Prabhant Singh, Joaquin Vanschoren, Veronika Cheplygina

Abstract:Current transferability estimation methods designed for natural image datasets are often suboptimal in medical image classification. These methods primarily focus on estimating the suitability of pre-trained source model features for a target dataset, which can lead to unrealistic predictions, such as suggesting that the target dataset is the best source for itself. To address this, we propose a novel transferability metric that combines feature quality with gradients to evaluate both the suitability and adaptability of source model features for target tasks. We evaluate our approach in two new scenarios: source dataset transferability for medical image classification and cross-domain transferability. Our results show that our method outperforms existing transferability metrics in both settings. We also provide insight into the factors influencing transfer performance in medical image classification, as well as the dynamics of cross-domain transfer from natural to medical images. Additionally, we provide ground-truth transfer performance benchmarking results to encourage further research into transferability estimation for medical image classification. Our code and experiments are available at https://github.com/DovileDo/transferability-in-medical-imaging.

Continual Learning on a Data Diet

Oct 23, 2024

Elif Ceren Gok Yildirim, Murat Onur Yildirim, Joaquin Vanschoren

Abstract:Continual Learning (CL) methods usually learn from all available data. However, this is not the case in human cognition which efficiently focuses on key experiences while disregarding the redundant information. Similarly, not all data points in a dataset have equal potential; some can be more informative than others. This disparity may significantly impact the performance, as both the quality and quantity of samples directly influence the model's generalizability and efficiency. Drawing inspiration from this, we explore the potential of learning from important samples and present an empirical study for evaluating coreset selection techniques in the context of CL to stimulate research in this unexplored area. We train different continual learners on increasing amounts of selected samples and investigate the learning-forgetting dynamics by shedding light on the underlying mechanisms driving their improved stability-plasticity balance. We present several significant observations: learning from selectively chosen samples (i) enhances incremental accuracy, (ii) improves knowledge retention of previous tasks, and (iii) refines learned representations. This analysis contributes to a deeper understanding of selective learning strategies in CL scenarios.

* 18 pages, 6 figures

Learning to Learn without Forgetting using Attention

Aug 06, 2024

Anna Vettoruzzo, Joaquin Vanschoren, Mohamed-Rafik Bouguelia, Thorsteinn Rögnvaldsson

Abstract:Continual learning (CL) refers to the ability to continually learn over time by accommodating new knowledge while retaining previously learned experience. While this concept is inherent in human learning, current machine learning methods are highly prone to overwrite previously learned patterns and thus forget past experience. Instead, model parameters should be updated selectively and carefully, avoiding unnecessary forgetting while optimally leveraging previously learned patterns to accelerate future learning. Since hand-crafting effective update mechanisms is difficult, we propose meta-learning a transformer-based optimizer to enhance CL. This meta-learned optimizer uses attention to learn the complex relationships between model parameters across a stream of tasks, and is designed to generate effective weight updates for the current task while preventing catastrophic forgetting on previously encountered tasks. Evaluations on benchmark datasets like SplitMNIST, RotatedMNIST, and SplitCIFAR-100 affirm the efficacy of the proposed approach in terms of both forward and backward transfer, even on small sets of labeled data, highlighting the advantages of integrating a meta-learned optimizer within the continual learning framework.

* Published at 3rd Conference on Lifelong Learning Agents (CoLLAs), 2024
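
Forward and backward transfer, the two measures named in the abstract above, are commonly computed from a matrix of per-task accuracies recorded after each training stage. The sketch below follows the widely used definitions from the continual learning literature; whether this paper uses exactly these formulas is an assumption, and the accuracy matrix and baseline values are invented.

```python
def backward_transfer(acc):
    """Average change in accuracy on earlier tasks after all training.

    acc[i][j] is the accuracy on task j after training on tasks 0..i.
    Negative values indicate forgetting.
    """
    n = len(acc)
    if n < 2:
        raise ValueError("need at least two tasks")
    return sum(acc[-1][j] - acc[j][j] for j in range(n - 1)) / (n - 1)


def forward_transfer(acc, baseline):
    """Average gain on a task just before it is trained on, relative to
    baseline[j], the accuracy of an untrained model on task j."""
    n = len(acc)
    if n < 2:
        raise ValueError("need at least two tasks")
    return sum(acc[j - 1][j] - baseline[j] for j in range(1, n)) / (n - 1)


# Hypothetical accuracies for a stream of three tasks.
acc = [
    [0.90, 0.20, 0.10],
    [0.80, 0.85, 0.30],
    [0.75, 0.80, 0.88],
]
baseline = [0.10, 0.10, 0.10]

bwt = backward_transfer(acc)
fwt = forward_transfer(acc, baseline)
print(round(bwt, 2), round(fwt, 2))
```

With these made-up numbers, backward transfer is negative (some forgetting on the first two tasks) and forward transfer is positive (earlier training helps on tasks not yet seen).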

Robust and Efficient Transfer Learning via Supernet Transfer in Warm-started Neural Architecture Search

Jul 26, 2024

Prabhant Singh, Joaquin Vanschoren

Abstract:Hand-designing Neural Networks is a tedious process that requires significant expertise. Neural Architecture Search (NAS) frameworks offer a very useful and popular solution that helps to democratize AI. However, these NAS frameworks are often computationally expensive to run, which limits their applicability and accessibility. In this paper, we propose a novel transfer learning approach, capable of effectively transferring pretrained supernets based on Optimal Transport or multi-dataset pretraining. This method can be generally applied to NAS methods based on Differentiable Architecture Search (DARTS). Through extensive experiments across dozens of image classification tasks, we demonstrate that transferring pretrained supernets in this way can not only drastically speed up the supernet training which then finds optimal models (3 to 5 times faster on average), but even yield models that outperform those found when running DARTS methods from scratch. We also observe positive transfer to almost all target datasets, making it very robust. Besides drastically improving the applicability of NAS methods, this also opens up new applications for continual learning and related fields.

Can time series forecasting be automated? A benchmark and analysis

Jul 25, 2024

Anvitha Thirthapura Sreedhara, Joaquin Vanschoren

Abstract:In the field of machine learning and artificial intelligence, time series forecasting plays a pivotal role across various domains such as finance, healthcare, and weather. However, selecting the most suitable forecasting method for a given dataset is a complex task due to the diversity of data patterns and characteristics. This research aims to address this challenge by proposing a comprehensive benchmark for evaluating and ranking time series forecasting methods across a wide range of datasets. This study investigates the comparative performance of many methods from two prominent time series forecasting frameworks, AutoGluon-Timeseries and sktime, to shed light on their applicability in different real-world scenarios. This research contributes to the field of time series forecasting by providing a robust benchmarking methodology and facilitating informed decision-making when choosing forecasting methods for achieving optimal prediction.

HyTAS: A Hyperspectral Image Transformer Architecture Search Benchmark and Analysis

Jul 23, 2024

Fangqin Zhou, Mert Kilickaya, Joaquin Vanschoren, Ran Piao

Abstract:Hyperspectral Imaging (HSI) plays an increasingly critical role in precise vision tasks within remote sensing, capturing a wide spectrum of visual data. Transformer architectures have significantly enhanced HSI task performance, while advancements in Transformer Architecture Search (TAS) have improved model discovery. To harness these advancements for HSI classification, we make the following contributions: i) We propose HyTAS, the first benchmark on transformer architecture search for Hyperspectral imaging, ii) We comprehensively evaluate 12 different methods to identify the optimal transformer over 5 different datasets, iii) We perform an extensive factor analysis on the Hyperspectral transformer search performance, greatly motivating future research in this direction. All benchmark materials are available at HyTAS.

* The paper is accepted at ECCV2024

CLAMS: A System for Zero-Shot Model Selection for Clustering

Jul 15, 2024

Prabhant Singh, Pieter Gijsbers, Murat Onur Yildirim, Elif Ceren Gok, Joaquin Vanschoren

Abstract:We propose an AutoML system that enables model selection on clustering problems by leveraging optimal transport-based dataset similarity. Our objective is to establish a comprehensive AutoML pipeline for clustering problems and provide recommendations for selecting the most suitable algorithms, thus opening up a new area of AutoML beyond the traditional supervised learning settings. We compare our results against multiple clustering baselines and find that it outperforms all of them, hence demonstrating the utility of similarity-based automated model selection for solving clustering applications.
