Abstract:Deep Learning has revolutionized the field of AI and led to remarkable achievements in applications involving image and text data. Unfortunately, there is inconclusive evidence on the merits of neural networks for structured tabular data. In this paper, we introduce a large-scale empirical study comparing not only neural networks against gradient-boosted decision trees on tabular data, but also transformer-based architectures against traditional multi-layer perceptrons (MLPs) with residual connections. In contrast to prior work, our empirical findings indicate that neural networks are competitive with decision trees. Furthermore, we find that transformer-based architectures do not outperform simpler variants of traditional MLP architectures on tabular datasets. As a result, this paper helps the research and practitioner communities make informed choices when deploying neural networks in future tabular data applications.
Abstract:With the ever-increasing number of pretrained models, machine learning practitioners are continuously faced with the decision of which pretrained model to use and how to finetune it for a new dataset. In this paper, we propose a methodology that jointly searches for the optimal pretrained model and the hyperparameters for finetuning it. Our method transfers knowledge about the performance of many pretrained models with multiple hyperparameter configurations across a series of datasets. To this end, we evaluated over 20k hyperparameter configurations for finetuning 24 pretrained image classification models on 87 datasets to generate a large-scale meta-dataset. We meta-learn a multi-fidelity performance predictor on the learning curves of this meta-dataset and use it for fast hyperparameter optimization on new datasets. We empirically demonstrate that our resulting approach can quickly select an accurate pretrained model for a new dataset together with its optimal hyperparameters.
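To make the transfer idea above concrete, here is a minimal, hypothetical sketch: a stand-in regressor (not the meta-learned multi-fidelity predictor from the paper) is fit on meta-dataset rows pairing a pretrained model id, a finetuning hyperparameter, and a budget with the observed validation accuracy, and is then used to rank candidate (model, configuration) pairs for a new dataset. All column names and values below are illustrative.

```python
# Sketch: transfer performance knowledge across datasets with a stand-in
# regressor; this is NOT the paper's meta-learned predictor.
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor

# Hypothetical slice of a meta-dataset of finetuning outcomes.
meta = pd.DataFrame({
    "model_id":      [0, 0, 1, 1, 2, 2],
    "learning_rate": [1e-3, 1e-2, 1e-3, 1e-2, 1e-3, 1e-2],
    "epochs_seen":   [10, 10, 10, 10, 10, 10],
    "val_accuracy":  [0.71, 0.68, 0.80, 0.83, 0.74, 0.69],
})

predictor = GradientBoostingRegressor().fit(
    meta[["model_id", "learning_rate", "epochs_seen"]], meta["val_accuracy"]
)

# Rank candidate (pretrained model, finetuning hyperparameter) pairs for a new dataset.
candidates = pd.DataFrame({
    "model_id":      [0, 1, 2],
    "learning_rate": [5e-3, 5e-3, 5e-3],
    "epochs_seen":   [10, 10, 10],
})
scores = predictor.predict(candidates)
best = candidates.iloc[scores.argmax()]
print("start finetuning with:", dict(best))
```

In the actual method the predictor is meta-learned on full learning curves and queried at multiple fidelities; the sketch collapses this to a single budget purely for illustration.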
Abstract:Deep Learning has achieved tremendous results by pushing the frontier of automation in diverse domains. Unfortunately, current neural network architectures are not explainable by design. In this paper, we propose a novel method that trains deep hypernetworks to generate explainable linear models. Our models retain the accuracy of black-box deep networks while offering explainability by design at no extra cost: our explainable approach requires the same runtime and memory resources as black-box deep models, ensuring practical feasibility. Through extensive experiments, we demonstrate that our explainable deep networks are as accurate as state-of-the-art classifiers on tabular data. Moreover, we showcase the interpretability of our method on a recent benchmark that empirically compares prediction explainers. The experimental results reveal that our models are not only as accurate as their black-box deep-learning counterparts but also as interpretable as state-of-the-art explanation techniques.
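The core mechanism can be sketched as follows: a hypernetwork consumes the input and emits the weights and bias of a per-instance linear model, so the prediction is a plain dot product and the generated weights act as the explanation. This is a minimal, assumed PyTorch formulation, not the authors' exact architecture.

```python
# Minimal sketch (assumed architecture): a hypernetwork h(x) outputs the
# parameters of a linear model, and the prediction is w(x)^T x + b(x).
import torch
import torch.nn as nn

class LinearizingHypernet(nn.Module):
    def __init__(self, num_features: int, num_classes: int, hidden: int = 128):
        super().__init__()
        self.hyper = nn.Sequential(
            nn.Linear(num_features, hidden), nn.ReLU(),
            nn.Linear(hidden, num_classes * (num_features + 1)),
        )
        self.num_features, self.num_classes = num_features, num_classes

    def forward(self, x):
        params = self.hyper(x).view(-1, self.num_classes, self.num_features + 1)
        w, b = params[..., :-1], params[..., -1]        # per-instance weights and bias
        logits = torch.einsum("bcf,bf->bc", w, x) + b   # linear model applied to x
        return logits, w                                # w doubles as the explanation

x = torch.randn(4, 10)
logits, explanation = LinearizingHypernet(num_features=10, num_classes=3)(x)
print(logits.shape, explanation.shape)  # (4, 3) and (4, 3, 10)
```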
Abstract:Hyperparameter optimization is an important subfield of machine learning that focuses on tuning the hyperparameters of a chosen algorithm to achieve peak performance. Recently, a stream of methods has tackled the issue of hyperparameter optimization; however, most of them do not exploit the scaling-law property of learning curves. In this work, we propose Deep Power Laws (DPL), an ensemble of neural network models conditioned to yield predictions that follow a power-law scaling pattern. Our method dynamically decides which configurations to pause and which to train incrementally by making use of gray-box evaluations. We compare our method against 7 state-of-the-art competitors on 3 benchmarks related to tabular, image, and NLP datasets covering 57 diverse tasks. Our method achieves the best any-time results across all benchmarks, outperforming all competitors.
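As a small illustration of the scaling-law property that DPL exploits (not the DPL ensemble itself), one can fit a power law of the form y(b) = y_inf - a * b^(-c) to the observed prefix of a learning curve and extrapolate it to larger budgets; the values below are synthetic.

```python
# Illustration of the power-law assumption behind learning-curve extrapolation;
# this is a plain least-squares fit, not DPL's neural-network ensemble.
import numpy as np
from scipy.optimize import curve_fit

def power_law(b, y_inf, a, c):
    return y_inf - a * np.power(b, -c)

budgets = np.arange(1, 11)                                   # epochs observed so far
accuracy = 0.9 - 0.4 * budgets ** -0.7 + np.random.normal(0, 0.005, 10)  # synthetic curve

params, _ = curve_fit(power_law, budgets, accuracy, p0=[1.0, 0.5, 0.5], maxfev=10000)
print("predicted accuracy at epoch 100:", power_law(100, *params))
```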
Abstract:Gray-box hyperparameter optimization techniques have recently emerged as a promising direction for tuning Deep Learning methods. In this work, we introduce DyHPO, a method that learns to dynamically decide which configuration to try next, and for what budget. Our technique is a modification of classical Bayesian optimization for the gray-box setup. Concretely, we propose a new Gaussian Process surrogate that embeds the learning curve dynamics and a new acquisition function that incorporates multi-budget information. We demonstrate the significant superiority of DyHPO over state-of-the-art hyperparameter optimization baselines through large-scale experiments comprising 50 datasets (tabular, image, NLP) and diverse neural networks (MLP, CNN/NAS, RNN).
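A compact sketch of a gray-box loop in this spirit is shown below; it replaces the paper's learning-curve surrogate with a stock scikit-learn Gaussian Process over (configuration, budget) features and scores candidates with expected improvement one budget step ahead. It is an assumed simplification, not DyHPO itself, and the benchmark function is hypothetical.

```python
# Sketch of a dynamic gray-box loop: fit a surrogate on (config, budget) pairs,
# score each configuration one budget step ahead, and advance the best one.
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor

rng = np.random.default_rng(0)
configs = rng.uniform(0, 1, size=(8, 2))       # 8 hyperparameter configurations
budget = np.ones(8, dtype=int)                 # epochs trained so far per configuration
history_X, history_y = [], []

def observe(cfg, b):                           # hypothetical cheap benchmark
    return 1 - np.exp(-b * (0.5 + cfg.sum())) + rng.normal(0, 0.01)

for step in range(20):
    if history_X:
        gp = GaussianProcessRegressor().fit(np.array(history_X), np.array(history_y))
        X_next = np.hstack([configs, (budget + 1).reshape(-1, 1)])
        mu, sigma = gp.predict(X_next, return_std=True)
        best = max(history_y)
        z = (mu - best) / np.maximum(sigma, 1e-9)
        ei = (mu - best) * norm.cdf(z) + sigma * norm.pdf(z)   # expected improvement
        i = int(np.argmax(ei))
    else:
        i = 0                                   # no observations yet: start anywhere
    budget[i] += 1                              # train the chosen config one step further
    history_X.append(np.hstack([configs[i], budget[i]]))
    history_y.append(observe(configs[i], budget[i]))
```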
Abstract:Tabular datasets are the last "unconquered castle" for deep learning, with traditional ML methods like Gradient-Boosted Decision Trees still performing strongly even against recent specialized neural architectures. In this paper, we hypothesize that the key to boosting the performance of neural networks lies in rethinking the joint and simultaneous application of a large set of modern regularization techniques. As a result, we propose regularizing plain Multilayer Perceptron (MLP) networks by searching for the optimal combination, or "cocktail", of 13 regularization techniques for each dataset, jointly optimizing the decision of which regularizers to apply together with their subsidiary hyperparameters. We empirically assess the impact of these regularization cocktails for MLPs in a large-scale study comprising 40 tabular datasets and demonstrate that (i) well-regularized plain MLPs significantly outperform recent state-of-the-art specialized neural network architectures, and (ii) they even outperform strong traditional ML methods, such as XGBoost.
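A rough sketch of how such a cocktail could be parameterized and sampled is given below, using only a hypothetical subset of the 13 regularizers (dropout, weight decay, batch normalization); each regularizer contributes an on/off decision plus its own hyperparameters to the joint search space.

```python
# Sketch of a "regularization cocktail" search space for a plain MLP;
# illustrative only, with a hypothetical subset of the regularizers.
import random
import torch.nn as nn

def sample_cocktail():
    return {
        "dropout":      random.choice([None, random.uniform(0.0, 0.8)]),
        "weight_decay": random.choice([None, 10 ** random.uniform(-6, -2)]),
        "batch_norm":   random.choice([True, False]),
    }

def build_mlp(num_features, num_classes, cocktail, hidden=256, layers=3):
    blocks, width = [], num_features
    for _ in range(layers):
        blocks.append(nn.Linear(width, hidden))
        if cocktail["batch_norm"]:
            blocks.append(nn.BatchNorm1d(hidden))
        blocks.append(nn.ReLU())
        if cocktail["dropout"] is not None:
            blocks.append(nn.Dropout(cocktail["dropout"]))
        width = hidden
    blocks.append(nn.Linear(width, num_classes))
    return nn.Sequential(*blocks)

cocktail = sample_cocktail()
model = build_mlp(num_features=20, num_classes=2, cocktail=cocktail)
# weight_decay (if enabled) would be passed to the optimizer, e.g.
# torch.optim.AdamW(model.parameters(), weight_decay=cocktail["weight_decay"] or 0.0)
```

In the paper this joint on/off-plus-hyperparameter search is carried out per dataset by a hyperparameter optimizer rather than by random sampling; the sketch only shows the shape of the search space.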
Abstract:OpenML is an online platform for open science collaboration in machine learning, used to share datasets and results of machine learning experiments. In this paper we introduce \emph{OpenML-Python}, a client API for Python, opening up the OpenML platform to a wide range of Python-based tools. It provides easy access to all datasets, tasks, and experiments on OpenML from within Python. It also provides functionality to conduct machine learning experiments, upload the results to OpenML, and reproduce results that are stored on OpenML. Furthermore, it comes with a scikit-learn plugin and a plugin mechanism to easily integrate other machine learning libraries written in Python into the OpenML ecosystem. Source code and documentation are available at https://github.com/openml/openml-python/.
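A minimal usage example based on the client's documented interface is shown below; the dataset and task ids are only examples.

```python
# Minimal openml-python usage: fetch a dataset, run a scikit-learn model
# on an OpenML task, and (optionally) publish the resulting run.
import openml
from sklearn.ensemble import RandomForestClassifier

# Download a dataset and load it together with its default target attribute.
dataset = openml.datasets.get_dataset(61)  # example id (the "iris" dataset)
X, y, categorical, names = dataset.get_data(target=dataset.default_target_attribute)

# Run a scikit-learn model on an OpenML task via the scikit-learn plugin.
task = openml.tasks.get_task(59)  # example supervised classification task id
run = openml.runs.run_model_on_task(RandomForestClassifier(), task)
# run.publish()  # uploading results requires an OpenML API key
```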