Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Ruinan Wang

Grouped Sequential Optimization Strategy -- the Application of Hyperparameter Importance Assessment in Deep Learning

Mar 07, 2025

Ruinan Wang, Ian Nabney, Mohammad Golbabaee

Abstract:Hyperparameter optimization (HPO) is a critical component of machine learning pipelines, significantly affecting model robustness, stability, and generalization. However, HPO is often a time-consuming and computationally intensive task. Traditional HPO methods, such as grid search and random search, often suffer from inefficiency. Bayesian optimization, while more efficient, still struggles with high-dimensional search spaces. In this paper, we contribute to the field by exploring how insights gained from hyperparameter importance assessment (HIA) can be leveraged to accelerate HPO, reducing both time and computational resources. Building on prior work that quantified hyperparameter importance by evaluating 10 hyperparameters on CNNs using 10 common image classification datasets, we implement a novel HPO strategy called 'Sequential Grouping.' That prior work assessed the importance weights of the investigated hyperparameters based on their influence on model performance, providing valuable insights that we leverage to optimize our HPO process. Our experiments, validated across six additional image classification datasets, demonstrate that incorporating hyperparameter importance assessment (HIA) can significantly accelerate HPO without compromising model performance, reducing optimization time by an average of 31.9\% compared to the conventional simultaneous strategy.

* 12 pages

Via

Access Paper or Ask Questions

Efficient Hyperparameter Importance Assessment for CNNs

Oct 11, 2024

Ruinan Wang, Ian Nabney, Mohammad Golbabaee

Figure 1 for Efficient Hyperparameter Importance Assessment for CNNs

Figure 2 for Efficient Hyperparameter Importance Assessment for CNNs

Figure 3 for Efficient Hyperparameter Importance Assessment for CNNs

Figure 4 for Efficient Hyperparameter Importance Assessment for CNNs

Abstract:Hyperparameter selection is an essential aspect of the machine learning pipeline, profoundly impacting models' robustness, stability, and generalization capabilities. Given the complex hyperparameter spaces associated with Neural Networks and the constraints of computational resources and time, optimizing all hyperparameters becomes impractical. In this context, leveraging hyperparameter importance assessment (HIA) can provide valuable guidance by narrowing down the search space. This enables machine learning practitioners to focus their optimization efforts on the hyperparameters with the most significant impact on model performance while conserving time and resources. This paper aims to quantify the importance weights of some hyperparameters in Convolutional Neural Networks (CNNs) with an algorithm called N-RReliefF, laying the groundwork for applying HIA methodologies in the Deep Learning field. We conduct an extensive study by training over ten thousand CNN models across ten popular image classification datasets, thereby acquiring a comprehensive dataset containing hyperparameter configuration instances and their corresponding performance metrics. It is demonstrated that among the investigated hyperparameters, the top five important hyperparameters of the CNN model are the number of convolutional layers, learning rate, dropout rate, optimizer and epoch.

* 15 pages

Via

Access Paper or Ask Questions

Masked Imitation Learning: Discovering Environment-Invariant Modalities in Multimodal Demonstrations

Sep 16, 2022

Yilun Hao, Ruinan Wang, Zhangjie Cao, Zihan Wang, Yuchen Cui, Dorsa Sadigh

Figure 1 for Masked Imitation Learning: Discovering Environment-Invariant Modalities in Multimodal Demonstrations

Figure 2 for Masked Imitation Learning: Discovering Environment-Invariant Modalities in Multimodal Demonstrations

Figure 3 for Masked Imitation Learning: Discovering Environment-Invariant Modalities in Multimodal Demonstrations

Figure 4 for Masked Imitation Learning: Discovering Environment-Invariant Modalities in Multimodal Demonstrations

Abstract:Multimodal demonstrations provide robots with an abundance of information to make sense of the world. However, such abundance may not always lead to good performance when it comes to learning sensorimotor control policies from human demonstrations. Extraneous data modalities can lead to state over-specification, where the state contains modalities that are not only useless for decision-making but also can change data distribution across environments. State over-specification leads to issues such as the learned policy not generalizing outside of the training data distribution. In this work, we propose Masked Imitation Learning (MIL) to address state over-specification by selectively using informative modalities. Specifically, we design a masked policy network with a binary mask to block certain modalities. We develop a bi-level optimization algorithm that learns this mask to accurately filter over-specified modalities. We demonstrate empirically that MIL outperforms baseline algorithms in simulated domains including MuJoCo and a robot arm environment using the Robomimic dataset, and effectively recovers the environment-invariant modalities on a multimodal dataset collected on a real robot. Our project website presents supplemental details and videos of our results at: https://tinyurl.com/masked-il

* 13 pages

Via

Access Paper or Ask Questions