Abstract: It is typical for a machine learning system to have numerous hyperparameters that affect its learning rate and prediction quality. Finding a good combination of hyperparameters is, however, a challenging job. This is mainly because evaluating each combination is computationally very expensive; indeed, training a machine learning system on real data with just a single combination of hyperparameters usually takes hours or even days. In this paper, we address this challenge by trying to predict the performance of the machine learning system with a given combination of hyperparameters without completing the expensive learning process. Instead, we terminate the training process at an early stage, collect the model's performance data, and use it to predict which combination of hyperparameters is most promising. Our preliminary experiments show that such a prediction improves the performance of the commonly used random search approach.
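The overall idea can be illustrated with a minimal sketch, which is not the method evaluated in this paper. It assumes a toy `train` function standing in for real training, a hypothetical two-parameter search space (`lr`, `reg`), and the simplest possible predictor: the score observed at early termination is used directly as the prediction of final performance. All function names and constants are assumptions made for illustration.

```python
import math
import random


def sample_config(rng):
    """Draw one random hyperparameter combination (hypothetical search space)."""
    return {
        "lr": 10 ** rng.uniform(-4, 0),    # log-uniform learning rate
        "reg": 10 ** rng.uniform(-5, -1),  # log-uniform regularization strength
    }


def train(config, steps):
    """Toy stand-in for training: validation score after `steps` steps.

    The score rises toward a plateau that depends on the hyperparameters.
    A real system would train an actual model here.
    """
    plateau = (1.0
               - 0.1 * abs(math.log10(config["lr"]) + 2)
               - 0.05 * abs(math.log10(config["reg"]) + 3))
    return plateau * (1.0 - math.exp(-steps / 50.0))


def early_prediction_search(n_candidates, early_steps, full_steps, top_k, seed=0):
    """Random search that fully trains only the candidates predicted to be best."""
    rng = random.Random(seed)
    candidates = [sample_config(rng) for _ in range(n_candidates)]

    # Stage 1: terminate training early and record each candidate's score.
    early = [(train(c, early_steps), c) for c in candidates]

    # Stage 2: predict the most promising candidates. Assumption: the early
    # score itself serves as the prediction; a learned predictor could be
    # substituted here.
    early.sort(key=lambda pair: pair[0], reverse=True)
    promising = [c for _, c in early[:top_k]]

    # Stage 3: complete the expensive training only for the promising ones.
    final = [(train(c, full_steps), c) for c in promising]
    best_score, best_config = max(final, key=lambda pair: pair[0])

    # Total cost in training steps, for comparison with plain random search
    # (which would cost n_candidates * full_steps).
    cost = n_candidates * early_steps + top_k * full_steps
    return best_config, best_score, cost


if __name__ == "__main__":
    config, score, cost = early_prediction_search(
        n_candidates=20, early_steps=10, full_steps=200, top_k=3, seed=1)
    print(config, score, cost)
```

In this toy setting the early ranking agrees with the final ranking by construction, so the sketch shows only the cost saving; on real learning curves, where early and final rankings can disagree, the quality of the predictor in Stage 2 determines how much is gained.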