Manuel López-Ibañez

MO-IOHinspector: Anytime Benchmarking of Multi-Objective Algorithms using IOHprofiler

Dec 10, 2024

Diederick Vermetten, Jeroen Rook, Oliver L. Preuß, Jacob de Nobel, Carola Doerr, Manuel López-Ibañez, Heike Trautmann, Thomas Bäck

Abstract: Benchmarking is one of the key ways in which we can gain insight into the strengths and weaknesses of optimization algorithms. In sampling-based optimization, considering the anytime behavior of an algorithm can provide valuable insights for further developments. In the context of multi-objective optimization, this anytime perspective is not as widely adopted as in the single-objective context. In this paper, we propose a new software tool which uses principles from unbounded archiving as a logging structure. This leads to a clearer separation between experimental design and subsequent analysis decisions. We integrate this approach as a new Python module into the IOHprofiler framework and demonstrate the benefits of this approach by showcasing the ability to change indicators, aggregations, and ranking procedures during the analysis pipeline.
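
The idea of using an unbounded archive as a logging structure can be illustrated with a minimal sketch. This is not the MO-IOHinspector API: the class `UnboundedArchiveLog`, the helper `hypervolume_2d`, the sample points, and the reference point are all hypothetical, and the sketch assumes two objectives that are both minimized. Each point that is non-dominated when it is evaluated is stored with its evaluation count, so an indicator can be chosen and computed for any budget after the run has finished.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # assumption: two objectives, both minimized


def dominates(a: Point, b: Point) -> bool:
    """True if a is no worse than b in both objectives and differs from b."""
    return a[0] <= b[0] and a[1] <= b[1] and a != b


class UnboundedArchiveLog:
    """Hypothetical logger: stores every point that is non-dominated when evaluated."""

    def __init__(self) -> None:
        self.records: List[Tuple[int, Point]] = []  # (evaluation count, point)
        self.evals = 0

    def log(self, point: Point) -> None:
        self.evals += 1
        # Skip points that duplicate or are dominated by an earlier record.
        if not any(p == point or dominates(p, point) for _, p in self.records):
            self.records.append((self.evals, point))

    def front_at(self, budget: int) -> List[Point]:
        """Non-dominated set among all points logged within `budget` evaluations."""
        pts = [p for e, p in self.records if e <= budget]
        return [p for p in pts if not any(dominates(q, p) for q in pts)]


def hypervolume_2d(front: List[Point], ref: Point) -> float:
    """Area dominated by `front` and bounded by the reference point `ref`."""
    pts = sorted(p for p in front if p[0] < ref[0] and p[1] < ref[1])
    hv, prev_f2 = 0.0, ref[1]
    for f1, f2 in pts:
        if f2 < prev_f2:  # only points that improve the second objective add area
            hv += (ref[0] - f1) * (prev_f2 - f2)
            prev_f2 = f2
    return hv


# Made-up evaluations in the order an algorithm might produce them.
archive = UnboundedArchiveLog()
for pt in [(3.0, 3.0), (1.0, 4.0), (2.0, 2.0), (4.0, 1.0), (2.5, 2.5)]:
    archive.log(pt)

# The indicator and reference point are chosen at analysis time, not during the run.
for budget in (2, 5):
    print(budget, hypervolume_2d(archive.front_at(budget), (5.0, 5.0)))
```

Because the log keeps points that were later dominated, the front at any earlier budget can still be reconstructed, and swapping `hypervolume_2d` for another indicator requires no new experiments.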

Analyzing the Impact of Undersampling on the Benchmarking and Configuration of Evolutionary Algorithms

Apr 22, 2022

Diederick Vermetten, Hao Wang, Manuel López-Ibañez, Carola Doerr, Thomas Bäck

Abstract: The stochastic nature of iterative optimization heuristics leads to inherently noisy performance measurements. Since these measurements are often gathered once and then used repeatedly, the number of collected samples will have a significant impact on the reliability of algorithm comparisons. We show that care should be taken when making decisions based on limited data. In particular, we show that the number of runs used in many benchmarking studies, e.g., the default value of 15 suggested by the COCO environment, can be insufficient to reliably rank algorithms on well-known numerical optimization benchmarks. Additionally, methods for automated algorithm configuration are sensitive to insufficient sample sizes. This may result in the configurator choosing a 'lucky' but poor-performing configuration despite exploring better ones. We show that relying on mean performance values, as many configurators do, can require a large number of runs to provide accurate comparisons between the considered configurations. Common statistical tests can greatly improve the situation in most cases but not always. We show examples of performance losses of more than 20%, even when using statistical races to dynamically adjust the number of runs, as done by irace. Our results underline the importance of appropriately considering the statistical distribution of performance values.
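
The effect of sample size on ranking reliability can be sketched with a small simulation. This is an illustration on synthetic data, not the paper's experiment: the function `misranking_rate`, the normal distributions, and the chosen means and standard deviation are all assumptions. Two hypothetical algorithms are compared by their mean performance over `n_runs` runs (lower is better), and the function estimates how often the truly worse algorithm appears at least as good.

```python
import random


def misranking_rate(n_runs: int, trials: int = 2000, mean_a: float = 1.0,
                    mean_b: float = 1.1, sd: float = 0.5, seed: int = 0) -> float:
    """Fraction of trials in which B (truly worse) looks at least as good as A.

    Assumption: performance values are normally distributed and minimized.
    """
    rng = random.Random(seed)  # fixed seed so repeated calls give the same estimate
    wrong = 0
    for _ in range(trials):
        a = [rng.gauss(mean_a, sd) for _ in range(n_runs)]
        b = [rng.gauss(mean_b, sd) for _ in range(n_runs)]
        # A has the lower true mean, so any trial where its sample mean
        # is not strictly lower is counted as a wrong ranking.
        if sum(a) / n_runs >= sum(b) / n_runs:
            wrong += 1
    return wrong / trials


# Compare a small sample size (15 runs) against larger ones.
for n in (15, 50, 200):
    print(f"{n:>3} runs per algorithm: misranking rate {misranking_rate(n):.3f}")
```

Under these assumed distributions, the misranking rate shrinks as the number of runs grows, which is the qualitative behavior the abstract warns about for small samples.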

* To be published in the Proceedings of the Genetic and Evolutionary Computation Conference (GECCO '22), July 9-13, 2022, Boston, MA, USA. ACM, New York, NY, USA, 9 pages. https://doi.org/10.1145/3512290.3528799
