Abstract:Automated machine learning (AutoML) systems propose an end-to-end solution to a given machine learning problem, creating either fixed or flexible pipelines. Fixed pipelines are task independent constructs: their general composition remains the same, regardless of the data. In contrast, the structure of flexible pipelines varies depending on the input, making them finely tailored to individual tasks. However, flexible pipelines can be structurally overcomplicated and have poor explainability. We propose the EVOSA approach that compensates for the negative points of flexible pipelines by incorporating a sensitivity analysis which increases the robustness and interpretability of the flexible solutions. EVOSA quantitatively estimates positive and negative impact of an edge or a node on a pipeline graph, and feeds this information to the evolutionary AutoML optimizer. The correctness and efficiency of EVOSA was validated in tabular, multimodal and computer vision tasks, suggesting generalizability of the proposed approach across domains.
Abstract:Resource-intensive computations are a major factor that limits the effectiveness of automated machine learning solutions. In the paper, we propose a modular approach that can be used to increase the quality of evolutionary optimization for modelling pipelines with a graph-based structure. It consists of several stages - parallelization, caching and evaluation. Heterogeneous and remote resources can be involved in the evaluation stage. The conducted experiments confirm the correctness and effectiveness of the proposed approach. The implemented algorithms are available as a part of the open-source framework FEDOT.