Abstract: Automated machine learning (AutoML) systems propose an end-to-end solution to a given machine learning problem, creating either fixed or flexible pipelines. Fixed pipelines are task-independent constructs: their general composition remains the same regardless of the data. In contrast, the structure of flexible pipelines varies depending on the input, which makes them finely tailored to individual tasks. However, flexible pipelines can be structurally overcomplicated and poorly explainable. We propose the EVOSA approach, which compensates for these drawbacks of flexible pipelines by incorporating a sensitivity analysis that increases the robustness and interpretability of the flexible solutions. EVOSA quantitatively estimates the positive and negative impact of each edge or node in a pipeline graph and feeds this information to the evolutionary AutoML optimizer. The correctness and efficiency of EVOSA were validated on tabular, multimodal, and computer vision tasks, suggesting that the proposed approach generalizes across domains.
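The idea of estimating the impact of a single node on a pipeline graph can be illustrated with a minimal sketch. This is not the EVOSA implementation: it assumes a simple delete-one-node scheme, a pipeline stored as a dictionary of parent lists, and a higher-is-better metric. The names `remove_node`, `node_sensitivity`, `toy_evaluate`, and the values in `TOY_GAIN` are hypothetical and exist only for illustration.

```python
from typing import Callable, Dict, List, Sequence

# A pipeline is a DAG stored as: node name -> list of parent node names.
Pipeline = Dict[str, List[str]]


def remove_node(pipeline: Pipeline, node: str) -> Pipeline:
    """Return a copy without `node`; its children are rewired to its parents."""
    parents = pipeline[node]
    reduced: Pipeline = {}
    for name, inputs in pipeline.items():
        if name == node:
            continue
        rewired: List[str] = []
        for parent in inputs:
            # Replace a reference to the removed node by that node's own parents.
            for candidate in (parents if parent == node else [parent]):
                if candidate not in rewired:
                    rewired.append(candidate)
        reduced[name] = rewired
    return reduced


def node_sensitivity(
    pipeline: Pipeline,
    evaluate: Callable[[Pipeline], float],
    protected: Sequence[str] = (),
) -> Dict[str, float]:
    """Score drop caused by deleting each node (higher-is-better metric assumed)."""
    baseline = evaluate(pipeline)
    return {
        node: baseline - evaluate(remove_node(pipeline, node))
        for node in pipeline
        if node not in protected
    }


# Toy stand-in for a real validation metric: made-up per-node gains (assumption).
TOY_GAIN = {"scaling": 0.05, "pca": -0.02, "knn": 0.0, "rf": 0.80}


def toy_evaluate(pipeline: Pipeline) -> float:
    return sum(TOY_GAIN[name] for name in pipeline)


toy_pipeline: Pipeline = {
    "scaling": [],
    "pca": ["scaling"],
    "knn": ["pca"],
    "rf": ["pca", "knn"],
}
# The final model is protected so that the pipeline always keeps an output node.
scores = node_sensitivity(toy_pipeline, toy_evaluate, protected=["rf"])
```

Under these assumptions, a positive score means that deleting the node hurts the metric, while a zero or negative score marks the node as a candidate for removal. In a real setting, `toy_evaluate` would be replaced by fitting and validating the reduced pipeline on data.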
Abstract: Geomechanical monitoring of a rock massif is an actively developing branch of geomechanics. It is almost impossible to single out one methodology or set of approaches for data collection and analysis when developing seismic monitoring systems. Changes in the state of structural inhomogeneities manifest most clearly during mining in a rock massif. Existing natural structural inhomogeneities are revealed, movements occur in discontinuous disturbances, and new technogenic disturbances form, accompanied by a change in the natural stress state of various blocks of the massif. An important task is to develop a mining forecasting model that takes into account the structural heterogeneity of the rock massif and selects the necessary forecast horizon depending on monitoring data. The developed method for evaluating the results of monitoring geomechanical processes in the rock massif allowed us to forecast zones of possible rock bursts.
Abstract: The effectiveness of machine learning methods for real-world tasks depends on the proper structure of the modeling pipeline. The proposed approach aims to automate the design of composite machine learning pipelines, which are equivalent to computational workflows that consist of models and data operations. The approach combines key ideas of both automated machine learning and workflow management systems. It designs pipelines with a customizable graph-based structure, analyzes the obtained results, and reproduces them. An evolutionary approach is used for the flexible identification of the pipeline structure. Additional algorithms for sensitivity analysis, atomization, and hyperparameter tuning are implemented to improve the effectiveness of the approach. The software implementation of this approach is presented as an open-source framework. A set of experiments is conducted on different datasets and tasks (classification, regression, time series forecasting). The obtained results confirm the correctness and effectiveness of the proposed approach in comparison with state-of-the-art competitors and baseline solutions.
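The evolutionary identification of a pipeline structure mentioned above can be sketched in a strongly simplified form. This sketch is not the framework's algorithm: it assumes linear pipelines instead of graphs, mutation-only variation, and (mu + lambda) selection. The operation names, `toy_fitness`, and all numeric settings are hypothetical placeholders for a real train-and-validate step.

```python
import random
from typing import Callable, List, Tuple

Pipeline = Tuple[str, ...]  # simplification: a linear chain of operations

# Hypothetical operation names, chosen only for illustration.
OPERATIONS = ["scaling", "pca", "poly_features", "ridge", "tree", "knn"]
MODELS = {"ridge", "tree", "knn"}


def mutate(pipeline: Pipeline, rng: random.Random, max_len: int = 5) -> Pipeline:
    """Apply one structural mutation: add, remove, or replace an operation."""
    ops = list(pipeline)
    kind = rng.choice(["add", "remove", "replace"])
    if kind == "add" and len(ops) < max_len:
        ops.insert(rng.randrange(len(ops) + 1), rng.choice(OPERATIONS))
    elif kind == "remove" and len(ops) > 1:
        ops.pop(rng.randrange(len(ops)))
    else:
        ops[rng.randrange(len(ops))] = rng.choice(OPERATIONS)
    return tuple(ops)


def evolve(
    fitness: Callable[[Pipeline], float],
    generations: int = 30,
    pop_size: int = 10,
    seed: int = 0,
) -> Tuple[Pipeline, List[float]]:
    """Return the best pipeline found and the best fitness per generation."""
    rng = random.Random(seed)
    population: List[Pipeline] = [(rng.choice(OPERATIONS),) for _ in range(pop_size)]
    history: List[float] = []
    for _ in range(generations):
        offspring = [mutate(p, rng) for p in population]
        # Parents and offspring compete, so the best fitness never decreases.
        unique = list(dict.fromkeys(population + offspring))
        population = sorted(unique, key=fitness, reverse=True)[:pop_size]
        history.append(fitness(population[0]))
    return population[0], history


def toy_fitness(pipeline: Pipeline) -> float:
    """Made-up score: reward a final model and earlier scaling, penalize length."""
    score = 1.0 if pipeline[-1] in MODELS else 0.0
    if "scaling" in pipeline[:-1]:
        score += 0.2
    return score - 0.05 * len(pipeline)


best, history = evolve(toy_fitness)
```

In practice, the fitness function would fit each candidate pipeline and return a validation metric, which makes evaluation the dominant cost of the search.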
Abstract: The paper describes the use of intelligent approaches for field development tasks that may assist the decision-making process. We focus on the problem of well location optimization and two tasks within it: improving the quality of oil production estimation, and estimating reservoir characteristics for appropriate well allocation and parametrization using machine learning methods. For oil production estimation, we implemented forecasting models of three types (physics-based, purely data-driven, and hybrid) and investigated their quality. The CRMIP model was chosen as the physics-based approach. We compare it with the machine learning and hybrid methods on the oil production forecasting task. For the investigation of reservoir characteristics for well location choice, we automated the seismic analysis using evolutionary identification of a convolutional neural network for reservoir detection. The Volve oil field dataset was used as a case study to conduct the experiments. The implemented approaches can be used to analyze different oil fields or adapted to similar physics-related problems.
Abstract: In this paper, a multi-objective approach for the design of composite data-driven mathematical models is proposed. It automates the identification of graph-based heterogeneous pipelines that consist of different blocks: machine learning models, data preprocessing blocks, etc. The implemented approach is based on a parameter-free genetic algorithm (GA) for model design called GPComp@Free. It is developed to be part of automated machine learning solutions and to increase the efficiency of modeling pipeline automation. A set of experiments was conducted to verify the correctness and efficiency of the proposed approach and to substantiate the selected solutions. The experimental results confirm that a multi-objective approach to model design achieves better diversity and quality of the obtained models. The implemented approach is available as part of the open-source AutoML framework FEDOT.
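The core of a multi-objective selection step can be shown with a small Pareto-front filter. This is a generic sketch, not the GPComp@Free algorithm: it assumes two objectives that are both minimized (validation error and structural complexity), and the candidate values are made up for illustration.

```python
from typing import List, Sequence, Tuple

Objectives = Tuple[float, ...]  # all objectives are minimized


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if `a` is no worse than `b` everywhere and strictly better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points: List[Objectives]) -> List[Objectives]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical candidates: (validation error, number of pipeline nodes).
candidates: List[Objectives] = [
    (0.10, 5),
    (0.12, 3),
    (0.10, 7),  # same error as the first candidate but more complex
    (0.20, 1),
    (0.25, 2),  # worse than (0.20, 1) in both objectives
]
front = pareto_front(candidates)
```

Here the front keeps three candidates that trade error against complexity, which is one simple way to see why a multi-objective formulation can preserve more diverse models than ranking by a single metric.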