Abstract: Machine learning (ML)-based solutions are rapidly changing the landscape of many fields, including structural engineering. Despite their promising performance, these approaches are usually demonstrated only as proofs of concept in structural engineering and are rarely deployed in real-world applications. This paper uses two examples to illustrate the challenges of developing ML models that are suitable for deployment. Among the various pitfalls, the discussion focuses on model overfitting and underspecification, training data representativeness, variable omission bias, and cross-validation. The results highlight the importance of rigorous model validation through adaptive sampling, careful physics-informed feature selection, and consideration of both model complexity and generalizability.
Abstract: Many ecological and spatial processes are complex in nature and are not accurately modeled by linear models. Regression trees promise to handle the high-order interactions present in ecological and spatial datasets, but fail to produce physically realistic characterizations of the underlying landscape. The "autocart" (autocorrelated regression trees) R package extends previously proposed spatial regression tree methods with a spatially aware splitting function and a novel adaptive inverse distance weighting method in each terminal node. The efficacy of autocart models, including an autocart extension of random forest, is demonstrated on multiple datasets. These results highlight the ability of autocart to model complex interactions between spatial variables while still providing physically realistic representations of the landscape.
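
For readers unfamiliar with inverse distance weighting, the following is a minimal Python sketch of the standard, fixed-power form of the technique. It is not the autocart implementation: the abstract does not describe how the adaptive variant chooses its weights, so the function name `idw_predict`, its parameters, and the fixed `power` exponent are all assumptions made for illustration. The sketch assumes the known points are the training observations that fall within a single terminal node.

```python
import math


def idw_predict(known_points, known_values, query, power=2.0):
    """Plain inverse distance weighting (hypothetical helper, not autocart's API).

    known_points: list of (x, y) coordinates, e.g. observations in one node.
    known_values: response value observed at each known point.
    query:        (x, y) coordinate at which to predict.
    power:        fixed distance exponent (assumed; autocart adapts its weighting).
    """
    weighted = []
    for (x, y), value in zip(known_points, known_values):
        dist = math.hypot(query[0] - x, query[1] - y)
        if dist == 0.0:
            # The query coincides with an observation: return it exactly.
            return value
        # Closer observations receive larger weights.
        weighted.append((dist ** -power, value))

    total_weight = sum(w for w, _ in weighted)
    return sum(w * v for w, v in weighted) / total_weight


# A query midway between two observations receives their average.
print(idw_predict([(0.0, 0.0), (2.0, 0.0)], [10.0, 20.0], (1.0, 0.0)))  # 15.0
```

Because the prediction is a weighted average of nearby observations, it varies smoothly with location inside a node instead of taking a single constant value, which is one plausible reading of how such weighting yields more physically realistic surfaces than an ordinary regression tree.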