Abstract: In machine learning, the exponential growth of data and the associated ``curse of dimensionality'' pose significant challenges, particularly for large yet sparse datasets. To address these challenges, multi-view ensemble learning (MEL) has emerged as a promising approach, with feature partitioning (FP) playing a pivotal role in constructing artificial views for MEL. Our study introduces the Semantic-Preserving Feature Partitioning (SPFP) algorithm, a novel method grounded in information theory. The SPFP algorithm partitions datasets into multiple semantically consistent views, enhancing the MEL process. We evaluate the method through extensive experiments on eight real-world datasets, ranging from high-dimensional datasets with few instances to low-dimensional datasets with many instances. Where high generalization performance is achievable, the method maintains model accuracy while significantly improving uncertainty measures. Conversely, where high generalization accuracy is less attainable, it retains uncertainty metrics while enhancing accuracy. An effect size analysis further shows that the SPFP algorithm outperforms benchmark models with a large effect size in most experiments and reduces computational demands through effective dimensionality reduction.
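The abstract above reports an effect size analysis but does not say which statistic was used. As a minimal sketch, assuming Cohen's d with a pooled standard deviation (an assumption, not something the abstract confirms), the comparison between a method and a benchmark could be computed as follows. The function name `cohens_d` and the sample scores are hypothetical and serve only as illustration.

```python
import math
from statistics import mean, variance


def cohens_d(scores_a, scores_b):
    """Cohen's d between two samples, using the pooled standard deviation.

    Assumption: the abstract does not name its effect size measure;
    Cohen's d is used here only as a common choice.
    """
    n_a, n_b = len(scores_a), len(scores_b)
    # statistics.variance returns the sample variance (n - 1 denominator).
    pooled_var = (
        (n_a - 1) * variance(scores_a) + (n_b - 1) * variance(scores_b)
    ) / (n_a + n_b - 2)
    return (mean(scores_a) - mean(scores_b)) / math.sqrt(pooled_var)


# Made-up illustrative scores, not results from the paper.
method_scores = [2, 4, 6]
benchmark_scores = [1, 3, 5]
d = cohens_d(method_scores, benchmark_scores)
print(f"Cohen's d = {d:.2f}")
```

Under Cohen's conventional thresholds, values of d around 0.2, 0.5, and 0.8 are read as small, medium, and large effects respectively.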
Abstract: Soil erosion is a significant threat to the environment and to long-term land management around the world. Soil erosion accelerated by human activities causes extreme changes in terrestrial and aquatic ecosystems, and these changes have not been fully surveyed or predicted, for the present or the probable future, at field scale (30 m). Here, we estimate and predict rates of soil erosion by water (sheet and rill erosion) under three alternative (2.6, 4.5, and 8.5) Shared Socioeconomic Pathway and Representative Concentration Pathway (SSP-RCP) scenarios across the contiguous United States. The Field Scale Soil Erosion Model (FSSLM) estimates rely on a high-resolution (30 m) G2 erosion model integrated with satellite- and imagery-based estimates of land use and land cover (LULC), gauge observations of long-term precipitation, and scenarios from the Coupled Model Intercomparison Project Phase 6 (CMIP6). The baseline model (2020) estimates a soil erosion rate of 2.32 Mg ha$^{-1}$ yr$^{-1}$ with current agricultural conservation practices (CPs). Future scenarios with current CPs indicate an increase of 8% to 21% under different combinations of SSP-RCP climate and LULC change scenarios. The soil erosion forecast for 2050 suggests that all climate and LULC scenarios lead to either an increase in extreme events or a shift in the spatial location of extremes, largely from the southern to the eastern and northeastern regions of the United States.
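To make the reported range concrete, the following sketch converts the 8% to 21% increase into absolute rates. It assumes the percentages apply directly to the 2020 baseline of 2.32 Mg ha^-1 yr^-1, which the abstract implies but does not state explicitly. The helper name `projected_rate` is hypothetical.

```python
BASELINE_RATE = 2.32  # Mg ha^-1 yr^-1, 2020 baseline stated in the abstract


def projected_rate(baseline, pct_increase):
    """Apply a percentage increase to a baseline erosion rate."""
    return baseline * (1 + pct_increase / 100)


# Assumption: the 8% and 21% bounds are relative to the 2020 baseline.
low = projected_rate(BASELINE_RATE, 8)
high = projected_rate(BASELINE_RATE, 21)
print(f"Projected range: {low:.2f} to {high:.2f} Mg ha^-1 yr^-1")
```

Under that assumption, the future scenarios correspond to roughly 2.51 to 2.81 Mg ha^-1 yr^-1.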