Simona Reale

Benchmarking of a new data splitting method on volcanic eruption data

Oct 08, 2024

Simona Reale, Pietro Di Stasio, Francesco Mauro, Alessandro Sebastianelli, Paolo Gamba, Silvia Liberata Ullo

Figures 1 to 4 for Benchmarking of a new data splitting method on volcanic eruption data

Abstract: In this paper, a novel method for data splitting is presented: an iterative procedure divides the input dataset (volcanic eruption data, chosen as the use case) into two parts using a dissimilarity index calculated on the cumulative histograms of the two parts. The Cumulative Histogram Dissimilarity (CHD) index is introduced as part of the design. In the reported results, the proposed method achieves the best performance compared with both Random splitting and K-means splitting across different configurations, at the cost of a slightly higher number of training epochs. This indicates that the model can learn more deeply from the input dataset, which is attributable to the quality of the splitting. Each model was trained with early stopping as a safeguard against overfitting; the higher number of epochs under the proposed method shows that early stopping did not detect overfitting and, consequently, that the learning was optimal.

* To be submitted to IEEE IGARSS 2025
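
The abstract describes the splitting procedure only at a high level, so the following is a minimal Python sketch under stated assumptions, not the authors' implementation. It assumes that each sample is summarized by a single scalar feature, that the CHD index is the mean absolute gap between the normalized cumulative histograms of the two parts, and that the iteration is a simple swap search that lowers that gap. The names `cumulative_histogram`, `chd`, and `iterative_split`, along with all parameters, are hypothetical.

```python
import numpy as np


def cumulative_histogram(values, bin_edges):
    """Normalized cumulative histogram of `values` over fixed bin edges."""
    counts, _ = np.histogram(values, bins=bin_edges)
    total = counts.sum()
    if total == 0:
        return np.zeros(len(counts))
    return np.cumsum(counts) / total


def chd(part_a, part_b, bin_edges):
    """ASSUMED form of the index: mean absolute difference between the
    two cumulative histograms. The paper's exact definition may differ."""
    gap = cumulative_histogram(part_a, bin_edges) - cumulative_histogram(part_b, bin_edges)
    return float(np.mean(np.abs(gap)))


def iterative_split(values, test_fraction=0.2, n_bins=32, n_iter=2000, seed=0):
    """Start from a random split, then keep any train/test swap that lowers
    the dissimilarity (ASSUMED objective and search strategy)."""
    rng = np.random.default_rng(seed)
    values = np.asarray(values, dtype=float)
    n = len(values)
    n_test = max(1, int(round(test_fraction * n)))
    bin_edges = np.linspace(values.min(), values.max(), n_bins + 1)

    perm = rng.permutation(n)
    test_idx, train_idx = perm[:n_test].copy(), perm[n_test:].copy()
    initial = chd(values[train_idx], values[test_idx], bin_edges)
    best = initial

    for _ in range(n_iter):
        i = rng.integers(len(train_idx))
        j = rng.integers(len(test_idx))
        # Tentatively swap one training sample with one test sample.
        train_idx[i], test_idx[j] = test_idx[j], train_idx[i]
        score = chd(values[train_idx], values[test_idx], bin_edges)
        if score < best:
            best = score
        else:
            # Undo the swap when it does not reduce the dissimilarity.
            train_idx[i], test_idx[j] = test_idx[j], train_idx[i]

    return train_idx, test_idx, initial, best
```

Calling `iterative_split(features)` on a one-dimensional array returns the two index sets together with the dissimilarity before and after the search, so the effect of the iteration can be compared against a plain random split. For image data such as the volcanic eruption dataset, the scalar feature would have to be derived per image (for example a mean intensity), which is a further assumption of this sketch.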
