Claude Poletti

Toward Formal Data Set Verification for Building Effective Machine Learning Models

Aug 25, 2021

Jorge López, Maxime Labonne, Claude Poletti

Abstract: To train a machine learning model properly, data must be properly collected. One way to guarantee proper data collection is to verify that the collected data set holds certain properties, for example that it contains samples across the whole input space, or that it is balanced w.r.t. the different classes. We present a formal approach for verifying a set of arbitrarily stated properties over a data set. The approach relies on transforming the data set into a first-order logic formula, which can later be verified w.r.t. the different properties, also stated in the same logic. A prototype tool, which uses the z3 solver, has been developed; it takes as input a set of properties stated in a formal language and formally verifies a given data set w.r.t. the given set of properties. Preliminary experimental results show the feasibility and performance of the proposed approach, as well as its flexibility for expressing properties of interest.
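The two example properties named in the abstract (class balance and coverage of the input space) can be illustrated with a minimal sketch. This is not the paper's method: the paper encodes the data set as a first-order logic formula and checks it with the z3 solver, whereas the sketch below evaluates the properties directly in plain Python. The function names, the `tolerance` parameter, and the equal-width binning used for "coverage" are all assumptions made for illustration.

```python
from collections import Counter


def is_balanced(labels, tolerance=0.1):
    """Return True if every class share is within `tolerance` of the uniform share.

    Hypothetical definition of "balanced"; the abstract does not fix one.
    """
    if not labels:
        return False
    counts = Counter(labels)
    uniform_share = 1 / len(counts)
    return all(
        abs(count / len(labels) - uniform_share) <= tolerance
        for count in counts.values()
    )


def covers_range(values, low, high, bins):
    """Return True if each of `bins` equal-width intervals of [low, high] holds a sample.

    Hypothetical one-dimensional stand-in for "samples across the whole input space".
    """
    width = (high - low) / bins
    hit = set()
    for value in values:
        if low <= value <= high:
            # Clamp so that value == high falls into the last interval.
            hit.add(min(int((value - low) / width), bins - 1))
    return len(hit) == bins


if __name__ == "__main__":
    print(is_balanced(["a", "b", "a", "b"]))
    print(covers_range([0.1, 0.4, 0.6, 0.9], 0.0, 1.0, 4))
```

A solver-based encoding, as described in the abstract, would instead state such properties as logical formulas, which is what allows arbitrary user-defined properties to be checked by the same tool.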

* Preprint submitted to IC3K 2021

Short-Term Flow-Based Bandwidth Forecasting using Machine Learning

Dec 03, 2020

Maxime Labonne, Jorge López, Claude Poletti, Jean-Baptiste Munier

Abstract: This paper proposes a novel framework to predict traffic flows' bandwidth ahead of time. Modern network management systems share a common issue: the network situation evolves between the moment the decision is made and the moment when actions (countermeasures) are applied. This framework converts packets from real-life traffic into flows containing relevant features. Machine learning models, including Decision Tree, Random Forest, XGBoost, and Deep Neural Network, are trained on these data to predict the bandwidth at the next time instance for every flow. Predictions can be fed to the management system instead of the current flows' bandwidth, so that decisions are based on a more accurate network state. Experiments were performed on 981,774 flows and 15 different time windows (from 0.03s to 4s). They show that the Random Forest is the best performing and most reliable model, with a predictive performance consistently better than relying on the current bandwidth (+19.73% in mean absolute error and +18.00% in root mean square error). Experimental results indicate that this framework can help network management systems make more informed decisions using a predicted network state.
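The comparison described in the abstract, a model's prediction against simply reusing the current bandwidth, can be sketched as follows. This is an illustration only: the sample values are made up, the helper names are hypothetical, and the relative-improvement formula is an assumption, since the abstract does not state how its percentages were computed.

```python
import math


def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def improvement_pct(baseline_error, model_error):
    """Relative error reduction over the baseline, in percent (assumed formula)."""
    return 100 * (baseline_error - model_error) / baseline_error


# Made-up bandwidth values for four flows (arbitrary units).
current_bandwidth = [10, 12, 9, 15]   # baseline: assume the next value equals the current one
next_bandwidth = [12, 10, 12, 14]     # what was actually observed at the next time instance
model_prediction = [11, 11, 11, 14]   # hypothetical model output

baseline_mae = mae(next_bandwidth, current_bandwidth)
model_mae = mae(next_bandwidth, model_prediction)
mae_gain = improvement_pct(baseline_mae, model_mae)

if __name__ == "__main__":
    print(f"baseline MAE={baseline_mae}, model MAE={model_mae}, gain={mae_gain}%")
```

The same `improvement_pct` helper applies to `rmse`; a positive value means the model's error is lower than that of the current-bandwidth baseline.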

* 4 pages, 1 figure, 3 tables
