Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Daniel Urda

Enhancing web traffic attacks identification through ensemble methods and feature selection

Dec 21, 2024

Daniel Urda, Branly Martínez, Nuño Basurto, Meelis Kull, Ángel Arroyo, Álvaro Herrero

Abstract:Websites, as essential digital assets, are highly vulnerable to cyberattacks because of their high traffic volume and the significant impact of breaches. This study aims to enhance the identification of web traffic attacks by leveraging machine learning techniques. A methodology was proposed to extract relevant features from HTTP traces using the CSIC2010 v2 dataset, which simulates e-commerce web traffic. Ensemble methods, such as Random Forest and Extreme Gradient Boosting, were employed and compared against baseline classifiers, including k-nearest Neighbor, LASSO, and Support Vector Machines. The results demonstrate that the ensemble methods outperform baseline classifiers by approximately 20% in predictive accuracy, achieving an Area Under the ROC Curve (AUC) of 0.989. Feature selection methods such as Information Gain, LASSO, and Random Forest further enhance the robustness of these models. This study highlights the efficacy of ensemble models in improving attack detection while minimizing performance variability, offering a practical framework for securing web traffic in diverse application contexts.

Via

Access Paper or Ask Questions

Improving the Reliability of Network Intrusion Detection Systems through Dataset Integration

Dec 02, 2021

Roberto Magán-Carrión, Daniel Urda, Ignacio Díaz-Cano, Bernabé Dorronsoro

Figure 1 for Improving the Reliability of Network Intrusion Detection Systems through Dataset Integration

Figure 2 for Improving the Reliability of Network Intrusion Detection Systems through Dataset Integration

Figure 3 for Improving the Reliability of Network Intrusion Detection Systems through Dataset Integration

Figure 4 for Improving the Reliability of Network Intrusion Detection Systems through Dataset Integration

Abstract:This work presents Reliable-NIDS (R-NIDS), a novel methodology for Machine Learning (ML) based Network Intrusion Detection Systems (NIDSs) that allows ML models to work on integrated datasets, empowering the learning process with diverse information from different datasets. Therefore, R-NIDS targets the design of more robust models, that generalize better than traditional approaches. We also propose a new dataset, called UNK21. It is built from three of the most well-known network datasets (UGR'16, USNW-NB15 and NLS-KDD), each one gathered from its own network environment, with different features and classes, by using a data aggregation approach present in R-NIDS. Following R-NIDS, in this work we propose to build two well-known ML models (a linear and a non-linear one) based on the information of three of the most common datasets in the literature for NIDS evaluation, those integrated in UNK21. The results that the proposed methodology offers show how these two ML models trained as a NIDS solution could benefit from this approach, being able to generalize better when training on the newly proposed UNK21 dataset. Furthermore, these results are carefully analyzed with statistical tools that provide high confidence on our conclusions.

* Submitted to the IEEE Transactions on Emerging Topics in Computing journal

Via

Access Paper or Ask Questions

Advanced Machine Learning Techniques for Fake News (Online Disinformation) Detection: A Systematic Mapping Study

Dec 28, 2020

Michal Choras, Konstantinos Demestichas, Agata Gielczyk, Alvaro Herrero, Pawel Ksieniewicz, Konstantina Remoundou, Daniel Urda, Michal Wozniak

Figure 1 for Advanced Machine Learning Techniques for Fake News (Online Disinformation) Detection: A Systematic Mapping Study

Figure 2 for Advanced Machine Learning Techniques for Fake News (Online Disinformation) Detection: A Systematic Mapping Study

Figure 3 for Advanced Machine Learning Techniques for Fake News (Online Disinformation) Detection: A Systematic Mapping Study

Figure 4 for Advanced Machine Learning Techniques for Fake News (Online Disinformation) Detection: A Systematic Mapping Study

Abstract:Fake news has now grown into a big problem for societies and also a major challenge for people fighting disinformation. This phenomenon plagues democratic elections, reputations of individual persons or organizations, and has negatively impacted citizens, (e.g., during the COVID-19 pandemic in the US or Brazil). Hence, developing effective tools to fight this phenomenon by employing advanced Machine Learning (ML) methods poses a significant challenge. The following paper displays the present body of knowledge on the application of such intelligent tools in the fight against disinformation. It starts by showing the historical perspective and the current role of fake news in the information war. Proposed solutions based solely on the work of experts are analysed and the most important directions of the application of intelligent systems in the detection of misinformation sources are pointed out. Additionally, the paper presents some useful resources (mainly datasets useful when assessing ML solutions for fake news detection) and provides a short overview of the most important R&D projects related to this subject. The main purpose of this work is to analyse the current state of knowledge in detecting fake news; on the one hand to show possible solutions, and on the other hand to identify the main challenges and methodological gaps to motivate future research.

Via

Access Paper or Ask Questions