Abstract:This article puts forward the use of mutual information values to replicate the expertise of security professionals in selecting features for detecting web attacks. The goal is to enhance the effectiveness of web application firewalls (WAFs). Web applications are frequently vulnerable to various security threats, making WAFs essential for their protection. WAFs analyze HTTP traffic using rule-based approaches to identify known attack patterns and to detect and block potential malicious requests. However, a major challenge is the occurrence of false positives, which can lead to blocking legitimate traffic and impact the normal functioning of the application. The problem is addressed as an approach that combines supervised learning for feature selection with a semi-supervised learning scenario for training a One-Class SVM model. The experimental findings show that the model trained with features selected by the proposed algorithm outperformed the expert-based selection approach in terms of performance. Additionally, the results obtained by the traditional rule-based WAF ModSecurity, configured with a vanilla set of OWASP CRS rules, were also improved.