Abstract:Malware detection and analysis are active research subjects in cybersecurity over the last years. Indeed, the development of obfuscation techniques, as packing, for example, requires special attention to detect recent variants of malware. The usual detection methods do not necessarily provide tools to interpret the results. Therefore, we propose a model based on the transformation of binary files into grayscale image, which achieves an accuracy rate of 88%. Furthermore, the proposed model can determine if a sample is packed or encrypted with a precision of 85%. It allows us to analyze results and act appropriately. Also, by applying attention mechanisms on detection models, we have the possibility to identify which part of the files looks suspicious. This kind of tool should be very useful for data analysts, it compensates for the lack of interpretability of the common detection models, and it can help to understand why some malicious files are undetected.
Abstract:In this paper, we consider a model called CHARME (Conditional Heteroscedastic Autoregressive Mixture of Experts), a class of generalized mixture of nonlinear nonparametric AR-ARCH time series. Under certain Lipschitz-type conditions on the autoregressive and volatility functions, we prove that this model is stationary, ergodic and $\tau$-weakly dependent. These conditions are much weaker than those presented in the literature that treats this model. Moreover, this result forms the theoretical basis for deriving an asymptotic theory of the underlying (non)parametric estimation, which we present for this model. As an application, from the universal approximation property of neural networks (NN), possibly with deep architectures, we develop a learning theory for the NN-based autoregressive functions of the model, where the strong consistency and asymptotic normality of the considered estimator of the NN weights and biases are guaranteed under weak conditions.