Abstract: Auto-encoders (AEs) have the potential to be effective and generic tools for new physics searches at colliders, requiring few or no model-dependent assumptions. Hypothetical new physics signals can be treated as anomalies that deviate from the well-known background processes expected to describe the bulk of the dataset. We present a search formulated as an anomaly detection (AD) problem, using an AE to define a criterion for deciding on the physical nature of an event. In this work, we perform an AD search for manifestations of a dark version of the strong force using raw detector images, which are large and very sparse, without relying on any physics-based pre-processing or assumptions about the signals. We propose a dual-encoder design that can learn a compact latent space through conditioning. Across multiple AD metrics, we show a clear improvement over competitive baselines and prior approaches. This is the first time that an AE has been shown to exhibit excellent discrimination against multiple dark shower models, illustrating the suitability of this method as a performant, model-independent algorithm to deploy, e.g., in the trigger stage of LHC experiments such as ATLAS and CMS.
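The AD criterion described in this abstract can be illustrated with a minimal sketch: fit an auto-encoder on background-only data, then use the per-event reconstruction error as the anomaly score. This is not the dual-encoder design proposed in the paper. It assumes a linear auto-encoder (equivalent to PCA) on hypothetical toy data, and the function names, dimensions, and the 99% background quantile threshold are all illustrative choices.

```python
import numpy as np

def fit_linear_autoencoder(background, latent_dim):
    """Fit a linear auto-encoder (equivalent to PCA) on background-only data."""
    mean = background.mean(axis=0)
    # The leading right singular vectors span the best rank-k subspace.
    _, _, vt = np.linalg.svd(background - mean, full_matrices=False)
    return mean, vt[:latent_dim]

def anomaly_score(events, mean, components):
    """Per-event mean squared reconstruction error."""
    centered = events - mean
    latent = centered @ components.T      # encode into the latent space
    reconstructed = latent @ components   # decode back to input space
    return ((centered - reconstructed) ** 2).mean(axis=1)

rng = np.random.default_rng(0)

# Hypothetical toy data: background lies close to a 2D subspace of a 16D space,
# while the "signal" is not confined to that subspace.
basis = rng.normal(size=(2, 16))
background = rng.normal(size=(2000, 2)) @ basis + 0.05 * rng.normal(size=(2000, 16))
signal = rng.normal(size=(200, 16))

mean, components = fit_linear_autoencoder(background, latent_dim=2)
bkg_scores = anomaly_score(background, mean, components)
sig_scores = anomaly_score(signal, mean, components)

# Flag events whose score exceeds the 99% quantile of the background scores.
threshold = np.quantile(bkg_scores, 0.99)
signal_efficiency = (sig_scores > threshold).mean()
```

Because the threshold is set from background alone, no signal model enters the selection, which is the sense in which such a search is model-independent.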
Abstract: Signal-background classification is a central problem in High-Energy Physics and plays a major role in the discovery of new fundamental particles. A recent method, the Parametric Neural Network (pNN), leverages multiple signal mass hypotheses as an additional input feature to effectively replace a whole set of individual classifiers, each providing (in principle) the best response for a single mass hypothesis. In this work we aim to deepen the understanding of pNNs in light of real-world usage. We discovered several peculiarities of parametric networks and provide intuition, metrics, and guidelines for them. We further propose an alternative parametrization scheme, resulting in a new parametrized neural network architecture, the AffinePNN, along with many other generally applicable improvements. Finally, we extensively evaluate our models on the HEPMASS dataset, along with its imbalanced version (called HEPMASS-IMB), which we provide here for the first time to further validate our approach. Results are reported in terms of the impact of the proposed design decisions, classification performance, and interpolation capability.
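The basic pNN idea of feeding the mass hypothesis as an extra input feature can be sketched as follows. This is a minimal illustration of plain concatenation-based conditioning, not the AffinePNN scheme proposed in the paper. The mass grid, network sizes, random weights, and the choice to assign background events a mass sampled uniformly from the grid are assumptions made for the example.

```python
import numpy as np

def build_pnn_inputs(features, is_signal, signal_mass, mass_grid, rng):
    """Append a mass column; background events get a mass sampled from mass_grid."""
    sampled = rng.choice(mass_grid, size=len(features))
    mass = np.where(is_signal, signal_mass, sampled)
    return np.column_stack([features, mass])

def pnn_forward(inputs, w1, b1, w2, b2):
    """One-hidden-layer network: ReLU hidden layer, sigmoid output."""
    hidden = np.maximum(inputs @ w1 + b1, 0.0)
    logits = hidden @ w2 + b2
    return 1.0 / (1.0 + np.exp(-logits))

rng = np.random.default_rng(0)

# Hypothetical mass hypotheses, rescaled to [0, 1] so the mass column has a
# range comparable to the other features.
mass_grid = np.array([500.0, 750.0, 1000.0, 1250.0, 1500.0])
scaled_grid = mass_grid / mass_grid.max()

n_events, n_features = 8, 4
features = rng.normal(size=(n_events, n_features))
is_signal = rng.random(n_events) < 0.5
signal_mass = rng.choice(scaled_grid, size=n_events)  # used only where is_signal

inputs = build_pnn_inputs(features, is_signal, signal_mass, scaled_grid, rng)

# Untrained, randomly initialised weights: this only demonstrates the data flow.
w1 = 0.1 * rng.normal(size=(n_features + 1, 16))
b1 = np.zeros(16)
w2 = 0.1 * rng.normal(size=16)
b2 = 0.0

# At evaluation time, all events are scored under a single mass hypothesis.
hypothesis = np.full(n_events, 1000.0 / mass_grid.max())
scores = pnn_forward(np.column_stack([features, hypothesis]), w1, b1, w2, b2)
```

Scoring the same events under different values of the mass column is what lets one network stand in for a set of per-mass classifiers, and evaluating at masses absent from training probes the interpolation capability mentioned in the abstract.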