Abstract:Anomaly detection relies on designing a score to determine whether a particular event is uncharacteristic of a given background distribution. One way to define a score is to use autoencoders, which rely on the ability to reconstruct certain types of data (background) but not others (signals). In this paper, we study some challenges associated with variational autoencoders, such as the dependence on hyperparameters and the metric used, in the context of anomalous signal (top and $W$) jets in a QCD background. We find that the hyperparameter choices strongly affect the network performance and that the optimal parameters for one signal are non-optimal for another. In exploring the networks, we uncover a connection between the latent space of a variational autoencoder trained using mean-squared-error and the optimal transport distances within the dataset. We then show that optimal transport distances to representative events in the background dataset can be used directly for anomaly detection, with performance comparable to the autoencoders. Whether using autoencoders or optimal transport distances for anomaly detection, we find that the choices that best represent the background are not necessarily best for signal identification. These challenges with unsupervised anomaly detection bolster the case for additional exploration of semi-supervised or alternative approaches.
Abstract:Modern machine learning techniques, such as convolutional, recurrent and recursive neural networks, have shown promise for jet substructure at the Large Hadron Collider. For example, they have demonstrated effectiveness at boosted top or W boson identification or for quark/gluon discrimination. We explore these methods for the purpose of classifying jets according to their electric charge. We find that both neural networks that incorporate distance within the jet as an input and boosted decision trees including radial distance information can provide significant improvement in jet charge extraction over current methods. Specifically, convolutional, recurrent, and recursive networks can provide the largest improvement over traditional methods, in part by effectively utilizing distance within the jet or clustering history. The advantages of using a fixed-size input representation (as with the CNN) or a small input representation (as with the RNN) suggest that both convolutional and recurrent networks will be essential to the future of modern machine learning at colliders.