Abstract:Bumblebee is a foundation model for particle physics discovery, inspired by BERT. By removing positional encodings and embedding particle 4-vectors, Bumblebee captures both generator- and reconstruction-level information while ensuring sequence-order invariance. Pre-trained on a masked task, it improves dileptonic top quark reconstruction resolution by 10-20% and excels in downstream tasks, including toponium discrimination (AUROC 0.877) and initial state classification (AUROC 0.625). The flexibility of Bumblebee makes it suitable for a wide range of particle physics applications, especially the discovery of new particles.
Abstract:Interpretable graph learning is in need as many scientific applications depend on learning models to collect insights from graph-structured data. Previous works mostly focused on using post-hoc approaches to interpret a pre-trained model (graph neural network models in particular). They argue against inherently interpretable models because good interpretation of these models is often at the cost of their prediction accuracy. And, the widely used attention mechanism for inherent interpretation often fails to provide faithful interpretation in graph learning tasks. In this work, we address both issues by proposing Graph Stochastic Attention (GSAT), an attention mechanism derived from the information bottleneck principle. GSAT leverages stochastic attention to block the information from the task-irrelevant graph components while learning stochasticity-reduced attention to select the task-relevant subgraphs for interpretation. GSAT can also apply to fine-tuning and interpreting pre-trained models via stochastic attention mechanism. Extensive experiments on eight datasets show that GSAT outperforms the state-of-the-art methods by up to 20%$\uparrow$ in interpretation AUC and 5%$\uparrow$ in prediction accuracy.
Abstract:Using detailed simulations of calorimeter showers as training data, we investigate the use of deep learning algorithms for the simulation and reconstruction of particles produced in high-energy physics collisions. We train neural networks on shower data at the calorimeter-cell level, and show significant improvements for simulation and reconstruction when using these networks compared to methods which rely on currently-used state-of-the-art algorithms. We define two models: an end-to-end reconstruction network which performs simultaneous particle identification and energy regression of particles when given calorimeter shower data, and a generative network which can provide reasonable modeling of calorimeter showers for different particle types at specified angles and energies. We investigate the optimization of our models with hyperparameter scans. Furthermore, we demonstrate the applicability of the reconstruction model to shower inputs from other detector geometries, specifically ATLAS-like and CMS-like geometries. These networks can serve as fast and computationally light methods for particle shower simulation and reconstruction for current and future experiments at particle colliders.