Abstract: Training features used to analyse physical processes are often highly correlated, and determining which ones are most important for the classification is a non-trivial task. For the use case of a search for a top-quark pair produced in association with a Higgs boson decaying to bottom quarks at the LHC, we compare feature ranking methods for a classification BDT. Ranking methods such as the BDT Selection Frequency, commonly used in High Energy Physics, and the Permutational Performance are compared with the computationally expensive Iterative Addition and Iterative Removal procedures; the latter was found to be the most performant.
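As an illustration of the permutation-based ranking idea named above, the following is a minimal sketch and not the implementation used in the study. It assumes that a feature's importance is measured as the mean drop in accuracy when that feature's column is shuffled; the exact definition of the Permutational Performance may differ. The classifier is a hypothetical fixed threshold rule standing in for a trained BDT, and the two-feature data set is a toy invented for the example.

```python
import random


def accuracy(model, X, y):
    """Fraction of rows for which model(row) equals the true label."""
    return sum(model(row) == label for row, label in zip(X, y)) / len(y)


def permutation_importance(model, X, y, n_repeats=10, seed=0):
    """Mean drop in accuracy when a single feature column is shuffled.

    Assumption: this is one common definition of permutation importance,
    used here for illustration only.
    """
    rng = random.Random(seed)
    baseline = accuracy(model, X, y)
    drops = []
    for j in range(len(X[0])):
        total = 0.0
        for _ in range(n_repeats):
            # Shuffle column j only, leaving all other features untouched.
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            total += baseline - accuracy(model, X_perm, y)
        drops.append(total / n_repeats)
    return drops


# Toy data (hypothetical): feature 0 fixes the label, feature 1 is pure noise.
data_rng = random.Random(42)
X = [[data_rng.random(), data_rng.random()] for _ in range(200)]
y = [int(row[0] > 0.5) for row in X]


def toy_model(row):
    """Stand-in for a trained BDT: a threshold cut on feature 0."""
    return int(row[0] > 0.5)


importances = permutation_importance(toy_model, X, y)
# Rank feature indices from most to least important.
ranking = sorted(range(len(importances)), key=lambda j: -importances[j])
print(ranking, importances)
```

In this toy setting, shuffling the noise feature cannot change any prediction, so its importance is zero, while shuffling the informative feature degrades the accuracy. With strongly correlated features, as in the use case above, a single-feature shuffle can underestimate importance, which is one motivation for iterative procedures that retrain after adding or removing a feature.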
Abstract: We describe the construction of end-to-end jet image classifiers that use low-level detector data from high-fidelity simulated CMS Open Data to discriminate quark- vs. gluon-initiated jets. We highlight the importance of precise spatial information and demonstrate performance competitive with existing state-of-the-art jet classifiers. We further generalize the end-to-end approach to event-level classification of quark vs. gluon di-jet QCD events. We compare the fully end-to-end approach with one using hand-engineered features and demonstrate that the end-to-end algorithm is robust against the effects of underlying event and pile-up.