Steven Farrell

FAIR Universe HiggsML Uncertainty Challenge Competition

Oct 03, 2024

Wahid Bhimji, Paolo Calafiura, Ragansu Chakkappai, Yuan-Tang Chou, Sascha Diefenbacher, Jordan Dudley, Steven Farrell, Aishik Ghosh, Isabelle Guyon, Chris Harris (+12 more)

Abstract:The FAIR Universe -- HiggsML Uncertainty Challenge focuses on measuring the physics properties of elementary particles with imperfect simulators due to differences in modelling systematic errors. Additionally, the challenge is leveraging a large-compute-scale AI platform for sharing datasets, training models, and hosting machine learning competitions. Our challenge brings together the physics and machine learning communities to advance our understanding and methodologies in handling systematic (epistemic) uncertainties within AI techniques.

* Whitepaper for the FAIR Universe HiggsML Uncertainty Challenge Competition, available at https://fair-universe.lbl.gov

Hierarchical Graph Neural Networks for Particle Track Reconstruction

Mar 03, 2023

Ryan Liu, Paolo Calafiura, Steven Farrell, Xiangyang Ju, Daniel Thomas Murnane, Tuan Minh Pham

Abstract:We introduce a novel variant of GNN for particle tracking called Hierarchical Graph Neural Network (HGNN). The architecture creates a set of higher-level representations which correspond to tracks and assigns spacepoints to these tracks, allowing disconnected spacepoints to be assigned to the same track, as well as multiple tracks to share the same spacepoint. We propose a novel learnable pooling algorithm called GMPool to generate these higher-level representations called "super-nodes", as well as a new loss function designed for tracking problems and HGNN specifically. On a standard tracking problem, we show that, compared with previous ML-based tracking algorithms, the HGNN has better tracking efficiency performance, better robustness against inefficient input graphs, and better convergence compared with traditional GNNs.

* 7 pages, 5 figures, submitted to the 21st International Workshop on Advanced Computing and Analysis Techniques in Physics Research

MLPerf HPC: A Holistic Benchmark Suite for Scientific Machine Learning on HPC Systems

Oct 26, 2021

Steven Farrell, Murali Emani, Jacob Balma, Lukas Drescher, Aleksandr Drozd, Andreas Fink, Geoffrey Fox, David Kanter, Thorsten Kurth, Peter Mattson (+33 more)

Abstract:Scientific communities are increasingly adopting machine learning and deep learning models in their applications to accelerate scientific insights. High performance computing systems are pushing the frontiers of performance with a rich diversity of hardware resources and massive scale-out capabilities. There is a critical need to understand fair and effective benchmarking of machine learning applications that are representative of real-world scientific use cases. MLPerf is a community-driven standard to benchmark machine learning workloads, focusing on end-to-end performance metrics. In this paper, we introduce MLPerf HPC, a benchmark suite of large-scale scientific machine learning training applications driven by the MLCommons Association. We present the results from the first submission round, including a diverse set of some of the world's largest HPC systems. We develop a systematic framework for their joint analysis and compare them in terms of data staging, algorithmic convergence, and compute performance. As a result, we gain a quantitative understanding of optimizations on different subsystems such as staging and on-node loading of data, compute-unit utilization, and communication scheduling, enabling overall $>10 \times$ (end-to-end) performance improvements through system scaling. Notably, our analysis shows a scale-dependent interplay between the dataset size, a system's memory hierarchy, and training convergence that underlines the importance of near-compute storage. To overcome the data-parallel scalability challenge at large batch sizes, we discuss specific learning techniques and hybrid data-and-model parallelism that are effective on large systems. We conclude by characterizing each benchmark with respect to low-level memory, I/O, and network behavior to parameterize extended roofline performance models in future rounds.

The Tracking Machine Learning challenge : Throughput phase

May 14, 2021

Sabrina Amrouche, Laurent Basara, Paolo Calafiura, Dmitry Emeliyanov, Victor Estrade, Steven Farrell, Cécile Germain, Vladimir Vava Gligorov, Tobias Golling, Sergey Gorbunov (+11 more)

Abstract:This paper reports on the second "Throughput" phase of the Tracking Machine Learning (TrackML) challenge on the Codalab platform. As in the first "Accuracy" phase, the participants had to solve a difficult experimental problem linked to accurately tracking the trajectories of particles such as those created at the Large Hadron Collider (LHC): given O($10^5$) points, the participants had to connect them into O($10^4$) individual groups that represent the particle trajectories, which are approximately helical. While in the first phase only the accuracy mattered, the goal of this second phase was a compromise between the accuracy and the speed of inference. Both were measured on the Codalab platform, where the participants had to upload their software. The best three participants had solutions with good accuracy and speed an order of magnitude faster than the state of the art when the challenge was designed. Although the core algorithms were less diverse than in the first phase, a diversity of techniques has been used and is described in this paper. The performance of the algorithms is analysed in depth and lessons are derived.

* submitted to Computing and Software for Big Science

Hierarchical Roofline Performance Analysis for Deep Learning Applications

Sep 22, 2020

Yunsong Wang, Charlene Yang, Steven Farrell, Thorsten Kurth, Samuel Williams

Abstract:This paper presents a practical methodology for collecting the performance data necessary to conduct hierarchical Roofline analysis on NVIDIA GPUs. It discusses the extension of the Empirical Roofline Toolkit to support a broader range of data precisions and Tensor Cores, and introduces an Nsight Compute-based method to accurately collect application performance information. This methodology allows for automated machine characterization and application characterization for Roofline analysis across the entire memory hierarchy on NVIDIA GPUs, and it is validated by a complex deep learning application used for climate image segmentation. We use two versions of the code, in TensorFlow and PyTorch respectively, to demonstrate the use and effectiveness of this methodology. We highlight how the application utilizes the compute and memory capabilities on the GPU and how the implementation and performance differ between the two deep learning frameworks.

* 9 pages
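
As a rough illustration of what a hierarchical Roofline analysis computes, the sketch below evaluates the standard Roofline bound, min(peak compute, arithmetic intensity × bandwidth), separately for each level of a memory hierarchy. It is not the paper's tooling: the `MemoryLevel` class, the `roofline_bounds` helper, and all bandwidth, peak, and byte-count numbers are hypothetical values chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class MemoryLevel:
    name: str
    bandwidth_gbs: float  # peak bandwidth in GB/s (hypothetical value)


def roofline_bounds(flops, bytes_per_level, levels, peak_gflops):
    """Return {level name: (arithmetic intensity, attainable GFLOP/s)}.

    Arithmetic intensity is FLOPs per byte moved at that level; the
    attainable rate is the classic Roofline bound for that level.
    """
    bounds = {}
    for level in levels:
        ai = flops / bytes_per_level[level.name]  # FLOPs per byte
        attainable = min(peak_gflops, ai * level.bandwidth_gbs)
        bounds[level.name] = (ai, attainable)
    return bounds


# Hypothetical machine and kernel numbers, for illustration only.
levels = [
    MemoryLevel("L1", 14000.0),
    MemoryLevel("L2", 2500.0),
    MemoryLevel("HBM", 900.0),
]
kernel_bytes = {"L1": 4.0e9, "L2": 2.0e9, "HBM": 1.0e9}
bounds = roofline_bounds(
    flops=1.0e10,
    bytes_per_level=kernel_bytes,
    levels=levels,
    peak_gflops=15000.0,
)

for name, (ai, gflops) in bounds.items():
    print(f"{name}: AI = {ai:.1f} FLOP/byte, bound = {gflops:.0f} GFLOP/s")
```

With these made-up numbers the kernel is limited by HBM bandwidth, since that level gives the lowest attainable rate; in practice the byte counts per level would come from profiler measurements.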

Time-Based Roofline for Deep Learning Performance Analysis

Sep 22, 2020

Yunsong Wang, Charlene Yang, Steven Farrell, Yan Zhang, Thorsten Kurth, Samuel Williams

Abstract:Deep learning applications are usually very compute-intensive and require a long run time for training and inference. This has been tackled by researchers from both hardware and software sides, and in this paper, we propose a Roofline-based approach to performance analysis to facilitate the optimization of these applications. This approach is an extension of the Roofline model widely used in traditional high-performance computing applications, and it incorporates both compute/bandwidth complexity and run time in its formulae to provide insights into deep learning-specific characteristics. We take two sets of representative kernels, 2D convolution and long short-term memory, to validate and demonstrate the use of this new approach, and investigate how arithmetic intensity, cache locality, auto-tuning, kernel launch overhead, and Tensor Core usage can affect performance. Compared to the common ad-hoc approach, this study helps form a more systematic way to analyze code performance and identify optimization opportunities for deep learning applications.

* 9 pages
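
One common way to bring run time into a Roofline-style analysis is to restate the bound as a minimum execution time: a kernel cannot finish faster than its compute time or its data-movement time, whichever is larger. The sketch below shows only that generic restatement; it is an assumption for illustration and not necessarily the formulation used in the paper. The function name and all numeric values are hypothetical.

```python
import math


def roofline_time_bound(flops, bytes_moved, peak_flops_per_s, bandwidth_bytes_per_s):
    """Lower bound on run time from the Roofline model, in seconds.

    Returns (bound_seconds, limiter), where limiter is "compute" or
    "memory" depending on which term dominates.
    """
    compute_time = flops / peak_flops_per_s
    memory_time = bytes_moved / bandwidth_bytes_per_s
    limiter = "compute" if compute_time >= memory_time else "memory"
    return max(compute_time, memory_time), limiter


# Hypothetical kernel: 2 TFLOP of work and 90 GB of traffic on a device
# with 10 TFLOP/s peak compute and 900 GB/s memory bandwidth.
bound_s, limiter = roofline_time_bound(
    flops=2.0e12,
    bytes_moved=9.0e10,
    peak_flops_per_s=1.0e13,
    bandwidth_bytes_per_s=9.0e11,
)

measured_s = 0.5  # hypothetical measured run time
efficiency = bound_s / measured_s  # fraction of the bound actually achieved
print(f"bound = {bound_s:.2f} s ({limiter}-bound), efficiency = {efficiency:.0%}")
```

Comparing a measured run time against this bound gives a quick estimate of how much headroom remains before accounting for effects such as kernel launch overhead.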

Track Seeding and Labelling with Embedded-space Graph Neural Networks

Jun 30, 2020

Nicholas Choma, Daniel Murnane, Xiangyang Ju, Paolo Calafiura, Sean Conlon, Steven Farrell, Prabhat, Giuseppe Cerati, Lindsey Gray, Thomas Klijnsma (+9 more)

Abstract:To address the unprecedented scale of HL-LHC data, the Exa.TrkX project is investigating a variety of machine learning approaches to particle track reconstruction. The most promising of these solutions, graph neural networks (GNN), process the event as a graph that connects track measurements (detector hits corresponding to nodes) with candidate line segments between the hits (corresponding to edges). Detector information can be associated with nodes and edges, enabling a GNN to propagate the embedded parameters around the graph and predict node-, edge- and graph-level observables. Previously, message-passing GNNs have shown success in predicting doublet likelihood, and here we report updates on the state-of-the-art architectures for this task. In addition, the Exa.TrkX project has investigated innovations in both graph construction and embedded representations, in an effort to achieve fully learned end-to-end track finding. Hence, we present a suite of extensions to the original model, with encouraging results for hitgraph classification. In addition, we explore increased performance by constructing graphs from learned representations which contain non-linear metric structure, allowing for efficient clustering and neighborhood queries of data points. We demonstrate how this framework fits in with both traditional clustering pipelines and GNN approaches. The embedded graphs feed into high-accuracy doublet and triplet classifiers, or can be used as an end-to-end track classifier by clustering in an embedded space. A set of post-processing methods improves performance with knowledge of the detector physics. Finally, we present numerical results on the TrackML particle tracking challenge dataset, where our framework shows favorable results in both seeding and track finding.

* Proceedings submission in Connecting the Dots Workshop 2020, 10 pages
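
The idea of constructing a graph from an embedded space via neighborhood queries can be sketched as follows: hits whose embeddings lie within a chosen radius of each other are joined by a candidate edge. This is a minimal brute-force illustration, not the Exa.TrkX implementation; the `build_radius_graph` helper is hypothetical, and the hand-written 2D points stand in for the output of a learned embedding network.

```python
import numpy as np


def build_radius_graph(embeddings, radius):
    """Connect every pair of points closer than `radius` in embedding space.

    Returns an integer array of shape (2, num_edges) with each undirected
    edge listed once as (source, destination), source < destination.
    Uses an O(N^2) distance matrix, so it only suits small examples.
    """
    diffs = embeddings[:, None, :] - embeddings[None, :, :]
    dist = np.linalg.norm(diffs, axis=-1)
    # Keep the strict upper triangle to drop self-loops and duplicates.
    src, dst = np.nonzero(np.triu(dist < radius, k=1))
    return np.stack([src, dst], axis=0)


# Stand-in embeddings: two well-separated clusters, as if two tracks
# had been mapped to distinct regions of the embedded space.
points = np.array([
    [0.0, 0.0],
    [0.1, 0.0],
    [0.0, 0.1],
    [5.0, 5.0],
    [5.1, 5.0],
])
edges = build_radius_graph(points, radius=0.5)
print(edges)
```

Edges appear only within each cluster, so the resulting graph could be passed to an edge classifier or clustered directly. For realistic event sizes a spatial index or GPU nearest-neighbor search would replace the dense distance matrix.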
