Matthew T. Dearing

Angela

Generative AI Uses and Risks for Knowledge Workers in a Science Organization

Jan 27, 2025

Kelly B. Wagman, Matthew T. Dearing, Marshini Chetty

Abstract: Generative AI could enhance scientific discovery by supporting knowledge workers in science organizations. However, the real-world applications and perceived concerns of generative AI use in these organizations are uncertain. In this paper, we report on a collaborative study with a US national laboratory with employees spanning Science and Operations about their use of generative AI tools. We surveyed 66 employees, interviewed a subset (N=22), and measured early adoption of an internal generative AI interface called Argo lab-wide. We have four findings: (1) Argo usage data shows small but increasing use by Science and Operations employees; Common current and envisioned use cases for generative AI in this context conceptually fall into either a (2) copilot or (3) workflow agent modality; and (4) Concerns include sensitive data security, academic publishing, and job impacts. Based on our findings, we make recommendations for generative AI use in science and other organizations.

* CHI Conference on Human Factors in Computing Systems (CHI '25)

LASSI: An LLM-based Automated Self-Correcting Pipeline for Translating Parallel Scientific Codes

Jun 30, 2024

Matthew T. Dearing, Yiheng Tao, Xingfu Wu, Zhiling Lan, Valerie Taylor

Abstract: This paper addresses the problem of providing a novel approach to sourcing significant training data for LLMs focused on science and engineering. In particular, a crucial challenge is sourcing parallel scientific codes in the ranges of millions to billions of codes. To tackle this problem, we propose an automated pipeline framework, called LASSI, designed to translate between parallel programming languages by bootstrapping existing closed- or open-source LLMs. LASSI incorporates autonomous enhancement through self-correcting loops where errors encountered during compilation and execution of generated code are fed back to the LLM through guided prompting for debugging and refactoring. We highlight the bi-directional translation of existing GPU benchmarks between OpenMP target offload and CUDA to validate LASSI. The results of evaluating LASSI with different application codes across four LLMs demonstrate the effectiveness of LASSI for generating executable parallel codes, with 80% of OpenMP to CUDA translations and 85% of CUDA to OpenMP translations producing the expected output. We also observe approximately 78% of OpenMP to CUDA translations and 62% of CUDA to OpenMP translations execute within 10% of or at a faster runtime than the original benchmark code in the same language.
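
The self-correcting loop described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: LASSI compiles and runs OpenMP and CUDA codes, whereas here Python's built-in `compile` and `exec` stand in for the compile-and-execute step, and `toy_llm`, `try_run`, and `self_correct` are hypothetical names introduced only for illustration.

```python
from typing import Callable, Optional, Tuple


def try_run(code: str) -> Tuple[bool, str]:
    """Stand-in for 'compile and execute': returns (ok, output or error text)."""
    namespace: dict = {}
    try:
        exec(compile(code, "<generated>", "exec"), namespace)
    except Exception as exc:  # SyntaxError and runtime errors both land here
        return False, f"{type(exc).__name__}: {exc}"
    return True, str(namespace.get("result"))


def self_correct(
    llm: Callable[[str], str],
    prompt: str,
    expected: str,
    max_rounds: int = 3,
) -> Tuple[Optional[str], int]:
    """Ask the model for code; feed any error or wrong output back to it."""
    message = prompt
    for round_no in range(1, max_rounds + 1):
        code = llm(message)
        ok, detail = try_run(code)
        if ok and detail == expected:
            return code, round_no
        problem = detail if not ok else f"unexpected output {detail!r}"
        # Guided prompt: original task plus the failed attempt and its error.
        message = (
            f"{prompt}\n\nPrevious attempt:\n{code}\n\n"
            f"Problem: {problem}\nPlease fix the code."
        )
    return None, max_rounds


def toy_llm(message: str) -> str:
    """Hypothetical model: wrong on the first try, right once it sees feedback."""
    if "Problem:" in message:
        return "result = sum(range(5))"
    return "result = sum(range(5)"  # missing parenthesis, fails to compile


code, rounds = self_correct(toy_llm, "Sum the integers 0 through 4.", "10")
print(rounds, code)
```

In this toy run the first attempt fails to compile, the error text is appended to the prompt, and the second attempt produces the expected output. A real pipeline would replace `toy_llm` with a model call and `try_run` with an actual compiler and benchmark run.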

Analyzing the Performance of Graph Neural Networks with Pipe Parallelism

Dec 20, 2020

Matthew T. Dearing, Xiaoyan Wang

Abstract: Many interesting datasets ubiquitous in machine learning and deep learning can be described via graphs. As the scale and complexity of graph-structured datasets increase, such as in expansive social networks, protein folding, chemical interaction networks, and material phase transitions, improving the efficiency of the machine learning techniques applied to these is crucial. In this study, we focus on Graph Neural Networks (GNN), which have found great success in tasks such as node or edge classification and link prediction. However, standard GNN models have scaling limits due to necessary recursive calculations performed through dense graph relationships that lead to memory and runtime bottlenecks. While new approaches for processing larger networks are needed to advance graph techniques, and several have been proposed, we study how GNNs could be parallelized using existing tools and frameworks that are already known to be successful in the deep learning community. In particular, we investigate applying pipeline parallelism to GNN models with GPipe, introduced by Google in 2018.

* 9 pages, 4 figures
