René Just

Morello: Compiling Fast Neural Networks with Dynamic Programming and Spatial Compression

May 03, 2025

Samuel J. Kaufman, René Just, Rastislav Bodik

Abstract: High-throughput neural network inference requires coordinating many optimization decisions, including parallel tiling, microkernel selection, and data layout. The product of these decisions forms a search space of programs which is typically intractably large. Existing approaches (e.g., auto-schedulers) often address this problem by sampling this space heuristically. In contrast, we introduce a dynamic-programming-based approach to explore more of the search space by iteratively decomposing large program specifications into smaller specifications reachable from a set of rewrites, then composing a final program from each rewrite that minimizes an affine cost model. To reduce memory requirements, we employ a novel memoization table representation, which indexes specifications by coordinates in $\mathbb{Z}_{\geq 0}$ and compresses identical, adjacent solutions. This approach can visit a much larger set of programs than prior work. To evaluate the approach, we developed Morello, a compiler which lowers specifications roughly equivalent to a few-node XLA computation graph to x86. Notably, we found that an affine cost model is sufficient to surface high-throughput programs. For example, Morello synthesized a collection of matrix multiplication benchmarks targeting a Zen 1 CPU, including a 1x2048x16384, bfloat16-to-float32 vector-matrix multiply, which was integrated into Google's gemma.cpp.

* 13 pages, 2 figures
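
The Morello abstract describes memoized decomposition of specifications followed by compression of identical, adjacent table entries. The toy sketch below illustrates that general idea only; it is not Morello's algorithm or data structure. It assumes a one-dimensional specification (a size `n`), hypothetical microkernel sizes and costs in `KERNEL_COST`, a hypothetical constant `LOOP_OVERHEAD`, and a simple additive cost standing in for the affine cost model.

```python
from itertools import groupby

# Hypothetical microkernel sizes and per-call costs (illustrative numbers only).
KERNEL_COST = {1: 5, 4: 8, 8: 10}
LOOP_OVERHEAD = 1  # hypothetical constant cost added whenever a spec is split


def solve(max_n):
    """Fill a memo table mapping each size n in 1..max_n to (cost, first_kernel)."""
    table = {}
    for n in range(1, max_n + 1):
        candidates = []
        for k, kernel_cost in KERNEL_COST.items():
            if k == n:
                # The kernel covers the whole spec: no further decomposition.
                candidates.append((kernel_cost, k))
            elif k < n:
                # Rewrite: peel off k elements, then reuse the memoized remainder.
                cost = kernel_cost + LOOP_OVERHEAD + table[n - k][0]
                candidates.append((cost, k))
        # Lowest cost wins; ties go to the larger kernel.
        table[n] = min(candidates, key=lambda c: (c[0], -c[1]))
    return table


def compress(table):
    """Run-length encode adjacent sizes that chose the same first kernel."""
    runs = []
    for kernel, group in groupby(sorted(table), key=lambda n: table[n][1]):
        sizes = list(group)
        runs.append((sizes[0], sizes[-1], kernel))
    return runs


table = solve(16)
runs = compress(table)
print(table[16])
print(runs)
```

With these made-up costs, the 16 per-size decisions collapse into three runs (sizes 1 to 3, 4 to 7, and 8 to 16), which is the kind of saving that compressing adjacent identical solutions is meant to provide. This sketch compresses only the chosen kernel, not the per-size cost.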

AI-Assisted Assessment of Coding Practices in Modern Code Review

May 22, 2024

Manushree Vijayvergiya, Małgorzata Salawa, Ivan Budiselić, Dan Zheng, Pascal Lamblin, Marko Ivanković, Juanjo Carin, Mateusz Lewko, Jovan Andonov, Goran Petrović(+3 more)

Abstract: Modern code review is a process in which an incremental code contribution made by a code author is reviewed by one or more peers before it is committed to the version control system. An important element of modern code review is verifying that code contributions adhere to best practices. While some of these best practices can be automatically verified, verifying others is commonly left to human reviewers. This paper reports on the development, deployment, and evaluation of AutoCommenter, a system backed by a large language model that automatically learns and enforces coding best practices. We implemented AutoCommenter for four programming languages (C++, Java, Python, and Go) and evaluated its performance and adoption in a large industrial setting. Our evaluation shows that an end-to-end system for learning and enforcing coding best practices is feasible and has a positive impact on the developer workflow. Additionally, this paper reports on the challenges associated with deploying such a system to tens of thousands of developers and the corresponding lessons learned.

* To appear at the ACM International Conference on AI-Powered Software (AIware '24)

BLIP: Facilitating the Exploration of Undesirable Consequences of Digital Technologies

May 10, 2024

Rock Yuren Pang, Sebastin Santy, René Just, Katharina Reinecke

Abstract: Digital technologies have positively transformed society, but they have also led to undesirable consequences not anticipated at the time of design or development. We posit that insights into past undesirable consequences can help researchers and practitioners gain awareness and anticipate potential adverse effects. To test this assumption, we introduce BLIP, a system that extracts real-world undesirable consequences of technology from online articles, summarizes and categorizes them, and presents them in an interactive, web-based interface. In two user studies with 15 researchers in various computer science disciplines, we found that BLIP substantially increased the number and diversity of undesirable consequences they could list in comparison to relying on prior knowledge or searching online. Moreover, BLIP helped them identify undesirable consequences relevant to their ongoing projects, made them aware of undesirable consequences they "had never considered," and inspired them to reflect on their own experiences with technology.

* To appear in the Proceedings of the CHI Conference on Human Factors in Computing Systems (CHI '24), May 11–16, 2024, Honolulu, HI, USA

rTisane: Externalizing conceptual models for data analysis increases engagement with domain knowledge and improves statistical model quality

Oct 25, 2023

Eunice Jun, Edward Misback, Jeffrey Heer, René Just

Abstract: Statistical models should accurately reflect analysts' domain knowledge about variables and their relationships. While recent tools let analysts express these assumptions and use them to produce a resulting statistical model, it remains unclear what analysts want to express and how externalization impacts statistical model quality. This paper addresses these gaps. We first conduct an exploratory study of analysts using a domain-specific language (DSL) to express conceptual models. We observe a preference for detailing how variables relate and a desire to allow, and then later resolve, ambiguity in their conceptual models. We leverage these findings to develop rTisane, a DSL for expressing conceptual models augmented with an interactive disambiguation process. In a controlled evaluation, we find that rTisane's DSL helps analysts engage more deeply with and accurately externalize their assumptions. rTisane also leads to statistical models that match analysts' assumptions, maintain analysis intent, and better fit the data.

Repairing Brain-Computer Interfaces with Fault-Based Data Acquisition

Mar 20, 2022

Cailin Winston, Caleb Winston, Chloe N Winston, Claris Winston, Cleah Winston, Rajesh PN Rao, René Just

Abstract: Brain-computer interfaces (BCIs) decode recorded neural signals from the brain and/or stimulate the brain with encoded neural signals. BCIs span both hardware and software and have a wide range of applications in restorative medicine, from restoring movement through prostheses and robotic limbs to restoring sensation and communication through spellers. BCIs also have applications in diagnostic medicine, e.g., providing clinicians with data for detecting seizures, sleep patterns, or emotions. Despite their promise, BCIs have not yet been adopted for long-term, day-to-day use because of challenges related to reliability and robustness, which are needed for safe operation in all scenarios. Ensuring safe operation currently requires hours of manual data collection and recalibration, involving both patients and clinicians. However, data collection is not targeted at eliminating specific faults in a BCI. This paper presents a new methodology for characterizing, detecting, and localizing faults in BCIs. Specifically, it proposes partial test oracles as a method for detecting faults and slice functions as a method for localizing faults to characteristic patterns in the input data or relevant tasks performed by the user. Through targeted data acquisition and retraining, the proposed methodology improves the correctness of BCIs. We evaluated the proposed methodology on five BCI applications. The results show that the proposed methodology (1) precisely localizes faults and (2) can significantly reduce the frequency of faults through retraining based on targeted, fault-based data acquisition. These results suggest that the proposed methodology is a promising step towards repairing faulty BCIs.

* Accepted at International Conference on Software Engineering (ICSE-2022)

Tisane: Authoring Statistical Models via Formal Reasoning from Conceptual and Data Relationships

Jan 07, 2022

Eunice Jun, Audrey Seo, Jeffrey Heer, René Just

Abstract: Proper statistical modeling incorporates domain theory about how concepts relate and details of how data were measured. However, data analysts currently lack tool support for recording and reasoning about domain assumptions, data collection, and modeling choices in an integrated manner, leading to mistakes that can compromise scientific validity. For instance, generalized linear mixed-effects models (GLMMs) help answer complex research questions, but omitting random effects impairs the generalizability of results. To address this need, we present Tisane, a mixed-initiative system for authoring generalized linear models with and without mixed-effects. Tisane introduces a study design specification language for expressing and asking questions about relationships between variables. Tisane contributes an interactive compilation process that represents relationships in a graph, infers candidate statistical models, and asks follow-up questions to disambiguate user queries to construct a valid model. In case studies with three researchers, we find that Tisane helps them focus on their goals and assumptions while avoiding past mistakes.
