Tim Schreier

Practical Principles for AI Cost and Compute Accounting

Feb 21, 2025

Stephen Casper, Luke Bailey, Tim Schreier

Abstract: Policymakers are increasingly using development cost and compute as proxies for AI model capabilities and risks. Recent laws have introduced regulatory requirements that are contingent on specific thresholds. However, technical ambiguities in how to perform this accounting could create loopholes that undermine regulatory effectiveness. This paper proposes seven principles for designing practical AI cost and compute accounting standards that (1) reduce opportunities for strategic gaming, (2) avoid disincentivizing responsible risk mitigation, and (3) enable consistent implementation across companies and jurisdictions.

Effective Mitigations for Systemic Risks from General-Purpose AI

Nov 14, 2024

Risto Uuk, Annemieke Brouwer, Tim Schreier, Noemi Dreksler, Valeria Pulignano, Rishi Bommasani

Abstract: The systemic risks posed by general-purpose AI models are a growing concern, yet the effectiveness of mitigations remains underexplored. Previous research has proposed frameworks for risk mitigation, but has left gaps in our understanding of the perceived effectiveness of measures for mitigating systemic risks. Our study addresses this gap by evaluating how experts perceive different mitigations that aim to reduce the systemic risks of general-purpose AI models. We surveyed 76 experts whose expertise spans AI safety; critical infrastructure; democratic processes; chemical, biological, radiological, and nuclear risks (CBRN); and discrimination and bias. Among 27 mitigations identified through a literature review, we find that a broad range of risk mitigation measures are perceived by domain experts as both effective in reducing various systemic risks and technically feasible. In particular, three mitigation measures stand out: safety incident reports and security information sharing, third-party pre-deployment model audits, and pre-deployment risk assessments. These measures both show the highest expert agreement ratings (>60%) across all four risk areas and are most frequently selected in experts' preferred combinations of measures (>40%). The surveyed experts highlighted that external scrutiny, proactive evaluation, and transparency are key principles for effective mitigation of systemic risks. We provide policy recommendations for implementing the most promising measures, incorporating the qualitative contributions from experts. These insights should inform regulatory frameworks and industry practices for mitigating the systemic risks associated with general-purpose AI.

* 78 pages, 7 figures, 2 tables

On Offline Evaluation of 3D Object Detection for Autonomous Driving

Aug 24, 2023

Tim Schreier, Katrin Renz, Andreas Geiger, Kashyap Chitta

Abstract: Prior work in 3D object detection evaluates models using offline metrics like average precision since closed-loop online evaluation on the downstream driving task is costly. However, it is unclear how indicative offline results are of driving performance. In this work, we perform the first empirical evaluation measuring how predictive different detection metrics are of driving performance when detectors are integrated into a full self-driving stack. We conduct extensive experiments on urban driving in the CARLA simulator using 16 object detection models. We find that the nuScenes Detection Score has a higher correlation to driving performance than the widely used average precision metric. In addition, our results call for caution against exclusive reliance on the emerging class of 'planner-centric' metrics.

* Appears in: IEEE International Conference on Computer Vision (ICCV'23) Workshops
