Jean-Stanislas Denain

FrontierMath: A Benchmark for Evaluating Advanced Mathematical Reasoning in AI

Nov 07, 2024
Via arXiv

AI capabilities can be significantly improved without expensive retraining

Dec 12, 2023
Via arXiv

Overthinking the Truth: Understanding how Language Models Process False Demonstrations

Jul 18, 2023
Via arXiv

Auditing Visualizations: Transparency Methods Struggle to Detect Anomalous Behavior

Jun 27, 2022
Via arXiv

Grounding Representation Similarity with Statistical Testing

Aug 03, 2021
Via arXiv

MetFlow: A New Efficient Method for Bridging the Gap between Markov Chain Monte Carlo and Variational Inference

Feb 27, 2020
Via arXiv