Richard Zemel

Confidence Calibration in Vision-Language-Action Models

Jul 23, 2025

Guiding LLM Decision-Making with Fairness Reward Models

Jul 15, 2025

Replay Can Provably Increase Forgetting

Jun 04, 2025

Towards Safety Reasoning in LLMs: AI-agentic Deliberation for Policy-embedded CoT Data Creation

May 27, 2025

Adaptive Elicitation of Latent Information Using Natural Language

Apr 05, 2025

Towards Effective Discrimination Testing for Generative AI

Dec 30, 2024

Attribute Controlled Fine-tuning for Large Language Models: A Case Study on Detoxification

Oct 07, 2024

Improving Predictor Reliability with Selective Recalibration

Oct 07, 2024

Controlling the World by Sleight of Hand

Aug 13, 2024

Whiteboard-of-Thought: Thinking Step-by-Step Across Modalities

Jun 20, 2024