Yarin Gal

AgentHarm: A Benchmark for Measuring Harmfulness of LLM Agents

Oct 11, 2024

Temporal-Difference Variational Continual Learning

Oct 10, 2024

TextCAVs: Debugging vision models using text

Aug 16, 2024

Variational Inference Failures Under Model Symmetries: Permutation Invariant Posteriors for Bayesian Neural Networks

Aug 10, 2024

Semantic Entropy Probes: Robust and Cheap Hallucination Detection in LLMs

Jun 22, 2024

The Benefits and Risks of Transductive Approaches for AI Fairness

Jun 17, 2024

Deep Bayesian Active Learning for Preference Modeling in Large Language Models

Jun 14, 2024

Estimating the Hallucination Rate of Generative AI

Jun 11, 2024

Challenges and Considerations in the Evaluation of Bayesian Causal Discovery

Jun 05, 2024

Kernel Language Entropy: Fine-grained Uncertainty Quantification for LLMs from Semantic Similarities

May 30, 2024