Yonathan Efroni

Bill

Internalizing Self-Consistency in Language Models: Multi-Agent Consensus Alignment

Sep 18, 2025
Via arXiv

Time After Time: Deep-Q Effect Estimation for Interventions on When and What to do

Mar 20, 2025
Via arXiv

Aligned Multi Objective Optimization

Feb 19, 2025
Via arXiv

Exploiting Structure in Offline Multi-Agent RL: The Benefits of Low Interaction Rank

Oct 01, 2024
Via arXiv

RL in Latent MDPs is Tractable: Online Guarantees via Off-Policy Evaluation

Jun 03, 2024
Via arXiv

Generalizing Multi-Step Inverse Models for Representation Learning to Finite-Memory POMDPs

Apr 22, 2024
Via arXiv

The Bias of Harmful Label Associations in Vision-Language Models

Feb 11, 2024
Via arXiv

Pearl: A Production-ready Reinforcement Learning Agent

Dec 06, 2023
Via arXiv

PcLast: Discovering Plannable Continuous Latent States

Nov 06, 2023
Via arXiv

Prospective Side Information for Latent MDPs

Oct 11, 2023
Via arXiv