Rasool Fakoor

Bridging the Training-Inference Gap in LLMs by Leveraging Self-Generated Tokens

Oct 18, 2024

AgentOccam: A Simple Yet Strong Baseline for LLM-Based Web Agents

Oct 17, 2024

AlphaRouter: Quantum Circuit Routing with Reinforcement Learning and Tree Search

Oct 07, 2024

EXTRACT: Efficient Policy Learning by Extracting Transferrable Robot Skills from Offline Data

Jun 25, 2024

Learning the Target Network in Function Space

Jun 03, 2024

TAIL: Task-specific Adapters for Imitation Learning with Large Pretrained Models

Oct 09, 2023

Budgeting Counterfactual for Offline RL

Jul 12, 2023

TD Convergence: An Optimization Perspective

Jun 30, 2023

Resetting the Optimizer in Deep RL: An Empirical Study

Jun 30, 2023

Data drift correction via time-varying importance weight estimator

Oct 04, 2022