Rasool Fakoor

Bridging the Training-Inference Gap in LLMs by Leveraging Self-Generated Tokens

Oct 18, 2024
Via arXiv

AgentOccam: A Simple Yet Strong Baseline for LLM-Based Web Agents

Oct 17, 2024
Via arXiv

AlphaRouter: Quantum Circuit Routing with Reinforcement Learning and Tree Search

Oct 07, 2024
Via arXiv

EXTRACT: Efficient Policy Learning by Extracting Transferrable Robot Skills from Offline Data

Jun 25, 2024
Via arXiv

Learning the Target Network in Function Space

Jun 03, 2024
Via arXiv

TAIL: Task-specific Adapters for Imitation Learning with Large Pretrained Models

Oct 09, 2023
Via arXiv

Budgeting Counterfactual for Offline RL

Jul 12, 2023
Via arXiv

TD Convergence: An Optimization Perspective

Jun 30, 2023
Via arXiv

Resetting the Optimizer in Deep RL: An Empirical Study

Jun 30, 2023
Via arXiv

Data Drift Correction via Time-Varying Importance Weight Estimator

Oct 04, 2022
Via arXiv