Pascal Poupart

University of Waterloo

Towards Cost-Effective Reward Guided Text Generation

Feb 06, 2025

Learning Soft Driving Constraints from Vectorized Scene Embeddings while Imitating Expert Trajectories

Dec 07, 2024

A Survey of Inverse Constrained Reinforcement Learning: Definitions, Progress and Challenges

Sep 11, 2024

FedLog: Personalized Federated Classification with Less Communication and More Flexibility

Jul 11, 2024

Uncertainty-Guided Optimization on Large Language Model Search Trees

Jul 04, 2024

Confidence Aware Inverse Constrained Reinforcement Learning

Jun 24, 2024

A Critical Look At Tokenwise Reward-Guided Text Generation

Jun 12, 2024

How Useful is Intermittent, Asynchronous Expert Feedback for Bayesian Optimization?

Jun 10, 2024

A Simple Mixture Policy Parameterization for Improving Sample Efficiency of CVaR Optimization

Mar 20, 2024

Why Online Reinforcement Learning is Causal

Mar 07, 2024