Mohammad Ghavamzadeh

INRIA Lille - Nord Europe

Conservative Contextual Bandits: Beyond Linear Representations

Dec 09, 2024

Q-learning for Quantile MDPs: A Decomposition, Performance, and Convergence Analysis

Oct 31, 2024

Confidence-aware Reward Optimization for Fine-tuning Text-to-Image Models

Apr 02, 2024

Contextual Bandits with Stage-wise Constraints

Jan 15, 2024

Maximum Entropy Model Correction in Reinforcement Learning

Nov 29, 2023

Bridging Distributionally Robust Learning and Offline RL: An Approach to Mitigate Distribution Shift and Partial Data Coverage

Oct 27, 2023

Preference Elicitation with Soft Attributes in Interactive Recommendation

Oct 22, 2023

Factual and Personalized Recommendations using Language Models and Reinforcement Learning

Oct 09, 2023

A Convex Relaxation Approach to Bayesian Regret Minimization in Offline Bandits

Jun 02, 2023

DPOK: Reinforcement Learning for Fine-tuning Text-to-Image Diffusion Models

May 25, 2023