Andi Nika

Reward Model Learning vs. Direct Policy Optimization: A Comparative Analysis of Learning from Human Preferences

Mar 04, 2024
Via arXiv

Corruption-Robust Offline Two-Player Zero-Sum Markov Games

Mar 04, 2024
Via arXiv

Corruption Robust Offline Reinforcement Learning with Human Feedback

Feb 09, 2024
Via arXiv

Contextual Combinatorial Volatile Bandits via Gaussian Processes

Oct 05, 2021
Via arXiv

Pareto Active Learning with Gaussian Processes and Adaptive Discretization

Jun 24, 2020
Via arXiv