Ilija Bogunovic

Almost Surely Safe Alignment of Large Language Models at Inference-Time

Feb 03, 2025
Via arXiv

No-Regret Linear Bandits under Gap-Adjusted Misspecification

Jan 09, 2025
Via arXiv

Sample-efficient Bayesian Optimisation Using Known Invariances

Oct 22, 2024
Via arXiv

Right Now, Wrong Then: Non-Stationary Direct Preference Optimization under Preference Drift

Jul 26, 2024
Via arXiv

Adversarial Robust Decision Transformer: Enhancing Robustness of RvS via Minimax Returns-to-go

Jul 25, 2024
Via arXiv

Group Robust Preference Optimization in Reward-free RLHF

May 30, 2024
Via arXiv

PROSAC: Provably Safe Certification for Machine Learning Models under Adversarial Attacks

Feb 04, 2024
Via arXiv

REDUCR: Robust Data Downsampling Using Class Priority Reweighting

Dec 01, 2023
Via arXiv

Sample Efficient Reinforcement Learning from Human Feedback via Active Exploration

Dec 01, 2023
Via arXiv

Robust Best-arm Identification in Linear Bandits

Nov 08, 2023
Via arXiv