Alexander Bukharin

HelpSteer2-Preference: Complementing Ratings with Preferences

Oct 02, 2024

Robust Reinforcement Learning from Corrupted Human Feedback

Jun 21, 2024

Adaptive Preference Scaling for Reinforcement Learning with Human Feedback

Jun 04, 2024

Data Diversity Matters for Robust Instruction Tuning

Nov 21, 2023

Robust Multi-Agent Reinforcement Learning via Adversarial Regularization: Theoretical Foundation and Stable Algorithms

Oct 16, 2023

Deep Reinforcement Learning from Hierarchical Weak Preference Feedback

Sep 06, 2023

Machine Learning Force Fields with Data Cost Aware Training

Jun 05, 2023

Adaptive Budget Allocation for Parameter-Efficient Fine-Tuning

Mar 18, 2023

PLATON: Pruning Large Transformer Models with Upper Confidence Bound of Weight Importance

Jun 25, 2022

Early Detection of COVID-19 Hotspots Using Spatio-Temporal Data

May 31, 2021