Surya Kanoria

Soft Preference Optimization: Aligning Language Models to Expert Distributions

Apr 30, 2024
Via arXiv

Automatic Music Playlist Generation via Simulation-based Reinforcement Learning

Oct 13, 2023
Via arXiv

What to Learn, and How: Toward Effective Learning from Rationales

Nov 30, 2021
Via arXiv