Misha Khalman

Building Math Agents with Multi-Turn Iterative Preference Learning

Sep 04, 2024

Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

Mar 08, 2024

Direct Language Model Alignment from Online AI Feedback

Feb 07, 2024

LiPO: Listwise Preference Optimization through Learning-to-Rank

Feb 02, 2024

Gemini: A Family of Highly Capable Multimodal Models

Dec 19, 2023

Calibrating Likelihoods towards Consistency in Summarization Models

Oct 12, 2023

Statistical Rejection Sampling Improves Preference Optimization

Sep 13, 2023

SLiC-HF: Sequence Likelihood Calibration with Human Feedback

May 17, 2023

Calibrating Sequence likelihood Improves Conditional Language Generation

Sep 30, 2022