Lifan Yuan

Free Process Rewards without Process Labels

Dec 02, 2024
Via arXiv

Zero-Shot Generalization during Instruction Tuning: Insights from Similarity and Granularity

Jun 17, 2024
Via arXiv

Advancing LLM Reasoning Generalists with Preference Trees

Apr 02, 2024
Via arXiv

Controllable Preference Optimization: Toward Controllable Multi-Objective Alignment

Feb 29, 2024
Via arXiv

Executable Code Actions Elicit Better LLM Agents

Feb 01, 2024
Via arXiv

Prudent Silence or Foolish Babble? Examining Large Language Models' Responses to the Unknown

Nov 16, 2023
Via arXiv

UltraFeedback: Boosting Language Models with High-quality Feedback

Oct 02, 2023
Via arXiv

CRAFT: Customizing LLMs by Creating and Retrieving from Specialized Toolsets

Sep 29, 2023
Via arXiv

MINT: Evaluating LLMs in Multi-turn Interaction with Tools and Language Feedback

Sep 19, 2023
Via arXiv

Revisiting Out-of-distribution Robustness in NLP: Benchmark, Analysis, and LLMs Evaluations

Jun 07, 2023
Via arXiv