Ganqu Cui

AIR: A Systematic Analysis of Annotations, Instructions, and Response Pairs in Preference Dataset

Apr 04, 2025
Via arXiv

A Survey of Efficient Reasoning for Large Reasoning Models: Language, Multimodality, and Beyond

Mar 27, 2025
Via arXiv

UltraIF: Advancing Instruction Following from the Wild

Feb 06, 2025
Via arXiv

Process Reinforcement through Implicit Rewards

Feb 03, 2025
Via arXiv

From Drafts to Answers: Unlocking LLM Potential via Aggregation Fine-Tuning

Jan 21, 2025
Via arXiv

Free Process Rewards without Process Labels

Dec 02, 2024
Via arXiv

Scalable Efficient Training of Large Language Models with Low-dimensional Projected Attention

Nov 04, 2024
Via arXiv

Zero-Shot Generalization during Instruction Tuning: Insights from Similarity and Granularity

Jun 17, 2024
Via arXiv

UltraMedical: Building Specialized Generalists in Biomedicine

Jun 06, 2024
Via arXiv

RLAIF-V: Aligning MLLMs through Open-Source AI Feedback for Super GPT-4V Trustworthiness

May 27, 2024
Via arXiv