Ganqu Cui

Free Process Rewards without Process Labels

Dec 02, 2024
Via arXiv

Scalable Efficient Training of Large Language Models with Low-dimensional Projected Attention

Nov 04, 2024
Via arXiv

Zero-Shot Generalization during Instruction Tuning: Insights from Similarity and Granularity

Jun 17, 2024
Via arXiv

UltraMedical: Building Specialized Generalists in Biomedicine

Jun 06, 2024
Via arXiv

RLAIF-V: Aligning MLLMs through Open-Source AI Feedback for Super GPT-4V Trustworthiness

May 27, 2024
Via arXiv

MiniCPM: Unveiling the Potential of Small Language Models with Scalable Training Strategies

Apr 09, 2024
Via arXiv

Advancing LLM Reasoning Generalists with Preference Trees

Apr 02, 2024
Via arXiv

Mastering Text, Code and Math Simultaneously via Fusing Highly Specialized Language Models

Mar 18, 2024
Via arXiv

Controllable Preference Optimization: Toward Controllable Multi-Objective Alignment

Feb 29, 2024
Via arXiv

RLHF-V: Towards Trustworthy MLLMs via Behavior Alignment from Fine-grained Correctional Human Feedback

Dec 01, 2023
Via arXiv