Yirong Sun

PRISM: Preference Refinement via Implicit Scene Modeling for 3D Vision-Language Preference-Based Reinforcement Learning

Mar 13, 2025
Via arXiv

Integrating Chain-of-Thought for Multimodal Alignment: A Study on 3D Vision-Language Learning

Mar 08, 2025
Via arXiv

Unveiling the Key Factors for Distilling Chain-of-Thought Reasoning

Feb 25, 2025
Via arXiv

Instruction-Tuned LLMs Succeed in Document-Level MT Without Fine-Tuning -- But BLEU Turns a Blind Eye

Oct 29, 2024
Via arXiv

The Accuracy Paradox in RLHF: When Better Reward Models Don't Yield Better Language Models

Oct 09, 2024
Via arXiv