Picture for Yuxiao Dong

Yuxiao Dong

Parameter-Efficient Fine-Tuning for Foundation Models

Add code
Jan 23, 2025
Viaarxiv icon

Advancing Language Model Reasoning through Reinforcement Learning and Inference Scaling

Add code
Jan 20, 2025
Figure 1 for Advancing Language Model Reasoning through Reinforcement Learning and Inference Scaling
Figure 2 for Advancing Language Model Reasoning through Reinforcement Learning and Inference Scaling
Figure 3 for Advancing Language Model Reasoning through Reinforcement Learning and Inference Scaling
Figure 4 for Advancing Language Model Reasoning through Reinforcement Learning and Inference Scaling
Viaarxiv icon

MotionBench: Benchmarking and Improving Fine-grained Video Motion Understanding for Vision Language Models

Add code
Jan 06, 2025
Viaarxiv icon

VisionReward: Fine-Grained Multi-Dimensional Human Preference Learning for Image and Video Generation

Add code
Dec 30, 2024
Viaarxiv icon

LongBench v2: Towards Deeper Understanding and Reasoning on Realistic Long-context Multitasks

Add code
Dec 19, 2024
Figure 1 for LongBench v2: Towards Deeper Understanding and Reasoning on Realistic Long-context Multitasks
Figure 2 for LongBench v2: Towards Deeper Understanding and Reasoning on Realistic Long-context Multitasks
Figure 3 for LongBench v2: Towards Deeper Understanding and Reasoning on Realistic Long-context Multitasks
Figure 4 for LongBench v2: Towards Deeper Understanding and Reasoning on Realistic Long-context Multitasks
Viaarxiv icon

SPaR: Self-Play with Tree-Search Refinement to Improve Instruction-Following in Large Language Models

Add code
Dec 16, 2024
Viaarxiv icon

Does RLHF Scale? Exploring the Impacts From Data, Model, and Method

Add code
Dec 08, 2024
Figure 1 for Does RLHF Scale? Exploring the Impacts From Data, Model, and Method
Figure 2 for Does RLHF Scale? Exploring the Impacts From Data, Model, and Method
Figure 3 for Does RLHF Scale? Exploring the Impacts From Data, Model, and Method
Figure 4 for Does RLHF Scale? Exploring the Impacts From Data, Model, and Method
Viaarxiv icon

GLM-4-Voice: Towards Intelligent and Human-Like End-to-End Spoken Chatbot

Add code
Dec 03, 2024
Viaarxiv icon

Scaling Speech-Text Pre-training with Synthetic Interleaved Data

Add code
Nov 26, 2024
Viaarxiv icon

WebRL: Training LLM Web Agents via Self-Evolving Online Curriculum Reinforcement Learning

Add code
Nov 04, 2024
Figure 1 for WebRL: Training LLM Web Agents via Self-Evolving Online Curriculum Reinforcement Learning
Figure 2 for WebRL: Training LLM Web Agents via Self-Evolving Online Curriculum Reinforcement Learning
Figure 3 for WebRL: Training LLM Web Agents via Self-Evolving Online Curriculum Reinforcement Learning
Figure 4 for WebRL: Training LLM Web Agents via Self-Evolving Online Curriculum Reinforcement Learning
Viaarxiv icon