Runji Lin

Qwen2.5-Math Technical Report: Toward Mathematical Expert Model via Self-Improvement

Sep 18, 2024

Online Decision MetaMorphFormer: A Casual Transformer-Based Reinforcement Learning Framework of Universal Embodied Intelligence

Sep 11, 2024

Qwen2 Technical Report

Jul 16, 2024

LLM Critics Help Catch Bugs in Mathematics: Towards a Better Mathematical Verifier with Natural Language Feedback

Jun 30, 2024

The Reason behind Good or Bad: Towards a Better Mathematical Verifier with Natural Language Feedback

Jun 20, 2024

Online Merging Optimizers for Boosting Rewards and Mitigating Tax in Alignment

May 28, 2024

Large Language Models Play StarCraft II: Benchmarks and A Chain of Summarization Approach

Dec 19, 2023

Routing to the Expert: Efficient Reward-guided Ensemble of Large Language Models

Nov 15, 2023

Qwen Technical Report

Sep 28, 2023

#InsTag: Instruction Tagging for Analyzing Supervised Fine-tuning of Large Language Models

Aug 15, 2023