Qinyuan Cheng

World Modeling Makes a Better Planner: Dual Preference Optimization for Embodied Task Planning

Mar 13, 2025

How to Mitigate Overfitting in Weak-to-strong Generalization?

Mar 06, 2025

AutoLogi: Automated Generation of Logic Puzzles for Evaluating Reasoning Abilities of Large Language Models

Feb 24, 2025

Revisiting the Test-Time Scaling of o1-like Models: Do they Truly Possess Test-Time Scaling Capabilities?

Feb 17, 2025

Scaling of Search and Learning: A Roadmap to Reproduce o1 from Reinforcement Learning Perspective

Dec 18, 2024

Case2Code: Learning Inductive Reasoning with Synthetic Data

Jul 17, 2024

Scaling Laws for Fact Memorization of Large Language Models

Jun 22, 2024

Cross-Modality Safety Alignment

Jun 21, 2024

Unified Active Retrieval for Retrieval Augmented Generation

Jun 18, 2024

Aggregation of Reasoning: A Hierarchical Framework for Enhancing Answer Selection in Large Language Models

May 21, 2024