Yong Lin

Why is Your Language Model a Poor Implicit Reward Model?

Jul 10, 2025

Ineq-Comp: Benchmarking Human-Intuitive Compositional Reasoning in Automated Theorem Proving on Inequalities

May 19, 2025

AdaptMI: Adaptive Skill-based In-context Math Instruction for Small Language Models

Apr 30, 2025

Goedel-Prover: A Frontier Model for Open-Source Automated Theorem Proving

Feb 11, 2025

Entropy-Regularized Process Reward Model

Dec 15, 2024

On the Limited Generalization Capability of the Implicit Reward Model Induced by Direct Preference Optimization

Sep 05, 2024

Localize-and-Stitch: Efficient Model Merging via Sparse Task Arithmetic

Aug 24, 2024

Leveraging Invariant Principle for Heterophilic Graph Structure Distribution Shifts

Aug 18, 2024

Regularizing Hidden States Enables Learning Generalizable Reward Model for LLMs

Jun 14, 2024

On the Benefits of Over-parameterization for Out-of-Distribution Generalization

Mar 26, 2024