Jiahe Song

PRPO: Aligning Process Reward with Outcome Reward in Policy Optimization

Jan 13, 2026
Via arXiv

EL4NER: Ensemble Learning for Named Entity Recognition via Multiple Small-Parameter Large Language Models

May 29, 2025
Via arXiv

GRAIT: Gradient-Driven Refusal-Aware Instruction Tuning for Effective Hallucination Mitigation

Feb 09, 2025
Via arXiv