Yapei Chang

Argument Collapse: LLMs Flatten Long-Form Public Debate

Jun 01, 2026

Recovering Diversity Without Losing Alignment: A DPO Recipe for Post-Trained LLMs

May 28, 2026

How2Everything: Mining the Web for How-To Procedures to Evaluate and Improve LLMs

Feb 09, 2026

Olmo 3

Dec 15, 2025

Auto-Eval Judge: Towards a General Agentic Framework for Task Completion Evaluation

Aug 07, 2025

Semantically-Aware Rewards for Open-Ended R1 Training in Free-Form Generation

Jun 18, 2025

BLEUBERI: BLEU is a surprisingly effective reward for instruction following

May 16, 2025

BEARCUBS: A benchmark for computer-using web agents

Mar 10, 2025

CLIPPER: Compression enables long-context synthetic data generation

Feb 20, 2025

PostMark: A Robust Blackbox Watermark for Large Language Models

Jun 20, 2024