Dayiheng Liu

additional authors not shown

CC-OCR V2: Benchmarking Large Multimodal Models for Literacy in Real-world Document Processing

May 05, 2026
Via arXiv

JURY-RL: Votes Propose, Proofs Dispose for Label-Free RLVR

Apr 28, 2026
Via arXiv

OccuBench: Evaluating AI Agents on Real-World Professional Tasks via Language World Models

Apr 13, 2026
Via arXiv

ClinConsensus: A Consensus-Based Benchmark for Evaluating Chinese Medical LLMs across Difficulty Levels

Mar 03, 2026
Via arXiv

HLE-Verified: A Systematic Verification and Structured Revision of Humanity's Last Exam

Feb 17, 2026
Via arXiv

OPUS: Towards Efficient and Principled Data Selection in Large Language Model Pre-training in Every Iteration

Feb 05, 2026
Via arXiv

Canzona: A Unified, Asynchronous, and Load-Balanced Framework for Distributed Matrix-based Optimizers

Feb 04, 2026
Via arXiv

SWE-Universe: Scale Real-World Verifiable Environments to Millions

Feb 02, 2026
Via arXiv

A Unified View of Attention and Residual Sinks: Outlier-Driven Rescaling is Essential for Transformer Training

Jan 30, 2026
Via arXiv

PLawBench: A Rubric-Based Benchmark for Evaluating LLMs in Real-World Legal Practice

Jan 23, 2026
Via arXiv