Lijie Wen

RLCSD: Reinforcement Learning with Contrastive On-Policy Self-Distillation

Jun 10, 2026
Via arXiv

VaaWIT: Visual-Aware Adaptation of Large Language Models for Multilingual Web Image Translation

May 23, 2026
Via arXiv

Do We Really Need External Tools to Mitigate Hallucinations? SIRA: Shared-Prefix Internal Reconstruction of Attribution

May 14, 2026
Via arXiv

Student-in-the-Loop Chain-of-Thought Distillation via Generation-Time Selection

Apr 03, 2026
Via arXiv

Meta-TTRL: A Metacognitive Framework for Self-Improving Test-Time Reinforcement Learning in Unified Multimodal Models

Mar 16, 2026
Via arXiv

FlowPrefill: Decoupling Preemption from Prefill Scheduling Granularity to Mitigate Head-of-Line Blocking in LLM Serving

Feb 18, 2026
Via arXiv

DR-LoRA: Dynamic Rank LoRA for Mixture-of-Experts Adaptation

Jan 08, 2026
Via arXiv

d-TreeRPO: Towards More Reliable Policy Optimization for Diffusion Language Models

Dec 10, 2025
Via arXiv

LiRA: Linguistic Robust Anchoring for Cross-lingual Large Language Models

Oct 16, 2025
Via arXiv

REMA: A Unified Reasoning Manifold Framework for Interpreting Large Language Model

Sep 26, 2025
Via arXiv