Hui-Ling Zhen

Unleashing Low-Bit Inference on Ascend NPUs: A Comprehensive Evaluation of HiFloat Formats

Feb 13, 2026

ReThinker: Scientific Reasoning by Rethinking with Guided Reflection and Confidence Control

Feb 04, 2026

Beyond Speedup -- Utilizing KV Cache for Sampling and Reasoning

Jan 28, 2026

Benchmarking Post-Training Quantization of Large Language Models under Microscaling Floating Point Formats

Jan 14, 2026

SwiftMem: Fast Agentic Memory via Query-aware Indexing

Jan 13, 2026

Revisiting Judge Decoding from First Principles via Training-Free Distributional Divergence

Jan 08, 2026

What Matters For Safety Alignment?

Jan 07, 2026

Towards Efficient Agents: A Co-Design of Inference Architecture and System

Dec 20, 2025

SCOPE: Prompt Evolution for Enhancing Agent Effectiveness

Dec 17, 2025

MOSS: Efficient and Accurate FP8 LLM Training with Microscaling and Automatic Scaling

Nov 08, 2025