Tianyu Guo

Revealing the Power of Post-Training for Small Language Models via Knowledge Distillation

Sep 30, 2025

GSM-Agent: Understanding Agentic Reasoning Using Controllable Environments

Sep 26, 2025

Generalization or Hallucination? Understanding Out-of-Context Reasoning in Transformers

Jun 12, 2025

EFIM: Efficient Serving of LLMs for Infilling Tasks with Improved KV Cache Reuse

May 29, 2025

IDA-Bench: Evaluating LLMs on Interactive Guided Data Analysis

May 23, 2025

Pangu Ultra: Pushing the Limits of Dense Large Language Models on Ascend NPUs

Apr 10, 2025

Unshackling Context Length: An Efficient Selective Attention Approach through Query-Key Compression

Feb 20, 2025

How Do LLMs Perform Two-Hop Reasoning in Context?

Feb 19, 2025

Retrieval-Augmented Generation by Evidence Retroactivity in LLMs

Jan 07, 2025

Active-Dormant Attention Heads: Mechanistically Demystifying Extreme-Token Phenomena in LLMs

Oct 17, 2024