Xingwu Chen

On the Robustness of Transformers against Context Hijacking for Linear Classification

Feb 21, 2025
Via arXiv

How Transformers Utilize Multi-Head Attention in In-Context Learning? A Case Study on Sparse Linear Regression

Aug 08, 2024
Via arXiv

What Can Transformer Learn with Varying Depth? Case Studies on Sequence Learning Tasks

Apr 02, 2024
Via arXiv