Federico Lebrón

GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints

May 22, 2023
Via arXiv