Vasudev Shyam

Symmetry Breaking in Transformers for Efficient and Interpretable Training

Jan 29, 2026
Via arXiv

The Zamba2 Suite: Technical Report

Nov 22, 2024
Via arXiv

Tree Attention: Topology-aware Decoding for Long-Context Attention on GPU clusters

Aug 09, 2024
Via arXiv