Satwik Bhattamishra

Provably Learning Attention with Queries

Jan 23, 2026, via arXiv

A Formal Framework for Understanding Length Generalization in Transformers

Oct 03, 2024, via arXiv

Separations in the Representational Capabilities of Transformers and Recurrent Architectures

Jun 13, 2024, via arXiv

MAGNIFICo: Evaluating the In-Context Learning Ability of Large Language Models to Generalize to Novel Interpretations

Oct 18, 2023, via arXiv

Understanding In-Context Learning in Transformers and LLMs by Learning to Learn Discrete Functions

Oct 04, 2023, via arXiv

Structural Transfer Learning in NL-to-Bash Semantic Parsers

Jul 31, 2023, via arXiv

DynaQuant: Compressing Deep Learning Training Checkpoints via Dynamic Quantization

Jun 20, 2023, via arXiv

Simplicity Bias in Transformers and their Ability to Learn Sparse Boolean Functions

Nov 22, 2022, via arXiv

Revisiting the Compositional Generalization Abilities of Neural Sequence Models

Mar 14, 2022, via arXiv

Are NLP Models really able to Solve Simple Math Word Problems?

Mar 12, 2021, via arXiv