Abhinav Bhatele

From Pixels to Prose: A Large Dataset of Dense Image Captions
Jun 14, 2024

Be like a Goldfish, Don't Memorize! Mitigating Memorization in Generative LLMs
Jun 14, 2024

Loki: Low-Rank Keys for Efficient Sparse Attention
Jun 04, 2024

Transformers Can Do Arithmetic with the Right Embeddings
May 27, 2024

Performance-Aligned LLMs for Generating Fast Code
Apr 29, 2024

Can Large Language Models Write Parallel Code?
Jan 23, 2024

Jorge: Approximate Preconditioning for GPU-efficient Second-order Optimization
Oct 27, 2023

Modeling Parallel Programs using Large Language Models
Jun 29, 2023

Communication-minimizing Asynchronous Tensor Parallelism
May 22, 2023

A Novel Tensor-Expert Hybrid Parallelism Approach to Scale Mixture-of-Experts Training
Mar 11, 2023