Shikhar Tuli

FlexiGPT: Pruning and Extending Large Language Models with Low-Rank Weight Sharing

Jan 24, 2025
Via arXiv

MoDeGPT: Modular Decomposition for Large Language Model Compression

Aug 20, 2024
Via arXiv

DynaMo: Accelerating Language Model Inference with Dynamic Multi-Token Sampling

May 01, 2024
Via arXiv

BREATHE: Second-Order Gradients and Heteroscedastic Emulation based Design Space Exploration

Aug 16, 2023
Via arXiv

TransCODE: Co-design of Transformers and Accelerators for Efficient Training and Inference

Mar 27, 2023
Via arXiv

EdgeTran: Co-designing Transformers for Efficient Inference on Mobile Edge Platforms

Mar 24, 2023
Via arXiv

AccelTran: A Sparsity-Aware Accelerator for Dynamic Inference with Transformers

Feb 28, 2023
Via arXiv

CODEBench: A Neural Architecture and Hardware Accelerator Co-Design Framework

Dec 07, 2022
Via arXiv

FlexiBERT: Are Current Transformer Architectures too Homogeneous and Rigid?

May 23, 2022
Via arXiv

Generative Optimization Networks for Memory Efficient Data Generation

Oct 07, 2021
Via arXiv