Souvik Kundu

MicroScopiQ: Accelerating Foundational Models through Outlier-Aware Microscaling Quantization

Nov 08, 2024

LANTERN: Accelerating Visual Autoregressive Models with Relaxed Speculative Decoding

Oct 04, 2024

Understanding the Performance and Estimating the Cost of LLM Fine-Tuning

Aug 08, 2024

MaskVD: Region Masking for Efficient Video Object Detection

Jul 16, 2024

Metron: Holistic Performance Evaluation Framework for LLM Inference Systems

Jul 09, 2024

CLAMP-ViT: Contrastive Data-Free Learning for Adaptive Post-Training Quantization of ViTs

Jul 07, 2024

LaMDA: Large Model Fine-Tuning via Spectrally Decomposed Low-Dimensional Adaptation

Jun 18, 2024

ShiftAddLLM: Accelerating Pretrained LLMs via Post-Training Multiplication-Less Reparameterization

Jun 11, 2024

Demystifying Platform Requirements for Diverse LLM Inference Use Cases

Jun 03, 2024

AFLoRA: Adaptive Freezing of Low Rank Adaptation in Parameter Efficient Fine-Tuning of Large Models

Mar 20, 2024