
Tushar Krishna

TurboAttention: Efficient Attention Approximation For High Throughputs LLMs

Dec 11, 2024

MicroScopiQ: Accelerating Foundational Models through Outlier-Aware Microscaling Quantization

Nov 08, 2024

AI Metropolis: Scaling Large Language Model-based Multi-Agent Simulation with Out-of-order Execution

Nov 05, 2024

LayerDAG: A Layerwise Autoregressive Diffusion Model for Directed Acyclic Graph Generation

Nov 04, 2024

CLAMP-ViT: Contrastive Data-Free Learning for Adaptive Post-Training Quantization of ViTs

Jul 07, 2024

FRED: Flexible REduction-Distribution Interconnect and Communication Implementation for Wafer-Scale Distributed Training of DNN Models

Jun 28, 2024

SDQ: Sparse Decomposed Quantization for LLM Inference

Jun 19, 2024

Real-time Digital RF Emulation - II: A Near Memory Custom Accelerator

Jun 13, 2024

Demystifying Platform Requirements for Diverse LLM Inference Use Cases

Jun 03, 2024

H3DFact: Heterogeneous 3D Integrated CIM for Factorization with Holographic Perceptual Representations

Apr 05, 2024