Tushar Krishna

AIRCHITECT v2: Learning the Hardware Accelerator Design Space through Unified Representations

Jan 17, 2025

Leveraging ASIC AI Chips for Homomorphic Encryption

Jan 13, 2025

TURBOATTENTION: Efficient Attention Approximation For High Throughputs LLMs

Dec 11, 2024

MicroScopiQ: Accelerating Foundational Models through Outlier-Aware Microscaling Quantization

Nov 08, 2024

AI Metropolis: Scaling Large Language Model-based Multi-Agent Simulation with Out-of-order Execution

Nov 05, 2024

LayerDAG: A Layerwise Autoregressive Diffusion Model for Directed Acyclic Graph Generation

Nov 04, 2024

CLAMP-ViT: Contrastive Data-Free Learning for Adaptive Post-Training Quantization of ViTs

Jul 07, 2024

FRED: Flexible REduction-Distribution Interconnect and Communication Implementation for Wafer-Scale Distributed Training of DNN Models

Jun 28, 2024

SDQ: Sparse Decomposed Quantization for LLM Inference

Jun 19, 2024

Real-time Digital RF Emulation -- II: A Near Memory Custom Accelerator

Jun 13, 2024