Harry Dong

Towards Low-bit Communication for Tensor Parallel LLM Inference

Nov 12, 2024

ShadowKV: KV Cache in Shadows for High-Throughput Long-Context LLM Inference

Oct 28, 2024

Leveraging Multimodal Diffusion Models to Accelerate Imaging with Side Information

Oct 07, 2024

Prompt-prompted Mixture of Experts for Efficient LLM Generation

Apr 05, 2024

Get More with LESS: Synthesizing Recurrence with KV Cache Compression for Efficient LLM Inference

Feb 14, 2024

A Lightweight Transformer for Faster and Robust EBSD Data Collection

Aug 18, 2023

Deep Unfolded Tensor Robust PCA with Self-supervised Learning

Dec 21, 2022

Fast and Provable Tensor Robust Principal Component Analysis via Scaled Gradient Descent

Jun 18, 2022