Guohao Dai

FrameFusion: Combining Similarity and Importance for Video Token Reduction on Large Visual Language Models

Dec 30, 2024
Via arXiv

MBQ: Modality-Balanced Quantization for Large Vision-Language Models

Dec 27, 2024
Via arXiv

E-CAR: Efficient Continuous Autoregressive Image Generation via Multistage Modeling

Dec 19, 2024
Via arXiv

Automating Energy-Efficient GPU Kernel Generation: A Fast Search-Based Compilation Approach

Nov 28, 2024
Via arXiv

SoftmAP: Software-Hardware Co-design for Integer-Only Softmax on Associative Processors

Nov 26, 2024
Via arXiv

Efficient and Effective Retrieval of Dense-Sparse Hybrid Vectors using Graph-based Approximate Nearest Neighbor Search

Oct 27, 2024
Via arXiv

Large Language Model Inference Acceleration: A Comprehensive Hardware Perspective

Oct 06, 2024
Via arXiv

Accelerating Auto-regressive Text-to-Image Generation with Training-free Speculative Jacobi Decoding

Oct 02, 2024
Via arXiv

MARCA: Mamba Accelerator with ReConfigurable Architecture

Sep 16, 2024
Via arXiv

CSKV: Training-Efficient Channel Shrinking for KV Cache in Long-Context Scenarios

Sep 16, 2024
Via arXiv