
Haibo Chen

Unifying KV Cache Compression for Large Language Models with LeanKV

Dec 04, 2024

PowerInfer-2: Fast Large Language Model Inference on a Smartphone

Jun 12, 2024

Turbo Sparse: Achieving LLM SOTA Performance with Minimal Activated Parameters

Jun 11, 2024

Characterizing the Dilemma of Performance and Index Size in Billion-Scale Vector Search and Breaking It with Second-Tier Memory

May 07, 2024

PNeSM: Arbitrary 3D Scene Stylization via Prompt-Based Neural Style Mapping

Mar 13, 2024

Attack Deterministic Conditional Image Generative Models for Diverse and Controllable Generation

Mar 13, 2024

PowerInfer: Fast Large Language Model Serving with a Consumer-grade GPU

Dec 16, 2023

TSSAT: Two-Stage Statistics-Aware Transformation for Artistic Style Transfer

Sep 12, 2023

An Overview of Resource Allocation in Integrated Sensing and Communication

May 15, 2023

Overview and Performance Analysis of Various Waveforms in High Mobility Scenarios

Feb 28, 2023