Mingu Lee

Double-P: Hierarchical Top-P Sparse Attention for Long-Context LLMs

Feb 05, 2026
Via arXiv

Fast Forward: Accelerating LLM Prefill with Predictive FFN Sparsity

Jan 30, 2026
Via arXiv

KeyDiff: Key Similarity-Based KV Cache Eviction for Long-Context LLM Inference in Resource-Constrained Environments

Apr 23, 2025
Via arXiv

KeyDiff: Key Similarity-Based KV Cache Eviction for Long-Context LLM Inference in Resource-Constrained Environments

Apr 21, 2025
Via arXiv

CAOTE: KV Caching through Attention Output Error based Token Eviction

Apr 18, 2025
Via arXiv

AdaEDL: Early Draft Stopping for Speculative Decoding of Large Language Models via an Entropy-based Lower Bound on Token Acceptance Probability

Oct 24, 2024
Via arXiv

Live Fitness Coaching as a Testbed for Situated Interaction

Jul 11, 2024
Via arXiv

ToSA: Token Selective Attention for Efficient Vision Transformers

Jun 13, 2024
Via arXiv

On Speculative Decoding for Multimodal Large Language Models

Apr 13, 2024
Via arXiv

HyperCLOVA X Technical Report

Apr 13, 2024
Via arXiv