
Alvin Cheung

Combee: Scaling Prompt Learning for Self-Improving Language Model Agents

Apr 05, 2026

SVG-EAR: Parameter-Free Linear Compensation for Sparse Video Generation via Error-aware Routing

Mar 09, 2026

Arming Data Agents with Tribal Knowledge

Feb 13, 2026

Qrita: High-performance Top-k and Top-p Algorithm for GPUs using Pivot-based Truncation and Selection

Feb 02, 2026

FrontierCS: Evolving Challenges for Evolving Intelligence

Dec 17, 2025

Autocomp: LLM-Driven Code Optimization for Tensor Accelerators

May 24, 2025

TWIX: Automatically Reconstructing Structured Data from Templatized Documents

Jan 11, 2025

LLM-Aided Compilation for Tensor Accelerators

Aug 06, 2024

Optimizing Speculative Decoding for Serving Large Language Models Using Goodput

Jun 20, 2024

Mélange: Cost Efficient Large Language Model Serving by Exploiting GPU Heterogeneity

Apr 22, 2024