Xiaoxuan Liu

MoE-Lightning: High-Throughput MoE Inference on Memory-constrained GPUs

Nov 18, 2024
Via arXiv

UNetMamba: An Efficient UNet-Like Mamba for Semantic Segmentation of High-Resolution Remote Sensing Images

Aug 26, 2024
Via arXiv

UNetMamba: Efficient UNet-Like Mamba for Semantic Segmentation of High-Resolution Remote Sensing Images

Aug 21, 2024
Via arXiv

Optimizing Speculative Decoding for Serving Large Language Models Using Goodput

Jun 20, 2024
Via arXiv

Towards Clinical AI Fairness: Filling Gaps in the Puzzle

May 28, 2024
Via arXiv

Mélange: Cost Efficient Large Language Model Serving by Exploiting GPU Heterogeneity

Apr 22, 2024
Via arXiv

Computing in the Era of Large Generative Models: From Cloud-Native to AI-Native

Jan 17, 2024
Via arXiv

Learned Best-Effort LLM Serving

Jan 15, 2024
Via arXiv

Online Speculative Decoding

Oct 17, 2023
Via arXiv

QFT: Quantized Full-parameter Tuning of LLMs with Affordable Resources

Oct 11, 2023
Via arXiv