Xiangxi Mo

Pie: Pooling CPU Memory for LLM Inference

Nov 14, 2024

Optimizing Speculative Decoding for Serving Large Language Models Using Goodput

Jun 20, 2024

Optimizing LLM Queries in Relational Workloads

Mar 09, 2024

Context-Aware Streaming Perception in Dynamic Environments

Aug 16, 2022

Pay Attention to Convolution Filters: Towards Fast and Accurate Fine-Grained Transfer Learning

Jun 12, 2019

The OoO VLIW JIT Compiler for GPU Inference

Jan 31, 2019