Xiaolin Wang

InstInfer: In-Storage Attention Offloading for Cost-Effective Long-Context LLM Inference

Sep 08, 2024

Characterization of Large Language Model Development in the Datacenter

Mar 12, 2024

Deep Learning Workload Scheduling in GPU Datacenters: Taxonomy, Challenges and Vision

Jun 01, 2022

GRAPHSPY: Fused Program Semantic-Level Embedding via Graph Neural Networks for Dead Store Detection

Nov 18, 2020

CytonMT: an Efficient Neural Machine Translation Open-source Toolkit Implemented in C++

Jun 02, 2018

CytonRL: an Efficient Reinforcement Learning Open-source Toolkit Implemented in C++

Apr 14, 2018