Tian Zhao

Hybrid JIT-CUDA Graph Optimization for Low-Latency Large Language Model Inference

Apr 25, 2026
Via arXiv

Evaluating CUDA Tile for AI Workloads on Hopper and Blackwell GPUs

Apr 25, 2026
Via arXiv

Medical Multimodal Foundation Models in Clinical Diagnosis and Treatment: Applications, Challenges, and Future Directions

Dec 03, 2024
Via arXiv

GCtx-UNet: Efficient Network for Medical Image Segmentation

Jun 09, 2024
Via arXiv

Transfer Learning for Microstructure Segmentation with CS-UNet: A Hybrid Algorithm with Transformer and CNN Encoders

Aug 26, 2023
Via arXiv

Computer Vision Methods for the Microstructural Analysis of Materials: The State-of-the-art and Future Perspectives

Jul 29, 2022
Via arXiv

Efficient Memory Partitioning in Software Defined Hardware

Feb 02, 2022
Via arXiv

Serving Recurrent Neural Networks Efficiently with a Spatial Accelerator

Sep 26, 2019
Via arXiv

Analysis of DAWNBench, a Time-to-Accuracy Machine Learning Performance Benchmark

Jun 04, 2018
Via arXiv

DeepDSL: A Compilation-based Domain-Specific Language for Deep Learning

Jan 09, 2017
Via arXiv