Yu Emma Wang

Machine Learning Fleet Efficiency: Analyzing and Optimizing Large-Scale Google TPU Systems with ML Productivity Goodput

Feb 10, 2025

Hadamard Domain Training with Integers for Class Incremental Quantized Learning

Oct 05, 2023

Augmenting Hessians with Inter-Layer Dependencies for Mixed-Precision Post-Training Quantization

Jun 08, 2023

Mixed Precision Post Training Quantization of Neural Networks with Sensitivity Guided Search

Feb 07, 2023

AutoDistill: an End-to-End Framework to Explore and Distill Hardware-Efficient Language Models

Jan 21, 2022

GLaM: Efficient Scaling of Language Models with Mixture-of-Experts

Dec 13, 2021

Exploring the Limits of Concurrency in ML Training on Google TPUs

Nov 07, 2020

Exploiting Parallelism Opportunities with Deep Learning Frameworks

Aug 13, 2019

Benchmarking TPU, GPU, and CPU Platforms for Deep Learning

Aug 06, 2019