Zhexi Zhang

Understanding Parameter Sharing in Transformers

Jun 15, 2023
Via arXiv

MobileNMT: Enabling Translation in 15MB and 30ms

Jun 07, 2023
Via arXiv

PARAGEN: A Parallel Generation Toolkit

Oct 07, 2022
Via arXiv

ROME: Robustifying Memory-Efficient NAS via Topology Disentanglement and Gradients Accumulation

Nov 23, 2020
Via arXiv