Sheng Zha

Revisiting SMoE Language Models by Evaluating Inefficiencies with Task Specific Expert Pruning
Sep 02, 2024

DEM: Distribution Edited Model for Training with Mixed Data Distributions
Jun 21, 2024

Pre-training Differentially Private Models with Limited Public Data
Feb 28, 2024

Extreme Miscalibration and the Illusion of Adversarial Robustness
Feb 27, 2024

Zero redundancy distributed learning with differential privacy
Nov 20, 2023

On the accuracy and efficiency of group-wise clipping in differentially private optimization
Oct 30, 2023

Efficient Long-Range Transformers: You Need to Attend More, but Not Necessarily at Every Layer
Oct 19, 2023

Coupling public and private gradient provably helps optimization
Oct 02, 2023

HYTREL: Hypergraph-enhanced Tabular Data Representation Learning
Jul 14, 2023

Large Language Models of Code Fail at Completing Code with Potential Bugs
Jun 06, 2023