Yiwen Kou

Guided Discrete Diffusion for Electronic Health Record Generation

Apr 18, 2024
Via arXiv

Matching the Statistical Query Lower Bound for k-sparse Parity Problems with Stochastic Gradient Descent

Apr 18, 2024
Via arXiv

Fast Sampling via De-randomization for Discrete Diffusion Models

Dec 14, 2023
Via arXiv

Implicit Bias of Gradient Descent for Two-layer ReLU and Leaky ReLU Networks on Nearly-orthogonal Data

Oct 29, 2023
Via arXiv

Why Does Sharpness-Aware Minimization Generalize Better Than SGD?

Oct 11, 2023
Via arXiv

Benign Overfitting for Two-layer ReLU Networks

Mar 07, 2023
Via arXiv