Zhanpeng Zhou

Sharpness-Aware Minimization Efficiently Selects Flatter Minima Late in Training

Oct 14, 2024
Via arXiv

On the Optimization and Generalization of Two-layer Transformers with Sign Gradient Descent

Oct 07, 2024
Via arXiv

Cross-Task Linearity Emerges in the Pretraining-Finetuning Paradigm

Feb 06, 2024
Via arXiv

Going Beyond Neural Network Feature Similarity: The Network Feature Complexity and Its Interpretation Using Category Theory

Oct 10, 2023
Via arXiv

Going Beyond Linear Mode Connectivity: The Layerwise Linear Feature Connectivity

Jul 17, 2023
Via arXiv

Defects of Convolutional Decoder Networks in Frequency Representation

Oct 17, 2022
Via arXiv

Batch Normalization Is Blind to the First and Second Derivatives of the Loss

Jun 02, 2022
Via arXiv

A Unified Game-Theoretic Interpretation of Adversarial Robustness

Nov 08, 2021
Via arXiv

Learning Baseline Values for Shapley Values

May 22, 2021
Via arXiv

Game-theoretic Understanding of Adversarially Learned Features

Mar 12, 2021
Via arXiv