Daniel Soudry

Provable Tempered Overfitting of Minimal Nets and Typical Nets

Oct 24, 2024, via arXiv

Foldable SuperNets: Scalable Merging of Transformers with Different Initializations and Tasks

Oct 02, 2024, via arXiv

Stable Minima Cannot Overfit in Univariate ReLU Networks: Generalization by Large Step Sizes

Jun 10, 2024, via arXiv

How Uniform Random Weights Induce Non-uniform Bias: Typical Interpolating Neural Networks Generalize with Narrow Teachers

Feb 09, 2024, via arXiv

Towards Cheaper Inference in Deep Networks with Lower Bit-Width Accumulators

Jan 25, 2024, via arXiv

The Joint Effect of Task Similarity and Overparameterization on Catastrophic Forgetting -- An Analytical Model

Jan 24, 2024, via arXiv

How do Minimum-Norm Shallow Denoisers Look in Function Space?

Nov 12, 2023, via arXiv

The Implicit Bias of Minima Stability in Multivariate Shallow ReLU Networks

Jun 30, 2023, via arXiv

DropCompute: simple and more robust distributed synchronous training via compute variance reduction

Jun 18, 2023, via arXiv

Continual Learning in Linear Classification on Separable Data

Jun 06, 2023, via arXiv