James Martens

Normalization and effective learning rates in reinforcement learning

Jul 01, 2024
Via arXiv

Disentangling the Causes of Plasticity Loss in Neural Networks

Feb 29, 2024
Via arXiv

Deep Transformers without Shortcuts: Modifying Self-attention for Faithful Signal Propagation

Feb 20, 2023
Via arXiv

Pre-training via Denoising for Molecular Property Prediction

May 31, 2022
Via arXiv

Deep Learning without Shortcuts: Shaping the Kernel with Tailored Rectifiers

Mar 15, 2022
Via arXiv

Rapid training of deep neural networks without skip connections or normalization layers using Deep Kernel Shaping

Oct 05, 2021
Via arXiv

On the validity of kernel approximations for orthogonally-initialized neural networks

Apr 13, 2021
Via arXiv

Which Algorithmic Choices Matter at Which Batch Sizes? Insights From a Noisy Quadratic Model

Jul 09, 2019
Via arXiv

Adversarial Robustness through Local Linearization

Jul 04, 2019
Via arXiv

Fast Convergence of Natural Gradient Descent for Overparameterized Neural Networks

May 27, 2019
Via arXiv