Sidak Pal Singh

ETH Zurich

Theoretical characterisation of the Gauss-Newton conditioning in Neural Networks

Nov 04, 2024
Via arXiv

What Does It Mean to Be a Transformer? Insights from a Theoretical Hessian Analysis

Oct 14, 2024
Via arXiv

Local vs Global continual learning

Jul 23, 2024
Via arXiv

Landscaping Linear Mode Connectivity

Jun 24, 2024
Via arXiv

Hallmarks of Optimization Trajectories in Neural Networks and LLMs: The Lengths, Bends, and Dead Ends

Mar 12, 2024
Via arXiv

Towards Meta-Pruning via Optimal Transport

Feb 13, 2024
Via arXiv

Rethinking Attention: Exploring Shallow Feed-Forward Neural Networks as an Alternative to Attention Layers in Transformers

Nov 29, 2023
Via arXiv

Transformer Fusion with Optimal Transport

Oct 15, 2023
Via arXiv

Towards guarantees for parameter isolation in continual learning

Oct 02, 2023
Via arXiv

On the curvature of the loss landscape

Jul 10, 2023
Via arXiv