Wadjih Bencheikh

Optimal Gradient Checkpointing for Sparse and Recurrent Architectures using Off-Chip Memory

Dec 16, 2024
Via arXiv