Bogdan Nicolae

Deep Optimizer States: Towards Scalable Training of Transformer Models Using Interleaved Offloading

Oct 26, 2024
Via arXiv

DataStates-LLM: Lazy Asynchronous Checkpointing for Large Language Models

Jun 15, 2024
Via arXiv

Understanding Patterns of Deep Learning Model Evolution in Network Architecture Search

Sep 22, 2023
Via arXiv