Anna Choromanska

Outer-Momentum Restarting in High-Dimensional Two-Phase Optimization

May 27, 2026
Via arXiv

Worker Disagreement Reveals Sharp Directions in Local SGD

May 26, 2026
Via arXiv

Understanding Quantization of Optimizer States in LLM Pre-training: Dynamics of State Staleness and Effectiveness of State Resets

Mar 17, 2026
Via arXiv

Zero-Shot Cross-City Generalization in End-to-End Autonomous Driving: Self-Supervised versus Supervised Representations

Mar 12, 2026
Via arXiv

Self-Supervised JEPA-based World Models for LiDAR Occupancy Completion and Forecasting

Feb 13, 2026
Via arXiv

Streamlining Industrial Contract Management with Retrieval-Augmented LLMs

Nov 18, 2025
Via arXiv

Adaptive Memory Momentum via a Model-Based Framework for Deep Learning Optimization

Oct 06, 2025
Via arXiv

A Survey of Optimization Methods for Training DL Models: Theoretical Perspective on Convergence and Generalization

Jan 24, 2025
Via arXiv

AD-L-JEPA: Self-Supervised Spatial World Models with Joint Embedding Predictive Architecture for Autonomous Driving with LiDAR Data

Jan 09, 2025
Via arXiv

Adjacent Leader Decentralized Stochastic Gradient Descent

May 18, 2024
Via arXiv