Peter Richtárik

King Abdullah University of Science and Technology

On the Convergence of DP-SGD with Adaptive Clipping

Dec 27, 2024

Differentially Private Random Block Coordinate Descent

Dec 22, 2024

MARINA-P: Superior Performance in Non-smooth Federated Optimization with Adaptive Stepsizes

Dec 22, 2024

Methods with Local Steps and Random Reshuffling for Generally Smooth Non-Convex Federated Optimization

Dec 03, 2024

Pushing the Limits of Large Language Model Quantization via the Linearity Theorem

Nov 26, 2024

Error Feedback under $(L_0,L_1)$-Smoothness: Normalization and Momentum

Oct 22, 2024

Tighter Performance Theory of FedExProx

Oct 20, 2024

Unlocking FedNL: Self-Contained Compute-Optimized Implementation

Oct 11, 2024

Randomized Asymmetric Chain of LoRA: The First Meaningful Theoretical Framework for Low-Rank Adaptation

Oct 10, 2024

MindFlayer: Efficient Asynchronous Parallel SGD in the Presence of Heterogeneous and Random Worker Compute Times

Oct 05, 2024