Peter Richtárik

King Abdullah University of Science and Technology

Methods with Local Steps and Random Reshuffling for Generally Smooth Non-Convex Federated Optimization

Dec 03, 2024

Pushing the Limits of Large Language Model Quantization via the Linearity Theorem

Nov 26, 2024

Error Feedback under $(L_0,L_1)$-Smoothness: Normalization and Momentum

Oct 22, 2024

Tighter Performance Theory of FedExProx

Oct 20, 2024

Unlocking FedNL: Self-Contained Compute-Optimized Implementation

Oct 11, 2024

Randomized Asymmetric Chain of LoRA: The First Meaningful Theoretical Framework for Low-Rank Adaptation

Oct 10, 2024

MindFlayer: Efficient Asynchronous Parallel SGD in the Presence of Heterogeneous and Random Worker Compute Times

Oct 05, 2024

On the Convergence of FedProx with Extrapolation and Inexact Prox

Oct 02, 2024

Cohort Squeeze: Beyond a Single Communication Round per Cohort in Cross-Device Federated Learning

Jun 03, 2024

Prune at the Clients, Not the Server: Accelerated Sparse Training in Federated Learning

May 31, 2024