Abstract: In this article, we consider the problem of unconstrained time-varying convex optimization, where the cost function changes with time. We provide an in-depth technical analysis of the problem and argue why freezing the cost at each time step and taking a finite number of steps toward the minimizer is not the best tracking solution for this problem. We propose a set of algorithms that, by taking into account the temporal variation of the cost, aim to reduce the tracking error of the time-varying minimizer of the problem. The main contribution of our work is that our proposed algorithms only require the first-order derivatives of the cost function with respect to the decision variable. This approach significantly reduces the computational cost compared to existing algorithms, which use the inverse of the Hessian of the cost. Specifically, the proposed algorithms reduce the computational cost from $O(n^3)$ to $O(n)$ per time step, where $n$ is the size of the decision variable. Avoiding the inverse of the Hessian also makes our algorithms applicable to non-convex optimization problems. We refer to these algorithms as $O(n)$-algorithms. These $O(n)$-algorithms are designed to solve the problem under different scenarios, depending on the available temporal information about the cost. We illustrate our results through various examples, including the solution of a model predictive control problem framed as a convex optimization problem with a streaming time-varying cost function.
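The sketch below (my own illustration, not the paper's algorithms) shows the kind of first-order temporal correction the abstract alludes to: on a toy time-varying quadratic, a plain gradient step on the frozen cost is compared against a step that adds a finite-difference estimate of how the gradient changes in time, which needs only two gradient evaluations, i.e., $O(n)$ work per time step and no Hessian inverse. The target trajectory, step sizes, and all names are assumptions made for the demo.

```python
# Minimal sketch (not the paper's algorithm): tracking the minimizer of the
# time-varying quadratic f(x, t) = 0.5 * ||x - r(t)||^2, whose minimizer is r(t).
import numpy as np

n, h, alpha, T = 5, 0.05, 0.5, 200           # dimension, sampling period, step size, steps
rng = np.random.default_rng(0)
w = rng.standard_normal(n)                   # frequencies of the moving target

def target(t):                               # time-varying minimizer r(t)
    return np.sin(w * t)

def grad(x, t):                              # first-order information only
    return x - target(t)

x_gd = np.zeros(n)                           # frozen-cost gradient descent
x_fo = np.zeros(n)                           # gradient descent + temporal correction
err_gd, err_fo = [], []

for k in range(1, T):
    t_now, t_prev = k * h, (k - 1) * h

    x_gd = x_gd - alpha * grad(x_gd, t_now)  # ignores how the cost drifts in time

    g1, g0 = grad(x_fo, t_now), grad(x_fo, t_prev)
    # correction term: finite-difference estimate of h * d/dt grad f(x, t),
    # obtained from two gradient evaluations (O(n) per time step)
    x_fo = x_fo - alpha * g1 - (g1 - g0)

    err_gd.append(np.linalg.norm(x_gd - target(t_now)))
    err_fo.append(np.linalg.norm(x_fo - target(t_now)))

print(f"mean tracking error, frozen GD : {np.mean(err_gd[50:]):.4f}")
print(f"mean tracking error, corrected : {np.mean(err_fo[50:]):.4f}")
```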
Abstract: Federated learning (FL) has gained considerable popularity for distributed machine learning due to its ability to preserve the privacy of participating agents by eliminating the need for data aggregation. Nevertheless, the communication cost between agents and the central server in FL is substantial in large-scale problems and remains a limiting factor for this approach. This paper introduces an innovative algorithm, called \emph{FedScalar}, within the federated learning framework, aimed at improving communication efficiency. Unlike traditional FL methods that require agents to send high-dimensional vectors to the server, \emph{FedScalar} enables agents to communicate updates using a single scalar. Each agent encodes its updated model parameters into a scalar through the inner product between its local update difference and a random vector, which is then transmitted to the server. The server decodes this information by projecting the averaged scalar values onto the random vector. Our method thereby significantly reduces communication overhead. Technically, we demonstrate that the proposed algorithm achieves a convergence rate of $O(1/\sqrt{K})$ to a stationary point for smooth, non-convex loss functions. Additionally, our analysis shows that altering the underlying distribution of the random vector generated by the server can reduce the variance during the aggregation step of the algorithm. Finally, we validate the performance and communication efficiency of our algorithm through numerical simulations.
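A minimal sketch of the scalar encode/decode round described in the abstract is given below. The local solver (one gradient step on a private least-squares problem), the constants, and all variable names are my assumptions; the paper's exact FedScalar update may differ.

```python
# Sketch of one-scalar-per-agent communication: encode a local update difference
# as an inner product with a shared random vector, decode by projecting the
# averaged scalar back onto that vector.
import numpy as np

rng = np.random.default_rng(1)
n_agents, dim, rounds, lr = 10, 50, 1000, 0.1

# each agent holds a private least-squares problem  min_w (1/m) ||A_i w - b_i||^2
A = [rng.standard_normal((20, dim)) for _ in range(n_agents)]
w_true = rng.standard_normal(dim)
b = [Ai @ w_true + 0.01 * rng.standard_normal(20) for Ai in A]

w = np.zeros(dim)                                    # global model held by the server
print("initial distance to w_true:", np.linalg.norm(w - w_true))

for _ in range(rounds):
    v = rng.standard_normal(dim)                     # per-round random vector
                                                     # (sharing a seed would suffice)
    scalars = []
    for Ai, bi in zip(A, b):
        grad_i = 2 * Ai.T @ (Ai @ w - bi) / len(bi)  # local first-order update
        delta_i = -lr * grad_i                       # local update difference
        scalars.append(delta_i @ v)                  # encode: one scalar per agent

    s_bar = np.mean(scalars)                         # server averages the scalars
    w = w + (s_bar / (v @ v)) * v                    # decode: project s_bar onto v

print("final distance to w_true:  ", np.linalg.norm(w - w_true))
```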
Abstract: Training a deep neural network using gradient-based methods requires the calculation of gradients at each layer. However, using backpropagation, or reverse-mode differentiation, to calculate these gradients necessitates significant memory consumption, rendering backpropagation an inefficient method for computing gradients. This paper focuses on analyzing the performance of the well-known Frank-Wolfe algorithm, a.k.a. the conditional gradient algorithm, when it only has access to the forward mode of automatic differentiation to compute gradients. We provide in-depth technical details showing that the proposed algorithm converges to the optimal solution at a sub-linear rate when given access to the noisy estimate of the true gradient obtained via the forward mode of automatic differentiation, referred to as the Projected Forward Gradient. In contrast, the standard Frank-Wolfe algorithm, when provided with access to the Projected Forward Gradient, fails to converge to the optimal solution. We demonstrate the convergence attributes of our proposed algorithm using a numerical example.
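The following sketch (my own illustration, on an assumed toy quadratic over the probability simplex) shows what a forward-gradient estimate looks like and contrasts feeding the raw estimates to Frank-Wolfe with feeding a running average of the estimates; the averaging weight used here is one common choice from the stochastic Frank-Wolfe literature and is not necessarily the modification analyzed in the paper.

```python
# Frank-Wolfe driven by "forward gradient" estimates (directional derivative
# times the random direction), once raw and once through a running average.
import numpy as np

rng = np.random.default_rng(2)
m, n, K = 30, 20, 3000
B = rng.standard_normal((m, n))
x_star = rng.dirichlet(np.ones(n))               # a point in the simplex
y = B @ x_star

f = lambda x: 0.5 * np.sum((B @ x - y) ** 2)
grad = lambda x: B.T @ (B @ x - y)               # used here only to form the directional
                                                 # derivative; in practice that scalar comes
                                                 # from one forward-mode AD (JVP) pass

def forward_gradient(x):
    v = rng.standard_normal(n)                   # random tangent direction
    jvp = grad(x) @ v                            # scalar directional derivative
    return jvp * v                               # unbiased estimate of grad f(x)

def lmo_simplex(d):                              # linear minimization oracle on the simplex
    s = np.zeros(n)
    s[np.argmin(d)] = 1.0
    return s

x_plain = np.ones(n) / n                         # vanilla FW on raw estimates
x_avg   = np.ones(n) / n                         # FW on averaged estimates
d_avg   = np.zeros(n)

for k in range(K):
    gamma = 2.0 / (k + 2)
    x_plain += gamma * (lmo_simplex(forward_gradient(x_plain)) - x_plain)

    rho = 4.0 / (k + 8) ** (2 / 3)               # averaging weight (one common choice)
    d_avg = (1 - rho) * d_avg + rho * forward_gradient(x_avg)
    x_avg += gamma * (lmo_simplex(d_avg) - x_avg)

print(f"f(x) with raw forward gradients     : {f(x_plain):.4f}")
print(f"f(x) with averaged forward gradients: {f(x_avg):.4f}")
print(f"f(x*)                               : {f(x_star):.4f}")
```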
Abstract: This paper proposes a set of novel optimization algorithms for solving a class of convex optimization problems with a time-varying streaming cost function. We develop an approach to track the optimal solution with a bounded error. Unlike existing results, our algorithms are executed using only the first-order derivatives of the cost function, which makes them computationally efficient for optimization with a time-varying cost function. We compare our algorithms to the gradient descent algorithm and show why gradient descent is not an effective solution for optimization problems with a time-varying cost. Several examples, including a model predictive control problem cast as a convex optimization problem with a streaming time-varying cost function, demonstrate our results.
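To make the comparison with gradient descent concrete, here is a standard textbook-style tracking argument (not taken from this paper): if the cost is strongly convex and smooth and its minimizer drifts by at most $\Delta$ per sampling interval, a frozen-cost gradient step can do no better than an error floor proportional to that drift.

```latex
% Standard tracking-error recursion (a generic argument, not this paper's analysis).
% Assume f(\cdot,t) is m-strongly convex and L-smooth, the step size satisfies
% 0 < \alpha \le 1/L, and the minimizer drifts by at most
% \Delta = \sup_k \|x^\star(t_{k+1}) - x^\star(t_k)\| per sampling interval.
\begin{align*}
\|x_{k+1} - x^\star(t_{k+1})\|
  &\le \|x_k - \alpha \nabla_x f(x_k, t_k) - x^\star(t_k)\|
     + \|x^\star(t_{k+1}) - x^\star(t_k)\| \\
  &\le \rho\, \|x_k - x^\star(t_k)\| + \Delta,
  \qquad \rho = \max\{|1-\alpha m|,\, |1-\alpha L|\} < 1, \\
\limsup_{k\to\infty} \|x_k - x^\star(t_k)\| &\le \frac{\Delta}{1-\rho}.
\end{align*}
% The error floor is proportional to the drift \Delta; exploiting temporal
% information about the cost is what allows this floor to be reduced.
```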