Harsh Mishra

Accelerated Neural Network Training with Rooted Logistic Objectives

Oct 05, 2023

Zhu Wang, Praveen Raj Veluswami, Harsh Mishra, Sathya N. Ravi

Abstract: Many neural networks deployed in real-world scenarios are trained using cross-entropy-based loss functions. From the optimization perspective, it is known that the behavior of first-order methods such as gradient descent crucially depends on the separability of datasets. In fact, even in the simplest case of binary classification, the rate of convergence depends on two factors: (1) the condition number of the data matrix, and (2) the separability of the dataset. With no further pre-processing techniques such as over-parametrization, data augmentation, etc., separability is an intrinsic quantity of the data distribution under consideration. We focus on the landscape design of the logistic function and derive a novel sequence of strictly convex functions that are at least as strict as logistic loss. The minimizers of these functions coincide with those of the minimum norm solution wherever possible. The strict convexity of the derived function can be extended to finetune state-of-the-art models and applications. In our experiments, we apply the proposed rooted logistic objective to multiple deep models, e.g., fully-connected neural networks and transformers, on various classification benchmarks. Our results illustrate that training with the rooted loss function converges faster and improves performance. Furthermore, we illustrate applications of our novel rooted loss function in generative-modeling-based downstream applications, such as finetuning a StyleGAN model with the rooted loss. The code implementing our losses and models can be found here for open source software development purposes: https://anonymous.4open.science/r/rooted_loss.
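The abstract does not state the functional form of the rooted objective. As a rough illustration only, the sketch below assumes a k-th root surrogate, k * ((1 + exp(-m))^(1/k) - 1), built from the standard limit log(x) = lim k * (x^(1/k) - 1). The function names, the parameter `k`, and the form itself are assumptions made here, not the authors' definition. Under this assumption the surrogate is strictly convex in the margin `m` and upper-bounds the logistic loss, approaching it as `k` grows.

```python
import math


def logistic_loss(margin: float) -> float:
    """Standard logistic loss log(1 + exp(-margin)), computed stably."""
    # Branch on the sign so exp() never receives a large positive argument.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


def rooted_logistic_loss(margin: float, k: float = 4.0) -> float:
    """Hypothetical rooted surrogate: k * ((1 + exp(-margin))**(1/k) - 1).

    Assumed form, not taken from the paper. It uses the identity
    (1 + exp(-m))**(1/k) == exp(logistic_loss(m) / k).
    """
    return k * math.expm1(logistic_loss(margin) / k)


if __name__ == "__main__":
    for m in (-2.0, 0.0, 2.0):
        print(m, logistic_loss(m), rooted_logistic_loss(m, k=4.0))
```

Because exp(y) - 1 >= y, the assumed surrogate is never smaller than the logistic loss, and the gap shrinks roughly like 1/k.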

Flag Aggregator: Scalable Distributed Training under Failures and Augmented Losses using Convex Optimization

Feb 12, 2023

Hamidreza Almasi, Harsh Mishra, Balajee Vamanan, Sathya N. Ravi

Abstract: Modern ML applications increasingly rely on complex deep learning models and large datasets. There has been an exponential growth in the amount of computation needed to train the largest models. Therefore, to scale computation and data, these models are inevitably trained in a distributed manner in clusters of nodes, and their updates are aggregated before being applied to the model. However, a distributed setup is prone to Byzantine failures of individual nodes, components, and software. With data augmentation added to these settings, there is a critical need for robust and efficient aggregation systems. We extend the current state-of-the-art aggregators and propose an optimization-based subspace estimator that models pairwise distances as quadratic functions, using the recently introduced Flag Median problem. The estimator in our loss function favors the pairs that preserve the norm of the difference vector. We theoretically show that our approach enhances the robustness of state-of-the-art Byzantine-resilient aggregators. We also evaluate our method on different tasks in a distributed setup with a parameter server architecture and show its communication efficiency while maintaining similar accuracy. The code is publicly available at https://github.com/hamidralmasi/FlagAggregator
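To make the aggregation problem concrete, here is a minimal sketch of why a plain average is fragile under Byzantine workers. It uses the classic coordinate-wise median as the robust reducer, which is a stand-in chosen for brevity and is not the Flag Aggregator described above. The worker updates and the `aggregate` helper are hypothetical.

```python
from statistics import mean, median


def aggregate(updates, reducer):
    """Reduce a list of worker updates coordinate by coordinate."""
    # zip(*updates) groups the i-th coordinate of every worker together.
    return [reducer(coords) for coords in zip(*updates)]


# Three honest workers roughly agree; one Byzantine worker sends garbage.
honest = [[1.0, -2.0], [1.1, -1.9], [0.9, -2.1]]
byzantine = [[1000.0, 1000.0]]
updates = honest + byzantine

mean_update = aggregate(updates, mean)      # dragged far away by the outlier
median_update = aggregate(updates, median)  # stays near the honest updates

if __name__ == "__main__":
    print("mean:  ", mean_update)
    print("median:", median_update)
```

A single corrupted update moves the mean arbitrarily far, while the median stays within the range of the honest values as long as honest workers form a majority.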

Using Intermediate Forward Iterates for Intermediate Generator Optimization

Feb 05, 2023

Harsh Mishra, Jurijs Nazarovs, Manmohan Dogra, Sathya N. Ravi

Abstract: Score-based models have recently been introduced as a richer framework to model distributions in high dimensions and are generally more suitable for generative tasks. In score-based models, a generative task is formulated using a parametric model (such as a neural network) to directly learn the gradient of such high-dimensional distributions, instead of the density functions themselves, as is done traditionally. From the mathematical point of view, such gradient information can be utilized in reverse by stochastic sampling to generate diverse samples. However, from a computational perspective, existing score-based models can be efficiently trained only if the forward or the corruption process can be computed in closed form. By using the relationship between the process and layers in a feed-forward network, we derive a backpropagation-based procedure, which we call Intermediate Generator Optimization (IGO), to utilize intermediate iterates of the process with negligible computational overhead. The main advantage of IGO is that it can be incorporated into any standard autoencoder pipeline for the generative task. We analyze the sample complexity properties of IGO to solve downstream tasks like Generative PCA. We show applications of IGO on two dense predictive tasks, namely image extrapolation and point cloud denoising. Our experiments indicate that obtaining an ensemble of generators for various time points is possible using first-order methods.
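As an illustration of what a forward process "computed in closed form" can look like, the sketch below assumes a standard variance-preserving Gaussian corruption, x_t = sqrt(1 - beta_t) * x_{t-1} + sqrt(beta_t) * noise. This process, the noise schedule, and the function names are assumptions chosen for the example and are not taken from the paper. The code tracks the scale applied to the clean sample and the accumulated noise variance, and checks that stepping through every intermediate iterate agrees with the one-shot formula.

```python
import math


def iterate_forward(betas):
    """Step through the corruption process one iterate at a time."""
    scale, var = 1.0, 0.0  # x_0 is untouched: full signal, no noise
    for beta in betas:
        scale *= math.sqrt(1.0 - beta)   # signal shrinks at each step
        var = (1.0 - beta) * var + beta  # old noise shrinks, new noise is added
    return scale, var


def closed_form(betas):
    """Jump straight to the final iterate using the cumulative product."""
    alpha_bar = math.prod(1.0 - beta for beta in betas)
    return math.sqrt(alpha_bar), 1.0 - alpha_bar


if __name__ == "__main__":
    # Hypothetical linear noise schedule with 100 steps.
    betas = [1e-4 + i * (0.02 - 1e-4) / 99 for i in range(100)]
    print(iterate_forward(betas))
    print(closed_form(betas))
```

When no such shortcut exists, every intermediate iterate has to be computed explicitly, which is the setting the abstract's backpropagation-based procedure targets.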
