Sanjay Lall

LORE: Lagrangian-Optimized Robust Embeddings for Visual Encoders

May 24, 2025

Borna Khodabandeh, Amirabbas Afzali, Amirhossein Afsharrad, Seyed Shahabeddin Mousavi, Sanjay Lall, Sajjad Amini, Seyed-Mohsen Moosavi-Dezfooli

Abstract:Visual encoders have become fundamental components in modern computer vision pipelines. However, ensuring robustness against adversarial perturbations remains a critical challenge. Recent efforts have explored both supervised and unsupervised adversarial fine-tuning strategies. We identify two key limitations in these approaches: (i) they often suffer from instability, especially during the early stages of fine-tuning, resulting in suboptimal convergence and degraded performance on clean data, and (ii) they exhibit a suboptimal trade-off between robustness and clean data accuracy, hindering the simultaneous optimization of both objectives. To overcome these challenges, we propose Lagrangian-Optimized Robust Embeddings (LORE), a novel unsupervised adversarial fine-tuning framework. LORE utilizes constrained optimization, which offers a principled approach to balancing competing goals, such as improving robustness while preserving nominal performance. By enforcing embedding-space proximity constraints, LORE effectively maintains clean data performance throughout adversarial fine-tuning. Extensive experiments show that LORE significantly improves zero-shot adversarial robustness with minimal degradation in clean data accuracy. Furthermore, we demonstrate the effectiveness of the adversarially fine-tuned CLIP image encoder in out-of-distribution generalization and enhancing the interpretability of image embeddings.
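The constrained-optimization idea mentioned in the abstract can be illustrated on a toy problem. The following is a minimal sketch, not the LORE objective: it assumes a hypothetical scalar problem, minimize (x - 2)^2 subject to x - 1 <= 0, and runs plain gradient descent-ascent on the Lagrangian. The function name `primal_dual_toy`, the step size, and the iteration count are illustrative choices.

```python
def primal_dual_toy(steps=5000, lr=0.05):
    """Gradient descent-ascent on L(x, lam) = (x - 2)**2 + lam * (x - 1).

    Toy stand-in for a constrained problem: the objective plays the role of
    one goal, the constraint x - 1 <= 0 plays the role of the other.
    """
    x, lam = 0.0, 0.0
    for _ in range(steps):
        # Primal step: descend the Lagrangian in x.
        x -= lr * (2.0 * (x - 2.0) + lam)
        # Dual step: ascend in lam, projected onto lam >= 0.
        lam = max(0.0, lam + lr * (x - 1.0))
    return x, lam


x_star, lam_star = primal_dual_toy()
print(x_star, lam_star)
```

The multiplier grows only while the constraint is violated, so the trade-off weight is adapted during optimization rather than fixed in advance. On this toy problem the iterates approach the constrained optimum x = 1 with multiplier 2.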

One Goal, Many Challenges: Robust Preference Optimization Amid Content-Aware and Multi-Source Noise

Mar 16, 2025

Amirabbas Afzali, Amirhossein Afsharrad, Seyed Shahabeddin Mousavi, Sanjay Lall

Abstract:Large Language Models (LLMs) have made significant strides in generating human-like responses, largely due to preference alignment techniques. However, these methods often assume unbiased human feedback, which is rarely the case in real-world scenarios. This paper introduces Content-Aware Noise-Resilient Preference Optimization (CNRPO), a novel framework that addresses multiple sources of content-dependent noise in preference learning. CNRPO employs a multi-objective optimization approach to separate true preferences from content-aware noises, effectively mitigating their impact. We leverage backdoor attack mechanisms to efficiently learn and control various noise sources within a single model. Theoretical analysis and extensive experiments on different synthetic noisy datasets demonstrate that CNRPO significantly improves alignment with primary human preferences while controlling for secondary noises and biases, such as response length and harmfulness.

Cooperative Multi-Agent Constrained Stochastic Linear Bandits

Oct 22, 2024

Amirhossein Afsharrad, Parisa Oftadeh, Ahmadreza Moradipari, Sanjay Lall

Abstract:In this study, we explore a collaborative multi-agent stochastic linear bandit setting involving a network of $N$ agents that communicate locally to minimize their collective regret while keeping their expected cost under a specified threshold $\tau$. Each agent encounters a distinct linear bandit problem characterized by its own reward and cost parameters, i.e., local parameters. The goal of the agents is to determine the best overall action corresponding to the average of these parameters, the so-called global parameters. In each round, an agent is randomly chosen to select an action based on its current knowledge of the system. This chosen action is then executed by all agents, after which they observe their individual rewards and costs. We propose a safe distributed upper confidence bound algorithm, called MA-OPLB, and establish a high-probability bound on its $T$-round regret. MA-OPLB utilizes an accelerated consensus method, in which agents compute an estimate of the average rewards and costs across the network by communicating the proper information with their neighbors. We show that our regret bound is of order $\mathcal{O}\left(\frac{d}{\tau-c_0}\frac{\log(NT)^2}{\sqrt{N}}\sqrt{\frac{T}{\log(1/|\lambda_2|)}}\right)$, where $\lambda_2$ is the second largest (in absolute value) eigenvalue of the communication matrix, and $\tau-c_0$ is the known cost gap of a feasible action. We also experimentally show the performance of our proposed algorithm in different network structures.

Generalizable Spacecraft Trajectory Generation via Multimodal Learning with Transformers

Oct 15, 2024

Davide Celestini, Amirhossein Afsharrad, Daniele Gammelli, Tommaso Guffanti, Gioele Zardini, Sanjay Lall, Elisa Capello, Simone D'Amico, Marco Pavone

Abstract:Effective trajectory generation is essential for reliable on-board spacecraft autonomy. Among other approaches, learning-based warm-starting represents an appealing paradigm for solving the trajectory generation problem, effectively combining the benefits of optimization- and data-driven methods. Current approaches for learning-based trajectory generation often focus on fixed, single-scenario environments, where key scene characteristics, such as obstacle positions or final-time requirements, remain constant across problem instances. However, practical trajectory generation requires the scenario to be frequently reconfigured, making the single-scenario approach a potentially impractical solution. To address this challenge, we present a novel trajectory generation framework that generalizes across diverse problem configurations, by leveraging high-capacity transformer neural networks capable of learning from multimodal data sources. Specifically, our approach integrates transformer-based neural network models into the trajectory optimization process, encoding both scene-level information (e.g., obstacle locations, initial and goal states) and trajectory-level constraints (e.g., time bounds, fuel consumption targets) via multimodal representations. The transformer network then generates near-optimal initial guesses for non-convex optimization problems, significantly enhancing convergence speed and performance. The framework is validated through extensive simulations and real-world experiments on a free-flyer platform, achieving up to 30% cost improvement and 80% reduction in infeasible cases with respect to traditional approaches, and demonstrating robust generalization across diverse scenario variations.

* 8 pages, 6 figures, submitted to 2025 American Control Conference (ACC)

Learning Temporal Logic Predicates from Data with Statistical Guarantees

Jun 15, 2024

Emi Soroka, Rohan Sinha, Sanjay Lall

Abstract:Temporal logic rules are often used in control and robotics to provide structured, human-interpretable descriptions of high-dimensional trajectory data. These rules have numerous applications including safety validation using formal methods, constraining motion planning among autonomous agents, and classifying data. However, existing methods for learning temporal logic predicates from data provide no assurances about the correctness of the resulting predicate. We present a novel method to learn temporal logic predicates from data with finite-sample correctness guarantees. Our approach leverages expression optimization and conformal prediction to learn predicates that correctly describe future trajectories under mild assumptions with a user-defined confidence level. We provide experimental results showing the performance of our approach on a simulated trajectory dataset and perform ablation studies to understand how each component of our algorithm contributes to its performance.
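For readers unfamiliar with conformal prediction, the calibration step it relies on can be sketched as follows. This is a generic split-conformal quantile computation, not the authors' algorithm; the function name `conformal_threshold` and the sample scores are hypothetical, and the scores are assumed to be exchangeable with the score of a future sample.

```python
import math


def conformal_threshold(scores, alpha):
    """Return the split-conformal threshold at miscoverage level alpha.

    With n calibration scores, the threshold is the k-th smallest score,
    k = ceil((n + 1) * (1 - alpha)). If k exceeds n, no finite threshold
    gives the requested coverage, so infinity is returned.
    """
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return float("inf")
    return sorted(scores)[k - 1]


# Hypothetical calibration scores (e.g. per-trajectory violation margins).
threshold = conformal_threshold([3.0, 1.0, 4.0, 2.0], alpha=0.5)
print(threshold)
```

Under the exchangeability assumption, a new score falls at or below the returned threshold with probability at least 1 - alpha, which is the kind of finite-sample, user-specified confidence level the abstract refers to.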

Adversarial Training of Two-Layer Polynomial and ReLU Activation Networks via Convex Optimization

May 22, 2024

Daniel Kuelbs, Sanjay Lall, Mert Pilanci

Abstract:Training neural networks that are robust to adversarial attacks remains an important problem in deep learning, especially as heavily overparameterized models are adopted in safety-critical settings. Drawing from recent work that reformulates the training problems for two-layer ReLU and polynomial activation networks as convex programs, we devise a convex semidefinite program (SDP) for adversarial training of polynomial activation networks via the S-procedure. We also derive a convex SDP to compute the minimum distance from a correctly classified example to the decision boundary of a polynomial activation network. Adversarial training for two-layer ReLU activation networks has been explored in the literature, but, in contrast to prior work, we present a scalable approach that is compatible with standard machine learning libraries and GPU acceleration. The adversarial training SDP for polynomial activation networks leads to large increases in robust test accuracy against $\ell^\infty$ attacks on the Breast Cancer Wisconsin dataset from the UCI Machine Learning Repository. For two-layer ReLU networks, we leverage our scalable implementation to retrain the final two fully connected layers of a Pre-Activation ResNet-18 model on the CIFAR-10 dataset. Our 'robustified' model achieves higher clean and robust test accuracies than the same architecture trained with sharpness-aware minimization.

* 6 pages, 4 figures

Markov Decision Processes with Noisy State Observation

Dec 13, 2023

Amirhossein Afsharrad, Sanjay Lall

Abstract:This paper addresses the challenge of a particular class of noisy state observations in Markov Decision Processes (MDPs), a common issue in various real-world applications. We focus on modeling this uncertainty through a confusion matrix that captures the probabilities of misidentifying the true state. Our primary goal is to estimate the inherent measurement noise, and to this end, we propose two novel algorithmic approaches. The first, the method of second-order repetitive actions, is designed for efficient noise estimation within a finite time window, providing identifiable conditions for system analysis. The second approach comprises a family of Bayesian algorithms, which we thoroughly analyze and compare in terms of performance and limitations. We substantiate our theoretical findings with simulations, demonstrating the effectiveness of our methods in different scenarios, particularly highlighting their behavior in environments with varying stationary distributions. Our work advances the understanding of reinforcement learning in noisy environments, offering robust techniques for more accurate state estimation in MDPs.
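The confusion-matrix noise model described above can be made concrete with a small sketch. It assumes a hypothetical two-state example in which entry `confusion[s][o]` is the probability of observing state `o` when the true state is `s`; the function name and the numbers are illustrative and do not come from the paper.

```python
def observed_distribution(true_dist, confusion):
    """Push a distribution over true states through a confusion matrix.

    true_dist[s] is the probability of true state s, and confusion[s][o]
    is the probability of observing o given true state s (rows sum to 1).
    """
    num_obs = len(confusion[0])
    return [
        sum(true_dist[s] * confusion[s][o] for s in range(len(true_dist)))
        for o in range(num_obs)
    ]


# Hypothetical example: state 0 is misread 10% of the time, state 1 20%.
confusion = [[0.9, 0.1], [0.2, 0.8]]
obs = observed_distribution([0.5, 0.5], confusion)
print(obs)
```

The observed distribution is a mixture of the confusion-matrix rows, which is why the distribution over true states (for example, a stationary distribution) affects how much can be learned about the noise.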

Convex Methods for Constrained Linear Bandits

Nov 10, 2023

Amirhossein Afsharrad, Ahmadreza Moradipari, Sanjay Lall

Abstract:Recently, bandit optimization has received significant attention in real-world safety-critical systems that involve repeated interactions with humans. While various algorithms with performance guarantees exist in the literature, their practical implementation has not received as much attention. This work presents a comprehensive study of the computational aspects of safe bandit algorithms, specifically safe linear bandits, by introducing a framework that leverages convex programming tools to create computationally efficient policies. In particular, we first characterize the properties of the optimal policy for the safe linear bandit problem and then propose an end-to-end pipeline of safe linear bandit algorithms that only involves solving convex problems. We also numerically evaluate the performance of our proposed methods.

Shape-Based Approach to Household Load Curve Clustering and Prediction

Feb 05, 2017

Thanchanok Teeraratkul, Daniel O'Neill, Sanjay Lall

Abstract:Consumer Demand Response (DR) is an important research and industry problem, which seeks to categorize, predict and modify consumers' energy consumption. Unfortunately, traditional clustering methods have resulted in many hundreds of clusters, with a given consumer often associated with several clusters, making it difficult to classify consumers into stable representative groups and to predict individual energy consumption patterns. In this paper, we present a shape-based approach that better classifies and predicts consumer energy consumption behavior at the household level. The method is based on Dynamic Time Warping (DTW), which seeks an optimal alignment between energy consumption patterns reflecting the effect of hidden patterns of regular consumer behavior. Using real consumer 24-hour load curves from Opower Corporation, our method results in a 50% reduction in the number of representative groups and an improvement in prediction accuracy measured under DTW distance. We extend the approach to estimate which electrical devices will be used and in which hours.
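Dynamic Time Warping itself is a standard dynamic program, and a minimal sketch may help show why it suits shape-based comparison. The version below is the textbook unconstrained form with an absolute-difference local cost; the paper's exact variant (window constraints, normalization) is not stated in the abstract, so those details are assumptions, as are the function name and the toy load curves.

```python
def dtw_distance(a, b):
    """Textbook dynamic time warping distance between two sequences.

    cost[i][j] holds the minimal cumulative cost of aligning a[:i] with
    b[:j], using |a_i - b_j| as the local cost. Runs in O(len(a) * len(b)).
    """
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: insertion, deletion, or match.
            cost[i][j] = local + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]


# Two hypothetical load curves with the same peak shifted by one hour.
early_peak = [0, 1, 0, 0]
late_peak = [0, 0, 1, 0]
print(dtw_distance(early_peak, late_peak))
```

A pointwise comparison of these two curves would report a difference at two time steps, whereas DTW aligns the peaks and treats the curves as having the same shape.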

* 14 pages, submitted to a transaction
