Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Andrew Silva

Shared Autonomy for Proximal Teaching

Feb 27, 2025

Megha Srivastava, Reihaneh Iranmanesh, Yuchen Cui, Deepak Gopinath, Emily Sumner, Andrew Silva, Laporsha Dees, Guy Rosman, Dorsa Sadigh

Abstract:Motor skill learning often requires experienced professionals who can provide personalized instruction. Unfortunately, the availability of high-quality training can be limited for specialized tasks, such as high performance racing. Several recent works have leveraged AI-assistance to improve instruction of tasks ranging from rehabilitation to surgical robot tele-operation. However, these works often make simplifying assumptions on the student learning process, and fail to model how a teacher's assistance interacts with different individuals' abilities when determining optimal teaching strategies. Inspired by the idea of scaffolding from educational psychology, we leverage shared autonomy, a framework for combining user inputs with robot autonomy, to aid with curriculum design. Our key insight is that the way a student's behavior improves in the presence of assistance from an autonomous agent can highlight which sub-skills might be most ``learnable'' for the student, or within their Zone of Proximal Development. We use this to design Z-COACH, a method for using shared autonomy to provide personalized instruction targeting interpretable task sub-skills. In a user study (n=50), where we teach high performance racing in a simulated environment of the Thunderhill Raceway Park with the CARLA Autonomous Driving simulator, we show that Z-COACH helps identify which skills each student should first practice, leading to an overall improvement in driving time, behavior, and smoothness. Our work shows that increasingly available semi-autonomous capabilities (e.g. in vehicles, robots) can not only assist human users, but also help *teach* them.

* Accepted to ACM/IEEE International Conference on Human-Robot Interaction, 2025

Via

Access Paper or Ask Questions

Dreaming to Assist: Learning to Align with Human Objectives for Shared Control in High-Speed Racing

Oct 14, 2024

Jonathan DeCastro, Andrew Silva, Deepak Gopinath, Emily Sumner, Thomas M. Balch, Laporsha Dees, Guy Rosman

Abstract:Tight coordination is required for effective human-robot teams in domains involving fast dynamics and tactical decisions, such as multi-car racing. In such settings, robot teammates must react to cues of a human teammate's tactical objective to assist in a way that is consistent with the objective (e.g., navigating left or right around an obstacle). To address this challenge, we present Dream2Assist, a framework that combines a rich world model able to infer human objectives and value functions, and an assistive agent that provides appropriate expert assistance to a given human teammate. Our approach builds on a recurrent state space model to explicitly infer human intents, enabling the assistive agent to select actions that align with the human and enabling a fluid teaming interaction. We demonstrate our approach in a high-speed racing domain with a population of synthetic human drivers pursuing mutually exclusive objectives, such as "stay-behind" and "overtake". We show that the combined human-robot team, when blending its actions with those of the human, outperforms the synthetic humans alone as well as several baseline assistance strategies, and that intent-conditioning enables adherence to human preferences during task execution, leading to improved performance while satisfying the human's objective.

* Accepted to CoRL 2024, Munich, Germany

Via

Access Paper or Ask Questions

Interpretable Reinforcement Learning for Robotics and Continuous Control

Nov 16, 2023

Rohan Paleja, Letian Chen, Yaru Niu, Andrew Silva, Zhaoxin Li, Songan Zhang, Chace Ritchie, Sugju Choi, Kimberlee Chestnut Chang, Hongtei Eric Tseng(+3 more)

Abstract:Interpretability in machine learning is critical for the safe deployment of learned policies across legally-regulated and safety-critical domains. While gradient-based approaches in reinforcement learning have achieved tremendous success in learning policies for continuous control problems such as robotics and autonomous driving, the lack of interpretability is a fundamental barrier to adoption. We propose Interpretable Continuous Control Trees (ICCTs), a tree-based model that can be optimized via modern, gradient-based, reinforcement learning approaches to produce high-performing, interpretable policies. The key to our approach is a procedure for allowing direct optimization in a sparse decision-tree-like representation. We validate ICCTs against baselines across six domains, showing that ICCTs are capable of learning policies that parity or outperform baselines by up to 33% in autonomous driving scenarios while achieving a 300x-600x reduction in the number of parameters against deep learning baselines. We prove that ICCTs can serve as universal function approximators and display analytically that ICCTs can be verified in linear time. Furthermore, we deploy ICCTs in two realistic driving domains, based on interstate Highway-94 and 280 in the US. Finally, we verify ICCT's utility with end-users and find that ICCTs are rated easier to simulate, quicker to validate, and more interpretable than neural networks.

* arXiv admin note: text overlap with arXiv:2202.02352

Via

Access Paper or Ask Questions

FedPC: Federated Learning for Language Generation with Personal and Context Preference Embeddings

Oct 07, 2022

Andrew Silva, Pradyumna Tambwekar, Matthew Gombolay

Figure 1 for FedPC: Federated Learning for Language Generation with Personal and Context Preference Embeddings

Figure 2 for FedPC: Federated Learning for Language Generation with Personal and Context Preference Embeddings

Figure 3 for FedPC: Federated Learning for Language Generation with Personal and Context Preference Embeddings

Figure 4 for FedPC: Federated Learning for Language Generation with Personal and Context Preference Embeddings

Abstract:Federated learning is a training paradigm that learns from multiple distributed users without aggregating data on a centralized server. Such a paradigm promises the ability to deploy machine-learning at-scale to a diverse population of end-users without first collecting a large, labeled dataset for all possible tasks. As federated learning typically averages learning updates across a decentralized population, there is a growing need for personalization of federated learning systems (i.e conversational agents must be able to personalize to a specific user's preferences). In this work, we propose a new direction for personalization research within federated learning, leveraging both personal embeddings and shared context embeddings. We also present an approach to predict these ``preference'' embeddings, enabling personalization without backpropagation. Compared to state-of-the-art personalization baselines, our approach achieves a 50\% improvement in test-time perplexity using 0.001\% of the memory required by baseline approaches, and achieving greater sample- and compute-efficiency.

* Andrew Silva and Pradyumna Tambwekar contributed equally towards this work

Via

Access Paper or Ask Questions

Multi-UAV Planning for Cooperative Wildfire Coverage and Tracking with Quality-of-Service Guarantees

Jun 21, 2022

Esmaeil Seraj, Andrew Silva, Matthew Gombolay

Abstract:In recent years, teams of robot and Unmanned Aerial Vehicles (UAVs) have been commissioned by researchers to enable accurate, online wildfire coverage and tracking. While the majority of prior work focuses on the coordination and control of such multi-robot systems, to date, these UAV teams have not been given the ability to reason about a fire's track (i.e., location and propagation dynamics) to provide performance guarantee over a time horizon. Motivated by the problem of aerial wildfire monitoring, we propose a predictive framework which enables cooperation in multi-UAV teams towards collaborative field coverage and fire tracking with probabilistic performance guarantee. Our approach enables UAVs to infer the latent fire propagation dynamics for time-extended coordination in safety-critical conditions. We derive a set of novel, analytical temporal, and tracking-error bounds to enable the UAV-team to distribute their limited resources and cover the entire fire area according to the case-specific estimated states and provide a probabilistic performance guarantee. Our results are not limited to the aerial wildfire monitoring case-study and are generally applicable to problems, such as search-and-rescue, target tracking and border patrol. We evaluate our approach in simulation and provide demonstrations of the proposed framework on a physical multi-robot testbed to account for real robot dynamics and restrictions. Our quantitative evaluations validate the performance of our method accumulating 7.5x and 9.0x smaller tracking-error than state-of-the-art model-based and reinforcement learning benchmarks, respectively.

* To appear in the journal of Autonomous Agents and Multi-Agent Systems (AAMAS)

Via

Access Paper or Ask Questions

FedEmbed: Personalized Private Federated Learning

Feb 18, 2022

Andrew Silva, Katherine Metcalf, Nicholas Apostoloff, Barry-John Theobald

Figure 1 for FedEmbed: Personalized Private Federated Learning

Figure 2 for FedEmbed: Personalized Private Federated Learning

Figure 3 for FedEmbed: Personalized Private Federated Learning

Figure 4 for FedEmbed: Personalized Private Federated Learning

Abstract:Federated learning enables the deployment of machine learning to problems for which centralized data collection is impractical. Adding differential privacy guarantees bounds on privacy while data are contributed to a global model. Adding personalization to federated learning introduces new challenges as we must account for preferences of individual users, where a data sample could have conflicting labels because one sub-population of users might view an input positively, but other sub-populations view the same input negatively. We present FedEmbed, a new approach to private federated learning for personalizing a global model that uses (1) sub-populations of similar users, and (2) personal embeddings. We demonstrate that current approaches to federated learning are inadequate for handling data with conflicting labels, and we show that FedEmbed achieves up to 45% improvement over baseline approaches to personalized private federated learning.

* 15 pages

Via

Access Paper or Ask Questions

Learning Interpretable, High-Performing Policies for Continuous Control Problems

Feb 04, 2022

Rohan Paleja, Yaru Niu, Andrew Silva, Chace Ritchie, Sugju Choi, Matthew Gombolay

Figure 1 for Learning Interpretable, High-Performing Policies for Continuous Control Problems

Figure 2 for Learning Interpretable, High-Performing Policies for Continuous Control Problems

Figure 3 for Learning Interpretable, High-Performing Policies for Continuous Control Problems

Figure 4 for Learning Interpretable, High-Performing Policies for Continuous Control Problems

Abstract:Gradient-based approaches in reinforcement learning (RL) have achieved tremendous success in learning policies for continuous control problems. While the performance of these approaches warrants real-world adoption in domains, such as in autonomous driving and robotics, these policies lack interpretability, limiting deployability in safety-critical and legally-regulated domains. Such domains require interpretable and verifiable control policies that maintain high performance. We propose Interpretable Continuous Control Trees (ICCTs), a tree-based model that can be optimized via modern, gradient-based, RL approaches to produce high-performing, interpretable policies. The key to our approach is a procedure for allowing direct optimization in a sparse decision-tree-like representation. We validate ICCTs against baselines across six domains, showing that ICCTs are capable of learning interpretable policy representations that parity or outperform baselines by up to 33$\%$ in autonomous driving scenarios while achieving a $300$x-$600$x reduction in the number of policy parameters against deep learning baselines.

Via

Access Paper or Ask Questions

Multimodal Punctuation Prediction with Contextual Dropout

Feb 12, 2021

Andrew Silva, Barry-John Theobald, Nicholas Apostoloff

Figure 1 for Multimodal Punctuation Prediction with Contextual Dropout

Figure 2 for Multimodal Punctuation Prediction with Contextual Dropout

Figure 3 for Multimodal Punctuation Prediction with Contextual Dropout

Figure 4 for Multimodal Punctuation Prediction with Contextual Dropout

Abstract:Automatic speech recognition (ASR) is widely used in consumer electronics. ASR greatly improves the utility and accessibility of technology, but usually the output is only word sequences without punctuation. This can result in ambiguity in inferring user-intent. We first present a transformer-based approach for punctuation prediction that achieves 8% improvement on the IWSLT 2012 TED Task, beating the previous state of the art [1]. We next describe our multimodal model that learns from both text and audio, which achieves 8% improvement over the text-only algorithm on an internal dataset for which we have both the audio and transcriptions. Finally, we present an approach to learning a model using contextual dropout that allows us to handle variable amounts of future context at test time.

* Accepted for publication at ICASSP 2021

Via

Access Paper or Ask Questions

Interpretable Policy Specification and Synthesis through Natural Language and RL

Jan 18, 2021

Pradyumna Tambwekar, Andrew Silva, Nakul Gopalan, Matthew Gombolay

Figure 1 for Interpretable Policy Specification and Synthesis through Natural Language and RL

Figure 2 for Interpretable Policy Specification and Synthesis through Natural Language and RL

Figure 3 for Interpretable Policy Specification and Synthesis through Natural Language and RL

Figure 4 for Interpretable Policy Specification and Synthesis through Natural Language and RL

Abstract:Policy specification is a process by which a human can initialize a robot's behaviour and, in turn, warm-start policy optimization via Reinforcement Learning (RL). While policy specification/design is inherently a collaborative process, modern methods based on Learning from Demonstration or Deep RL lack the model interpretability and accessibility to be classified as such. Current state-of-the-art methods for policy specification rely on black-box models, which are an insufficient means of collaboration for non-expert users: These models provide no means of inspecting policies learnt by the agent and are not focused on creating a usable modality for teaching robot behaviour. In this paper, we propose a novel machine learning framework that enables humans to 1) specify, through natural language, interpretable policies in the form of easy-to-understand decision trees, 2) leverage these policies to warm-start reinforcement learning and 3) outperform baselines that lack our natural language initialization mechanism. We train our approach by collecting a first-of-its-kind corpus mapping free-form natural language policy descriptions to decision tree-based policies. We show that our novel framework translates natural language to decision trees with a 96% and 97% accuracy on a held-out corpus across two domains, respectively. Finally, we validate that policies initialized with natural language commands are able to significantly outperform relevant baselines (p < 0.001) that do not benefit from our natural language-based warm-start technique.

Via

Access Paper or Ask Questions

Using Cross-Loss Influence Functions to Explain Deep Network Representations

Dec 03, 2020

Andrew Silva, Rohit Chopra, Matthew Gombolay

Figure 1 for Using Cross-Loss Influence Functions to Explain Deep Network Representations

Figure 2 for Using Cross-Loss Influence Functions to Explain Deep Network Representations

Figure 3 for Using Cross-Loss Influence Functions to Explain Deep Network Representations

Figure 4 for Using Cross-Loss Influence Functions to Explain Deep Network Representations

Abstract:As machine learning is increasingly deployed in the real world, it is ever more vital that we understand the decision-criteria of the models we train. Recently, researchers have shown that influence functions, a statistical measure of sample impact, may be extended to approximate the effects of training samples on classification accuracy for deep neural networks. However, prior work only applies to supervised learning setups where training and testing share an objective function. Despite the rise in unsupervised learning, self-supervised learning, and model pre-training, there are currently no suitable technologies for estimating influence of deep networks that do not train and test on the same objective. To overcome this limitation, we provide the first theoretical and empirical demonstration that influence functions can be extended to handle mismatched training and testing settings. Our result enables us to compute the influence of unsupervised and self-supervised training examples with respect to a supervised test objective. We demonstrate this technique on a synthetic dataset as well as two Skip-gram language model examples to examine cluster membership and sources of unwanted bias.

Via

Access Paper or Ask Questions