Taylor Killian

Robust Autonomy Emerges from Self-Play

Feb 05, 2025

Marco Cusumano-Towner, David Hafner, Alex Hertzberg, Brody Huval, Aleksei Petrenko, Eugene Vinitsky, Erik Wijmans, Taylor Killian, Stuart Bowers, Ozan Sener (+2 more)

Abstract:Self-play has powered breakthroughs in two-player and multi-player games. Here we show that self-play is a surprisingly effective strategy in another domain. We show that robust and naturalistic driving emerges entirely from self-play in simulation at unprecedented scale: 1.6 billion km of driving. This is enabled by Gigaflow, a batched simulator that can synthesize and train on 42 years of subjective driving experience per hour on a single 8-GPU node. The resulting policy achieves state-of-the-art performance on three independent autonomous driving benchmarks. The policy outperforms the prior state of the art when tested on recorded real-world scenarios, amidst human drivers, without ever seeing human data during training. The policy is realistic when assessed against human references and achieves unprecedented robustness, averaging 17.5 years of continuous driving between incidents in simulation.

Identifying Differential Patient Care Through Inverse Intent Inference

Nov 11, 2024

Hyewon Jeong, Siddharth Nayak, Taylor Killian, Sanjat Kanjilal, Marzyeh Ghassemi

Abstract:Sepsis is a life-threatening condition defined by end-organ dysfunction due to a dysregulated host response to infection. Although the Surviving Sepsis Campaign has launched and has been releasing sepsis treatment guidelines to unify and normalize the care for sepsis patients, it has been reported in numerous studies that disparities in care exist across the trajectory of patient stay in the emergency department and intensive care unit. Here, we apply a number of reinforcement learning techniques including behavioral cloning, imitation learning, and inverse reinforcement learning, to learn the optimal policy in the management of septic patient subgroups using expert demonstrations. Then we estimate the counterfactual optimal policies by applying the model to another subset of unseen medical populations and identify the difference in cure by comparing it to the real policy. Our data comes from the sepsis cohort of MIMIC-IV and the clinical data warehouses of the Mass General Brigham healthcare system. The ultimate objective of this work is to use the optimal learned policy function to estimate the counterfactual treatment policy and identify deviations across sub-populations of interest. We hope this approach would help us identify any disparities in care and also changes in cure in response to the publication of national sepsis treatment guidelines.

* Reinforcement Learning for Real Life (RL4RealLife) Workshop, NeurIPS 2022

Multiple Sclerosis Severity Classification From Clinical Text

Oct 29, 2020

Alister D'Costa, Stefan Denkovski, Michal Malyska, Sae Young Moon, Brandon Rufino, Zhen Yang, Taylor Killian, Marzyeh Ghassemi

Abstract:Multiple Sclerosis (MS) is a chronic, inflammatory and degenerative neurological disease, which is monitored by a specialist using the Expanded Disability Status Scale (EDSS) and recorded in unstructured text in the form of a neurology consult note. An EDSS measurement contains an overall "EDSS" score and several functional subscores. Typically, expert knowledge is required to interpret consult notes and generate these scores. Previous approaches used limited context length Word2Vec embeddings and keyword searches to predict scores given a consult note, but often failed when scores were not explicitly stated. In this work, we present MS-BERT, the first publicly available transformer model trained on real clinical data other than MIMIC. Next, we present MSBC, a classifier that applies MS-BERT to generate embeddings and predict EDSS and functional subscores. Lastly, we explore combining MSBC with other models through the use of Snorkel to generate scores for unlabelled consult notes. MSBC achieves state-of-the-art performance on all metrics and prediction tasks and outperforms the models generated from the Snorkel ensemble. We improve Macro-F1 by 0.12 (to 0.88) for predicting EDSS and on average by 0.29 (to 0.63) for predicting functional subscores over previous Word2Vec CNN and rule-based approaches.

* EMNLP 2020 Clinical NLP workshop

Kernelized Capsule Networks

Jun 07, 2019

Taylor Killian, Justin Goodwin, Olivia Brown, Sung-Hyun Son

Abstract:Capsule Networks attempt to represent patterns in images in a way that preserves hierarchical spatial relationships. Additionally, research has demonstrated that these techniques may be robust against adversarial perturbations. We present an improvement to training capsule networks with added robustness via non-parametric kernel methods. The representations learned through the capsule network are used to construct covariance kernels for Gaussian processes (GPs). We demonstrate that this approach achieves comparable prediction performance to Capsule Networks while improving robustness to adversarial perturbations and providing a meaningful measure of uncertainty that may aid in the detection of adversarial inputs.

* Paper accepted to the ICML 2019 Workshop on Understanding and Improving Generalization in Deep Learning
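
As a rough illustration of the idea in the abstract above, learned feature vectors can be turned into a Gaussian process covariance matrix by evaluating a kernel on every pair of them. The sketch below assumes a squared-exponential (RBF) kernel; the abstract does not say which kernel the authors use. The names `rbf_kernel`, `gram_matrix`, `length_scale` and `variance` are hypothetical, and the hand-written feature vectors stand in for capsule network outputs.

```python
import math


def rbf_kernel(features_a, features_b, length_scale=1.0, variance=1.0):
    """Squared-exponential covariance between two feature vectors.

    Assumption: the RBF form is chosen here only for illustration.
    """
    squared_distance = sum((a - b) ** 2 for a, b in zip(features_a, features_b))
    return variance * math.exp(-0.5 * squared_distance / length_scale ** 2)


def gram_matrix(features, length_scale=1.0, variance=1.0):
    """Covariance matrix over a list of feature vectors, one per input."""
    return [
        [rbf_kernel(fi, fj, length_scale, variance) for fj in features]
        for fi in features
    ]


# Stand-in features; in the paper's setting these would come from the network.
example_features = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
covariance = gram_matrix(example_features)
```

Inputs whose features lie close together get a covariance near `variance`, while distant ones decay toward zero. That decay is what lets a GP report higher predictive uncertainty for inputs unlike anything seen in training.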

Interpretable Reinforcement Learning via Differentiable Decision Trees

Mar 22, 2019

Ivan Dario Jimenez Rodriguez, Taylor Killian, Sung-Hyun Son, Matthew Gombolay

Abstract:Decision trees are ubiquitous in machine learning for their ease of use and interpretability; however, they are not typically implemented in reinforcement learning because they cannot be updated via stochastic gradient descent. Traditional applications of decision trees for reinforcement learning have focused instead on making commitments to decision boundaries as the tree is grown one layer at a time. We overcome this critical limitation by allowing for a gradient update over the entire tree structure that improves sample complexity when a tree is fuzzy and interpretability when sharp. We offer three key contributions towards this goal. First, we motivate the need for policy gradient-based learning by examining the theoretical properties of gradient descent over differentiable decision trees. Second, we introduce a regularization framework that yields interpretability via sparsity in the tree structure. Third, we demonstrate the ability to construct a decision tree via policy gradient in canonical reinforcement learning domains and supervised learning benchmarks.
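
The "fuzzy" versus "sharp" distinction in the abstract above can be sketched with a single soft decision node. This is a minimal illustration and not the authors' implementation. It assumes the common construction in which a hard threshold test is replaced by a sigmoid, so the output is a smooth function of the node parameters and can be trained by gradient descent. The names `soft_node`, `soft_tree_output` and `sharpness` are hypothetical.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_node(x, weights, threshold, sharpness):
    """Probability of taking the 'true' branch at one decision node.

    A small sharpness gives a fuzzy split; a large one approaches a
    crisp 'weights . x > threshold' test.
    """
    activation = sum(w * xi for w, xi in zip(weights, x)) - threshold
    return sigmoid(sharpness * activation)


def soft_tree_output(x, weights, threshold, sharpness, leaf_true, leaf_false):
    """Depth-1 tree: blend the two leaf vectors by the branch probability."""
    p = soft_node(x, weights, threshold, sharpness)
    return [p * t + (1.0 - p) * f for t, f in zip(leaf_true, leaf_false)]


# Hypothetical leaves holding action probabilities for a two-action policy.
fuzzy = soft_tree_output([1.0], [1.0], 0.0, 1.0, [0.9, 0.1], [0.2, 0.8])
sharp = soft_tree_output([1.0], [1.0], 0.0, 100.0, [0.9, 0.1], [0.2, 0.8])
```

With a low `sharpness`, both leaves contribute to the output and gradients reach every parameter. Raising it drives the node toward a single readable rule, which matches the trade-off the abstract describes.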

Robust and Efficient Transfer Learning with Hidden-Parameter Markov Decision Processes

Oct 31, 2017

Taylor Killian, Samuel Daulton, George Konidaris, Finale Doshi-Velez

Abstract:We introduce a new formulation of the Hidden Parameter Markov Decision Process (HiP-MDP), a framework for modeling families of related tasks using low-dimensional latent embeddings. Our new framework correctly models the joint uncertainty in the latent parameters and the state space. We also replace the original Gaussian Process-based model with a Bayesian Neural Network, enabling more scalable inference. Thus, we expand the scope of the HiP-MDP to applications with higher dimensions and more complex dynamics.

* To appear at NIPS 2017, selected for an oral presentation. 17 pages (incl. references and appendix). Example code can be found at http://github.com/dtak/hip-mdp-public

Transfer Learning Across Patient Variations with Hidden Parameter Markov Decision Processes

Dec 01, 2016

Taylor Killian, George Konidaris, Finale Doshi-Velez

Abstract:Due to physiological variation, patients diagnosed with the same condition may exhibit divergent, but related, responses to the same treatments. Hidden Parameter Markov Decision Processes (HiP-MDPs) tackle this transfer-learning problem by embedding these tasks into a low-dimensional space. However, the original formulation of HiP-MDP had a critical flaw: the embedding uncertainty was modeled independently of the agent's state uncertainty, requiring an unnatural training procedure in which all tasks visited every part of the state space (possible for robots that can be moved to a particular location, impossible for human patients). We update the HiP-MDP framework and extend it to more robustly develop personalized medicine strategies for HIV treatment.

* Brief abstract for poster submission to Machine Learning for Healthcare workshop at NIPS 2016
