Abstract: We propose a method to capture the handling abilities of fast jet pilots in a software model via reinforcement learning (RL) from human preference feedback. We use pairwise preferences over simulated flight trajectories to learn an interpretable rule-based model called a reward tree, which enables the automated scoring of trajectories alongside an explanatory rationale. We train an RL agent to execute high-quality handling behaviour by using the reward tree as the objective, and thereby generate data for iterative preference collection and further refinement of both tree and agent. Experiments with synthetic preferences show reward trees to be competitive with uninterpretable neural network reward models on quantitative and qualitative evaluations.
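As an illustration of how pairwise trajectory preferences can drive reward learning, the sketch below scores two trajectories under a candidate reward model and computes a Bradley-Terry style preference loss. This is a generic formulation rather than the exact objective of the paper; the names `reward_fn`, `traj_a`, `traj_b` and the loss form are assumptions made here for illustration.

```python
import numpy as np

def preference_loss(reward_fn, traj_a, traj_b, label):
    """Bradley-Terry style loss for one pairwise preference.

    label = 1.0 means traj_a was judged better, 0.0 means traj_b.
    Each trajectory is a sequence of (state, action) pairs; its score
    is the sum of per-step rewards under the current reward model.
    """
    score_a = sum(reward_fn(s, a) for s, a in traj_a)
    score_b = sum(reward_fn(s, a) for s, a in traj_b)
    # Probability that traj_a is preferred under the current reward model.
    p_a = 1.0 / (1.0 + np.exp(score_b - score_a))
    # Cross-entropy between the human label and the model's preference.
    return -(label * np.log(p_a) + (1.0 - label) * np.log(1.0 - p_a))
```

Minimising such a loss over a dataset of labelled pairs fits the reward model, whether that model is a neural network or, as in the abstract above, a reward tree.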
Abstract: Recent efforts to learn reward functions from human feedback have tended to use deep neural networks, whose lack of transparency hampers our ability to explain agent behaviour or verify alignment. We explore the merits of learning intrinsically interpretable tree models instead. We develop a recently proposed method for learning reward trees from preference labels, and show it to be broadly competitive with neural networks on challenging high-dimensional tasks, with good robustness to limited or corrupted data. Having found that reward tree learning can be done effectively in complex settings, we then consider why it should be used, demonstrating that the interpretable reward structure gives significant scope for traceability, verification and explanation.
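To show why a tree-structured reward model lends itself to traceability and explanation, here is a minimal sketch of a reward tree whose internal nodes test single state features and whose leaves hold scalar rewards. The node layout and field names are illustrative assumptions, not the data structure used in the paper.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class RewardTreeNode:
    # Internal nodes split one feature against a threshold; leaves carry
    # a scalar reward, so every predicted reward comes with a rule path.
    feature: Optional[int] = None
    threshold: Optional[float] = None
    left: Optional["RewardTreeNode"] = None
    right: Optional["RewardTreeNode"] = None
    reward: float = 0.0

    def predict(self, x, trace=None):
        """Return the leaf reward for feature vector x, recording the rules taken."""
        if self.feature is None:  # leaf node
            return self.reward
        go_left = x[self.feature] < self.threshold
        if trace is not None:
            op = "<" if go_left else ">="
            trace.append(f"x[{self.feature}] {op} {self.threshold}")
        child = self.left if go_left else self.right
        return child.predict(x, trace)
```

The accumulated `trace` is the kind of human-readable rationale that a black-box neural reward model cannot directly provide.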
Abstract: We introduce a data-driven, model-agnostic technique for generating a human-interpretable summary of the salient points of contrast within an evolving dynamical system, such as the learning process of a control agent. It involves the aggregation of transition data along both spatial and temporal dimensions according to an information-theoretic divergence measure. A practical algorithm is outlined for continuous state spaces, and deployed to summarise the learning histories of deep reinforcement learning agents with the aid of graphical and textual communication methods. We expect our method to be complementary to existing techniques in the realm of agent interpretability.
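As a rough illustration of contrasting an agent's behaviour across time with an information-theoretic measure, the sketch below compares state-visitation histograms from two windows of a learning history using the Jensen-Shannon divergence. The choice of divergence and the use of discretised histograms are simplifying assumptions made here; the paper's algorithm targets continuous state spaces.

```python
import numpy as np

def js_divergence(p, q, eps=1e-12):
    """Jensen-Shannon divergence between two discrete distributions."""
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    p, q = p / p.sum(), q / q.sum()
    m = 0.5 * (p + q)
    kl = lambda a, b: float(np.sum(a * np.log(a / b)))
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def contrast_windows(visits_early, visits_late):
    """Score how far the agent's state-visitation profile has drifted
    between an early and a late window of its learning history."""
    return js_divergence(visits_early, visits_late)
```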
Abstract: As we deploy autonomous agents in safety-critical domains, it becomes important to develop an understanding of their internal mechanisms and representations. We outline an approach to imitation learning for reverse-engineering black box agent policies in MDP environments, yielding simplified, interpretable models in the form of decision trees. As part of this process, we explicitly model and learn agents' latent state representations by selecting from a large space of candidate features constructed from the Markov state. We present initial promising results from an implementation in a multi-agent traffic environment.
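A minimal sketch of the distillation step, assuming the black-box policy can be queried on sampled states and that candidate features are built by a user-supplied `feature_map`; the helper names and the use of scikit-learn are illustrative choices rather than the paper's implementation.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

def distil_policy(black_box_policy, sample_states, feature_map, max_depth=4):
    """Fit an interpretable decision tree that mimics a black-box policy.

    black_box_policy: callable mapping a Markov state to a discrete action.
    feature_map: builds a vector of candidate features from the raw state;
        which features the fitted tree actually splits on hints at the
        agent's latent state representation.
    """
    X = np.array([feature_map(s) for s in sample_states])
    y = np.array([black_box_policy(s) for s in sample_states])
    tree = DecisionTreeClassifier(max_depth=max_depth).fit(X, y)
    print(export_text(tree))  # human-readable rules approximating the policy
    return tree
```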
Abstract: The widespread adoption of autonomous systems depends on providing guarantees of safety and functional correctness, at both design time and runtime. Information about the extent to which functional requirements can be met in combination with non-functional requirements (NFRs), i.e. requirements that can be partially complied with, under dynamic and uncertain environments provides opportunities to enhance the safety and functional correctness of systems at design time. We present a technique to formally define system attributes that can change or be changed to deal with dynamic and uncertain environments (termed weakened specifications) as a partially ordered lattice, and to automatically explore the system under different specifications, using probabilistic model checking, to find the likelihood of satisfying a requirement. The resulting probabilities form the boundaries of "optimal specifications", analogous to Pareto frontiers in multi-objective optimization, informing the designer about the system's capabilities, such as resilience or robustness, when its attributes change to deal with dynamic and uncertain environments. We illustrate the proposed technique through a domestic robotic assistant example.
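A highly simplified sketch of exploring the lattice of weakened specifications: every combination of attribute weakenings is enumerated and a satisfaction probability is requested for each, where `check_probability` stands in for a call to a probabilistic model checker. The function and argument names are hypothetical, and the Pareto-style frontier analysis described in the abstract is omitted for brevity.

```python
from itertools import product

def explore_specifications(attribute_levels, check_probability):
    """Enumerate the lattice of weakened specifications and record, for each,
    the probability of satisfying the functional requirement.

    attribute_levels: dict mapping each attribute to its ordered list of
        admissible values, from strictest to most relaxed.
    check_probability: callable taking one specification (a dict of chosen
        values) and returning a probability, e.g. from a model checker.
    """
    names = sorted(attribute_levels)
    results = []
    for values in product(*(attribute_levels[n] for n in names)):
        spec = dict(zip(names, values))
        results.append((spec, check_probability(spec)))
    return results
```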