Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

William Harvey

Rolling Ahead Diffusion for Traffic Scene Simulation

Feb 13, 2025

Yunpeng Liu, Matthew Niedoba, William Harvey, Adam Scibior, Berend Zwartsenberg, Frank Wood

Abstract:Realistic driving simulation requires that NPCs not only mimic natural driving behaviors but also react to the behavior of other simulated agents. Recent developments in diffusion-based scenario generation focus on creating diverse and realistic traffic scenarios by jointly modelling the motion of all the agents in the scene. However, these traffic scenarios do not react when the motion of agents deviates from their modelled trajectories. For example, the ego-agent can be controlled by a stand along motion planner. To produce reactive scenarios with joint scenario models, the model must regenerate the scenario at each timestep based on new observations in a Model Predictive Control (MPC) fashion. Although reactive, this method is time-consuming, as one complete possible future for all NPCs is generated per simulation step. Alternatively, one can utilize an autoregressive model (AR) to predict only the immediate next-step future for all NPCs. Although faster, this method lacks the capability for advanced planning. We present a rolling diffusion based traffic scene generation model which mixes the benefits of both methods by predicting the next step future and simultaneously predicting partially noised further future steps at the same time. We show that such model is efficient compared to diffusion model based AR, achieving a beneficial compromise between reactivity and computational efficiency.

* Accepted to Workshop on Machine Learning for Autonomous Driving at AAAI 2025

Via

Access Paper or Ask Questions

Semantically Consistent Video Inpainting with Conditional Diffusion Models

Apr 30, 2024

Dylan Green, William Harvey, Saeid Naderiparizi, Matthew Niedoba, Yunpeng Liu, Xiaoxuan Liang, Jonathan Lavington, Ke Zhang, Vasileios Lioutas, Setareh Dabiri(+3 more)

Abstract:Current state-of-the-art methods for video inpainting typically rely on optical flow or attention-based approaches to inpaint masked regions by propagating visual information across frames. While such approaches have led to significant progress on standard benchmarks, they struggle with tasks that require the synthesis of novel content that is not present in other frames. In this paper we reframe video inpainting as a conditional generative modeling problem and present a framework for solving such problems with conditional video diffusion models. We highlight the advantages of using a generative approach for this task, showing that our method is capable of generating diverse, high-quality inpaintings and synthesizing new content that is spatially, temporally, and semantically consistent with the provided context.

Via

Access Paper or Ask Questions

Trans-Dimensional Generative Modeling via Jump Diffusion Models

May 25, 2023

Andrew Campbell, William Harvey, Christian Weilbach, Valentin De Bortoli, Tom Rainforth, Arnaud Doucet

Abstract:We propose a new class of generative models that naturally handle data of varying dimensionality by jointly modeling the state and dimension of each datapoint. The generative process is formulated as a jump diffusion process that makes jumps between different dimensional spaces. We first define a dimension destroying forward noising process, before deriving the dimension creating time-reversed generative process along with a novel evidence lower bound training objective for learning to approximate it. Simulating our learned approximation to the time-reversed generative process then provides an effective way of sampling data of varying dimensionality by jointly generating state values and dimensions. We demonstrate our approach on molecular and video datasets of varying dimensionality, reporting better compatibility with test-time diffusion guidance imputation tasks and improved interpolation capabilities versus fixed dimensional models that generate state values and dimensions separately.

* 38 pages, 11 figures, 5 tables

Via

Access Paper or Ask Questions

Visual Chain-of-Thought Diffusion Models

Mar 28, 2023

William Harvey, Frank Wood

Abstract:Recent progress with conditional image diffusion models has been stunning, and this holds true whether we are speaking about models conditioned on a text description, a scene layout, or a sketch. Unconditional image diffusion models are also improving but lag behind, as do diffusion models which are conditioned on lower-dimensional features like class labels. We propose to close the gap between conditional and unconditional models using a two-stage sampling procedure. In the first stage we sample an embedding describing the semantic content of the image. In the second stage we sample the image conditioned on this embedding and then discard the embedding. Doing so lets us leverage the power of conditional diffusion models on the unconditional generation task, which we show improves FID by 25-50% compared to standard unconditional generation.

Via

Access Paper or Ask Questions

Flexible Diffusion Modeling of Long Videos

May 23, 2022

William Harvey, Saeid Naderiparizi, Vaden Masrani, Christian Weilbach, Frank Wood

Figure 1 for Flexible Diffusion Modeling of Long Videos

Figure 2 for Flexible Diffusion Modeling of Long Videos

Figure 3 for Flexible Diffusion Modeling of Long Videos

Figure 4 for Flexible Diffusion Modeling of Long Videos

Abstract:We present a framework for video modeling based on denoising diffusion probabilistic models that produces long-duration video completions in a variety of realistic environments. We introduce a generative model that can at test-time sample any arbitrary subset of video frames conditioned on any other subset and present an architecture adapted for this purpose. Doing so allows us to efficiently compare and optimize a variety of schedules for the order in which frames in a long video are sampled and use selective sparse and long-range conditioning on previously sampled frames. We demonstrate improved video modeling over prior work on a number of datasets and sample temporally coherent videos over 25 minutes in length. We additionally release a new video modeling dataset and semantically meaningful metrics based on videos generated in the CARLA self-driving car simulator.

* 17 pages, 12 figures

Via

Access Paper or Ask Questions

Image Completion via Inference in Deep Generative Models

Feb 24, 2021

William Harvey, Saeid Naderiparizi, Frank Wood

Figure 1 for Image Completion via Inference in Deep Generative Models

Figure 2 for Image Completion via Inference in Deep Generative Models

Figure 3 for Image Completion via Inference in Deep Generative Models

Figure 4 for Image Completion via Inference in Deep Generative Models

Abstract:We consider image completion from the perspective of amortized inference in an image generative model. We leverage recent state of the art variational auto-encoder architectures that have been shown to produce photo-realistic natural images at non-trivial resolutions. Through amortized inference in such a model we can train neural artifacts that produce diverse, realistic image completions even when the vast majority of an image is missing. We demonstrate superior sample quality and diversity compared to prior art on the CIFAR-10 and FFHQ-256 datasets. We conclude by describing and demonstrating an application that requires an in-painting model with the capabilities ours exhibits: the use of Bayesian optimal experimental design to select the most informative sequence of small field of view x-rays for chest pathology detection.

* 19 pages, 12 figures

Via

Access Paper or Ask Questions

Assisting the Adversary to Improve GAN Training

Oct 03, 2020

Andreas Munk, William Harvey, Frank Wood

Figure 1 for Assisting the Adversary to Improve GAN Training

Figure 2 for Assisting the Adversary to Improve GAN Training

Figure 3 for Assisting the Adversary to Improve GAN Training

Figure 4 for Assisting the Adversary to Improve GAN Training

Abstract:We propose a method for improved training of generative adversarial networks (GANs). Some of the most popular methods for improving the stability and performance of GANs involve constraining or regularizing the discriminator. Our method, on the other hand, involves regularizing the generator. It can be used alongside existing approaches to GAN training and is simple and straightforward to implement. Our method is motivated by a common mismatch between theoretical analysis and practice: analysis often assumes that the discriminator reaches its optimum on each iteration. In practice, this is essentially never true, often leading to poor gradient estimates for the generator. To address this, we introduce the Adversary's Assistant (AdvAs). It is a theoretically motivated penalty imposed on the generator based on the norm of the gradients used to train the discriminator. This encourages the generator to move towards points where the discriminator is optimal. We demonstrate the effect of applying AdvAs to several GAN objectives, datasets and network architectures. The results indicate a reduction in the mismatch between theory and practice and that AdvAs can lead to improvement of GAN training, as measured by FID scores.

Via

Access Paper or Ask Questions

Planning as Inference in Epidemiological Models

Apr 03, 2020

Frank Wood, Andrew Warrington, Saeid Naderiparizi, Christian Weilbach, Vaden Masrani, William Harvey, Adam Scibior, Boyan Beronov, Ali Nasseri

Figure 1 for Planning as Inference in Epidemiological Models

Figure 2 for Planning as Inference in Epidemiological Models

Figure 3 for Planning as Inference in Epidemiological Models

Figure 4 for Planning as Inference in Epidemiological Models

Abstract:In this work we demonstrate how existing software tools can be used to automate parts of infectious disease-control policy-making via performing inference in existing epidemiological dynamics models. The kind of inference tasks undertaken include computing, for planning purposes, the posterior distribution over putatively controllable, via direct policy-making choices, simulation model parameters that give rise to acceptable disease progression outcomes. Neither the full capabilities of such inference automation software tools nor their utility for planning is widely disseminated at the current time. Timely gains in understanding about these tools and how they can be used may lead to more fine-grained and less economically damaging policy prescriptions, particularly during the current COVID-19 pandemic.

* minor typos corrected

Via

Access Paper or Ask Questions

Attention for Inference Compilation

Oct 25, 2019

William Harvey, Andreas Munk, Atılım Güneş Baydin, Alexander Bergholm, Frank Wood

Figure 1 for Attention for Inference Compilation

Figure 2 for Attention for Inference Compilation

Figure 3 for Attention for Inference Compilation

Figure 4 for Attention for Inference Compilation

Abstract:We present a new approach to automatic amortized inference in universal probabilistic programs which improves performance compared to current methods. Our approach is a variation of inference compilation (IC) which leverages deep neural networks to approximate a posterior distribution over latent variables in a probabilistic program. A challenge with existing IC network architectures is that they can fail to model long-range dependencies between latent variables. To address this, we introduce an attention mechanism that attends to the most salient variables previously sampled in the execution of a probabilistic program. We demonstrate that the addition of attention allows the proposal distributions to better match the true posterior, enhancing inference about latent variables in simulators.

Via

Access Paper or Ask Questions

Near-Optimal Glimpse Sequences for Improved Hard Attention Neural Network Training

Jun 13, 2019

William Harvey, Michael Teng, Frank Wood

Figure 1 for Near-Optimal Glimpse Sequences for Improved Hard Attention Neural Network Training

Figure 2 for Near-Optimal Glimpse Sequences for Improved Hard Attention Neural Network Training

Figure 3 for Near-Optimal Glimpse Sequences for Improved Hard Attention Neural Network Training

Figure 4 for Near-Optimal Glimpse Sequences for Improved Hard Attention Neural Network Training

Abstract:We introduce the use of Bayesian optimal experimental design techniques for generating glimpse sequences to use in semi-supervised training of hard attention networks. Hard attention holds the promise of greater energy efficiency and superior inference performance. Employing such networks for image classification usually involves choosing a sequence of glimpse locations from a stochastic policy. As the outputs of observations are typically non-differentiable with respect to their glimpse locations, unsupervised gradient learning of such a policy requires REINFORCE-style updates. Also, the only reward signal is the final classification accuracy. For these reasons hard attention networks, despite their promise, have not achieved the wide adoption that soft attention networks have and, in many practical settings, are difficult to train. We find that our method for semi-supervised training makes it easier and faster to train hard attention networks and correspondingly could make them practical to consider in situations where they were not before.

* 9 pages, 5 figures + appendix with 6 pages, 4 figures.Submitted to NeurIPS 2019

Via

Access Paper or Ask Questions