Abstract: We introduce an interactive approach, DrawTalking, in which the user builds interactive worlds by sketching and speaking. It emphasizes user control and flexibility, and gives programming-like capability without code. We implemented it on the iPad. An open-ended study shows the mechanics resonate and are applicable to many creative-exploratory use cases. We hope to inspire and inform research in future natural user-centered interfaces.
Abstract: We present a hydrodynamic simulation system that uses DirectX GPU compute shaders to simulate virtual agent behaviors and navigation inside a smoothed particle hydrodynamics (SPH) fluid environment with real-time water mesh surface reconstruction. The current SPH literature covers interactions between SPH and heterogeneous meshes but seldom interactions between SPH and virtual boid agents. The contribution of the system lies in combining the parallel SPH model with the distributed boid model of virtual agents so that agents can interact with fluids. The boid-based agents influence the motion of SPH fluid particles, and the forces from the SPH algorithm affect the movement of the boids. To enable realistic fluid rendering and simulation in a particle-based system, it is essential to construct a mesh from the particle attributes. Our system also contributes to the surface reconstruction stage of the pipeline: we ran a set of experiments that apply the parallel marching cubes algorithm every frame to construct the mesh from the fluid particles in a real-time, compute- and memory-intensive application, producing a wide range of triangle configurations. We also demonstrate that our system is versatile enough for reinforced robotic agents, instead of boid agents, to interact with the fluid environment for underwater navigation and remote control engineering purposes.
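The two-way coupling described in the abstract above (boids push fluid particles, and the fluid pushes back on boids) can be illustrated with a minimal CPU sketch. Everything here is an assumption made for illustration and not the authors' compute-shader implementation: the drag-style force law, the unnormalized poly6-shaped falloff, and the names `kernel_weight`, `coupling_forces`, `h`, and `k` are all hypothetical.

```python
def kernel_weight(r2, h):
    """Unnormalized poly6-shaped falloff: 1 at distance 0, 0 at distance >= h."""
    if r2 >= h * h:
        return 0.0
    return (1.0 - r2 / (h * h)) ** 3


def coupling_forces(boid_pos, boid_vel, particles, h=1.0, k=2.0):
    """Return (force_on_boid, forces_on_particles) for a single boid in 2D.

    particles is a list of (position, velocity) pairs, each a 2-tuple.
    Assumed force law: each particle inside the support radius h is dragged
    toward the boid's velocity, and the boid receives the opposite force.
    """
    boid_fx = boid_fy = 0.0
    particle_forces = []
    for (px, py), (vx, vy) in particles:
        dx, dy = px - boid_pos[0], py - boid_pos[1]
        w = kernel_weight(dx * dx + dy * dy, h)
        fx = k * w * (boid_vel[0] - vx)
        fy = k * w * (boid_vel[1] - vy)
        particle_forces.append((fx, fy))
        # Equal and opposite reaction keeps total momentum unchanged.
        boid_fx -= fx
        boid_fy -= fy
    return (boid_fx, boid_fy), particle_forces


# Toy scene: two particles near the boid, one outside the support radius.
particles = [
    ((0.2, 0.0), (0.0, 0.0)),
    ((0.0, 0.5), (1.0, 0.0)),
    ((3.0, 3.0), (0.0, 0.0)),
]
boid_force, particle_forces = coupling_forces((0.0, 0.0), (1.0, 0.0), particles)
```

In a GPU version, the loop over particles would typically become one thread per particle, with the reaction on each boid accumulated by a reduction; that mapping is also an assumption and is not taken from the abstract.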
Abstract: When creating 3D content, highly specialized skills are generally needed to design and generate models of objects and other assets by hand. We address this problem through high-quality 3D asset retrieval from multi-modal inputs, including 2D sketches, images and text. We use CLIP as it provides a bridge to higher-level latent features. We use these features to perform a multi-modality fusion to address the lack of artistic control that affects common data-driven approaches. Our approach allows for multi-modal conditional feature-driven retrieval through a 3D asset database, by utilizing a combination of input latent embeddings. We explore the effects of different combinations of feature embeddings across different input types and weighting methods.
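As a rough illustration of retrieval from a weighted combination of input embeddings, the sketch below fuses per-modality vectors with a weighted sum and ranks database entries by cosine similarity. It is a minimal sketch under stated assumptions: the 3-dimensional toy vectors stand in for real CLIP embeddings, the weighted-sum fusion is only one possible weighting method, and the helper names `normalize`, `fuse`, and `retrieve` are hypothetical rather than taken from the paper.

```python
import math


def normalize(v):
    """Scale a vector to unit length (an all-zero vector is returned unchanged)."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v] if n > 0 else list(v)


def fuse(embeddings, weights):
    """Weighted sum of unit-normalized per-modality embeddings, re-normalized."""
    dim = len(next(iter(embeddings.values())))
    fused = [0.0] * dim
    for name, vec in embeddings.items():
        w = weights.get(name, 0.0)  # modalities without a weight are ignored
        unit = normalize(vec)
        for i in range(dim):
            fused[i] += w * unit[i]
    return normalize(fused)


def retrieve(query, database, k=1):
    """Return the ids of the k assets with the highest cosine similarity."""
    q = normalize(query)
    scored = []
    for asset_id, vec in database.items():
        u = normalize(vec)
        scored.append((sum(a * b for a, b in zip(q, u)), asset_id))
    scored.sort(reverse=True)
    return [asset_id for _, asset_id in scored[:k]]


# Toy 3-d vectors standing in for CLIP embeddings of assets and inputs.
database = {
    "chair": [1.0, 0.0, 0.0],
    "table": [0.0, 1.0, 0.0],
    "lamp": [0.0, 0.0, 1.0],
}
inputs = {"sketch": [0.9, 0.1, 0.0], "text": [0.1, 0.9, 0.0]}

sketch_led = retrieve(fuse(inputs, {"sketch": 0.8, "text": 0.2}), database)
text_led = retrieve(fuse(inputs, {"sketch": 0.2, "text": 0.8}), database)
```

Shifting weight from the sketch to the text changes which asset ranks first, which is the kind of user control over the result that weighting is meant to provide.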
Abstract: As synthetic imagery is used more frequently in training deep models, it is important to understand how different synthesis techniques impact the performance of such models. In this work, we perform a thorough evaluation of the effectiveness of several different synthesis techniques and their impact on the complexity of classifier domain adaptation to the "real" underlying data distribution that they seek to replicate. In addition, we propose a novel learned synthesis technique that trains classifier models better than state-of-the-art offline graphical methods while using significantly fewer computational resources. We accomplish this by learning a generative model to perform shading of synthetic geometry conditioned on a "g-buffer" representation of the scene to render, as well as a low-sample Monte Carlo rendered image. The major contributions are (i) a dataset that allows comparison of real and synthetic versions of the same scene, (ii) an augmented data representation that boosts the stability of learning and improves the dataset's accuracy, (iii) three different partially differentiable rendering techniques where lighting, denoising and shading are learned, and (iv) an improvement to a state-of-the-art generative adversarial network (GAN) approach that uses an ensemble of trained models to generate datasets that approach the performance of training on real data and surpass the performance of full global illumination rendering.
Abstract: Efficient simulation of the Navier-Stokes equations for fluid flow is a long-standing problem in applied mathematics, for which state-of-the-art methods require large compute resources. In this work, we propose a data-driven approach that combines the approximation power of deep learning with the precision of standard solvers to obtain fast and highly realistic simulations. Our method solves the incompressible Euler equations using the standard operator splitting method, in which a large sparse linear system with many free parameters must be solved. We use a Convolutional Network with a highly tailored architecture, trained using a novel unsupervised learning framework, to solve the linear system. We present real-time 2D and 3D simulations that outperform recently proposed data-driven methods; the obtained results are realistic and show good generalization properties.
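To make the role of the linear system concrete, here is a minimal sketch of the pressure-projection step that operator splitting relies on: solve a discrete Poisson equation for pressure, then subtract the pressure gradient so the velocity field becomes divergence-free. The assumptions are loud ones: a 2D staggered grid with unit spacing, a closed box with zero velocity on the walls, a time step and density folded into the pressure, and plain Gauss-Seidel sweeps standing in for the solver. In the paper that solve is what the Convolutional Network performs; the names `divergence` and `project` are hypothetical.

```python
def divergence(u, v, nx, ny):
    """Discrete divergence per cell on a staggered grid with unit spacing."""
    return [[(u[i + 1][j] - u[i][j]) + (v[i][j + 1] - v[i][j])
             for j in range(ny)] for i in range(nx)]


def project(u, v, nx, ny, sweeps=500):
    """Make (u, v) divergence-free in place and return the pressure field.

    u has shape (nx + 1) x ny (vertical faces), v has shape nx x (ny + 1)
    (horizontal faces). Wall faces are assumed to hold zero velocity.
    """
    div = divergence(u, v, nx, ny)
    p = [[0.0] * ny for _ in range(nx)]
    # Gauss-Seidel sweeps on the Poisson system: sum_nb (p_nb - p) = div.
    for _ in range(sweeps):
        for i in range(nx):
            for j in range(ny):
                total, count = 0.0, 0
                for a, b in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)):
                    if 0 <= a < nx and 0 <= b < ny:  # walls contribute nothing
                        total += p[a][b]
                        count += 1
                p[i][j] = (total - div[i][j]) / count
    # Subtract the pressure gradient on interior faces only.
    for i in range(1, nx):
        for j in range(ny):
            u[i][j] -= p[i][j] - p[i - 1][j]
    for i in range(nx):
        for j in range(1, ny):
            v[i][j] -= p[i][j] - p[i][j - 1]
    return p


nx = ny = 8
# Arbitrary deterministic velocities on interior faces, zero on the walls.
u = [[0.0] * ny for _ in range(nx + 1)]
v = [[0.0] * (ny + 1) for _ in range(nx)]
for i in range(1, nx):
    for j in range(ny):
        u[i][j] = float((i * 7 + j * 3) % 5) - 2.0
for i in range(nx):
    for j in range(1, ny):
        v[i][j] = float((i * 3 + j * 5) % 7) - 3.0

project(u, v, nx, ny)
max_div_after = max(abs(d) for row in divergence(u, v, nx, ny) for d in row)
```

The cost of this step grows quickly with grid resolution because the system couples every cell to its neighbors, which is why replacing the iterative solve with a learned approximation is attractive for real-time use.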