Adrian Azzarelli

AquaNeRF: Neural Radiance Fields in Underwater Media with Distractor Removal

Feb 22, 2025

Luca Gough, Adrian Azzarelli, Fan Zhang, Nantheera Anantrasirichai

Abstract: Neural radiance field (NeRF) research has made significant progress in modeling static video content captured in the wild. However, current models and rendering processes rarely consider scenes captured underwater, which are useful for studying and filming ocean life. They fail to address visual artifacts unique to underwater scenes, such as moving fish and suspended particles. This paper introduces a novel NeRF renderer and optimization scheme for an implicit MLP-based NeRF model. Our renderer reduces the influence of floaters and moving objects that interfere with static objects of interest by estimating a single surface per ray. We use a Gaussian weight function with a small offset to ensure that the transmittance of the surrounding media remains constant. Additionally, we enhance our model with a depth-based scaling function to upscale gradients for near-camera volumes. Overall, our method outperforms the baseline Nerfacto by approximately 7.5% and SeaThru-NeRF by 6.2% in terms of PSNR. Subjective evaluation also shows a significant reduction of artifacts while preserving details of static targets and background compared to the state of the art.

* Accepted by 2025 IEEE International Symposium on Circuits and Systems
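
The single-surface idea in the AquaNeRF abstract can be illustrated with a minimal sketch. This is not the authors' renderer: the function names, the use of the arg-max of the standard volume-rendering weights as the surface estimate, and the values of `std` and `offset` are all assumptions made for illustration. It shows how re-weighting ray samples with a Gaussian centred on one estimated surface suppresses a weak near-camera floater.

```python
import numpy as np

def standard_weights(sigma, deltas):
    """Classic NeRF weights: w_i = T_i * alpha_i along one ray."""
    alpha = 1.0 - np.exp(-sigma * deltas)
    # Transmittance T_i is the product of (1 - alpha_j) for all j < i.
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1]]))
    return trans * alpha

def gaussian_surface_weights(t, sigma, deltas, std=0.05, offset=1e-3):
    """Hypothetical re-weighting: one Gaussian centred on the estimated surface.

    `std` and `offset` are illustrative values, not taken from the paper.
    """
    w = standard_weights(sigma, deltas)
    surface_t = t[np.argmax(w)]  # assumed surface estimate: the peak weight
    g = np.exp(-0.5 * ((t - surface_t) / std) ** 2) + offset
    return g / g.sum(), surface_t

# Toy ray: a faint floater near the camera and a dense surface further away.
t = np.linspace(0.0, 1.0, 11)
sigma = np.zeros(11)
sigma[2] = 0.5    # floater
sigma[7] = 50.0   # static surface
deltas = np.full(11, 0.1)
weights, surface_t = gaussian_surface_weights(t, sigma, deltas)
print(surface_t, weights.round(4))
```

In this toy ray the floater keeps only the small constant offset as its weight, while samples around the estimated surface dominate the rendered colour.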

Exploring Dynamic Novel View Synthesis Technologies for Cinematography

Dec 23, 2024

Adrian Azzarelli, Nantheera Anantrasirichai, David R Bull

Abstract: Novel view synthesis (NVS) has shown significant promise for applications in cinematographic production, particularly through the exploitation of Neural Radiance Fields (NeRF) and Gaussian Splatting (GS). These methods model real 3D scenes, enabling the creation of new shots that are challenging to capture in the real world due to set topology or expensive equipment requirements. This innovation also offers cinematographic advantages such as smooth camera movements, virtual re-shoots and slow-motion effects. This paper explores dynamic NVS with the aim of facilitating the model selection process. We showcase its potential through a short montage filmed using various NVS models.

BVI-CR: A Multi-View Human Dataset for Volumetric Video Compression

Nov 17, 2024

Ge Gao, Adrian Azzarelli, Ho Man Kwan, Nantheera Anantrasirichai, Fan Zhang, Oliver Moolan-Feroze, David Bull

Abstract: The advances in immersive technologies and 3D reconstruction have enabled the creation of digital replicas of real-world objects and environments with fine details. These processes generate vast amounts of 3D data, requiring more efficient compression methods to satisfy the memory and bandwidth constraints associated with data storage and transmission. However, the development and validation of efficient 3D data compression methods are constrained by the lack of comprehensive and high-quality volumetric video datasets, which typically require much more effort to acquire and consume more resources than 2D image and video databases. To bridge this gap, we present an open multi-view volumetric human dataset, denoted BVI-CR, which contains 18 multi-view RGB-D captures and their corresponding textured polygonal meshes, depicting a range of diverse human actions. Each video sequence contains 10 views in 1080p resolution with durations between 10-15 seconds at 30 FPS. Using BVI-CR, we benchmarked three conventional and neural coordinate-based multi-view video compression methods, following the MPEG MIV Common Test Conditions, and reported their rate-quality performance based on various quality metrics. The results show the great potential of neural representation based methods in volumetric video compression compared to conventional video coding methods (with up to a 38% average coding gain in PSNR). This dataset provides a development and validation platform for a variety of tasks including volumetric reconstruction, compression, and quality assessment. The database will be shared publicly at https://github.com/fan-aaron-zhang/bvi-cr.

Reviewing Intelligent Cinematography: AI research for camera-based video production

May 08, 2024

Adrian Azzarelli, Nantheera Anantrasirichai, David R Bull

Abstract: This paper offers a comprehensive review of artificial intelligence (AI) research in the context of real camera content acquisition for entertainment purposes and is aimed at both researchers and cinematographers. Considering the breadth of computer vision research and the lack of review papers tied to intelligent cinematography (IC), this review introduces a holistic view of the IC landscape while providing technical insight for experts across disciplines. We preface the main discussion with technical background on generative AI, object detection, automated camera calibration and 3-D content acquisition, and link explanatory articles to assist non-technical readers. The main discussion categorizes work by four production types: General Production, Virtual Production, Live Production and Aerial Production. Note that for Virtual Production we do not discuss research relating to virtual content acquisition, including work on automated video generation, like Stable Diffusion. Within each section, we (1) sub-classify work by the technical field of research, reflected by the subsections, and (2) evaluate the trends and challenges with respect to each type of production. In the final chapter, we present our concluding remarks on the greater scope of IC research and outline work that we believe has significant potential to influence the whole industry. We find that work relating to virtual production has the greatest potential to impact other mediums of production, driven by the growing interest in LED volumes/stages for in-camera virtual effects (ICVFX) and automated 3-D capture for virtual modelling of real world scenes and actors. This is the first piece of literature to offer a structured and comprehensive examination of IC research. Consequently, we address ethical and legal concerns regarding the use of creative AI involving artists, actors and the general public, in the...

* For researchers and cinematographers. 43 pages including Table of Contents, List of Figures and Tables. We obtained permission to use Figures 5 and 11. All other Figures have been drawn by us

WavePlanes: A compact Wavelet representation for Dynamic Neural Radiance Fields

Dec 03, 2023

Adrian Azzarelli, Nantheera Anantrasirichai, David R Bull

Abstract: Dynamic Neural Radiance Fields (Dynamic NeRF) enhance NeRF technology to model moving scenes. However, they are resource intensive and challenging to compress. To address this issue, this paper presents WavePlanes, a fast and more compact explicit model. We propose a multi-scale space and space-time feature plane representation using N-level 2-D wavelet coefficients. The inverse discrete wavelet transform reconstructs N feature signals at varying detail, which are linearly decoded to approximate the color and density of volumes in a 4-D grid. Exploiting the sparsity of wavelet coefficients, we compress a Hash Map containing only non-zero coefficients and their locations on each plane. This results in a compressed model size of ~12 MB. Compared with state-of-the-art plane-based models, WavePlanes is up to 15x smaller, less computationally demanding and achieves comparable results in as little as one hour of training - without requiring custom CUDA code or high performance computing resources. Additionally, we propose new feature fusion schemes that work as well as previously proposed schemes while providing greater interpretability. Our code is available at: https://github.com/azzarelli/waveplanes/
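
The compression step described in the WavePlanes abstract (keep only non-zero wavelet coefficients and their locations, then invert the transform to recover the feature plane) can be sketched as follows. This is a simplified illustration, not the code from the linked repository: it assumes a single-level Haar wavelet instead of the N-level transform, a plain Python dict as the hash map, and the hypothetical helper names `compress_plane` and `decompress_plane`.

```python
import numpy as np

SQRT2 = np.sqrt(2.0)

def haar_dwt2(x):
    """Single-level 2-D Haar transform of an array with even height and width."""
    lo = (x[0::2] + x[1::2]) / SQRT2          # pair up rows: average
    hi = (x[0::2] - x[1::2]) / SQRT2          # pair up rows: difference
    ll = (lo[:, 0::2] + lo[:, 1::2]) / SQRT2  # then pair up columns
    lh = (lo[:, 0::2] - lo[:, 1::2]) / SQRT2
    hl = (hi[:, 0::2] + hi[:, 1::2]) / SQRT2
    hh = (hi[:, 0::2] - hi[:, 1::2]) / SQRT2
    return ll, lh, hl, hh

def haar_idwt2(ll, lh, hl, hh):
    """Inverse of haar_dwt2."""
    h, w = ll.shape
    lo = np.empty((h, 2 * w))
    hi = np.empty((h, 2 * w))
    lo[:, 0::2], lo[:, 1::2] = (ll + lh) / SQRT2, (ll - lh) / SQRT2
    hi[:, 0::2], hi[:, 1::2] = (hl + hh) / SQRT2, (hl - hh) / SQRT2
    x = np.empty((2 * h, 2 * w))
    x[0::2], x[1::2] = (lo + hi) / SQRT2, (lo - hi) / SQRT2
    return x

def compress_plane(plane, threshold=0.0):
    """Store only coefficients with magnitude above `threshold`, keyed by location."""
    ll, lh, hl, hh = haar_dwt2(plane)
    coeffs = np.block([[ll, lh], [hl, hh]])
    rows, cols = np.nonzero(np.abs(coeffs) > threshold)
    sparse = {(int(r), int(c)): float(coeffs[r, c]) for r, c in zip(rows, cols)}
    return sparse, coeffs.shape

def decompress_plane(sparse, shape):
    """Rebuild the dense coefficient grid and apply the inverse transform."""
    coeffs = np.zeros(shape)
    for (r, c), value in sparse.items():
        coeffs[r, c] = value
    h, w = shape[0] // 2, shape[1] // 2
    return haar_idwt2(coeffs[:h, :w], coeffs[:h, w:], coeffs[h:, :w], coeffs[h:, w:])

# Toy feature plane: piecewise constant, so most wavelet coefficients vanish.
plane = np.zeros((8, 8))
plane[:4, :4] = 1.0
sparse, shape = compress_plane(plane)
print(len(sparse), "stored coefficients for", plane.size, "plane entries")
```

A smooth or piecewise-constant plane leaves most detail coefficients at zero, which is why storing only the non-zero entries and their locations can shrink the model.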

Towards a Robust Framework for NeRF Evaluation

May 31, 2023

Adrian Azzarelli, Nantheera Anantrasirichai, David R Bull

Abstract: Neural Radiance Field (NeRF) research has attracted significant attention recently, with 3D modelling, virtual/augmented reality, and visual effects driving its application. While current NeRF implementations can produce high quality visual results, there is a conspicuous lack of reliable methods for evaluating them. Conventional image quality assessment methods and analytical metrics (e.g. PSNR, SSIM, LPIPS etc.) only provide approximate indicators of performance since they generalise the ability of the entire NeRF pipeline. Hence, in this paper, we propose a new test framework which isolates the neural rendering network from the NeRF pipeline and then performs a parametric evaluation by training and evaluating the NeRF on an explicit radiance field representation. We also introduce a configurable approach for generating representations specifically for evaluation purposes. This employs ray-casting to transform mesh models into explicit NeRF samples, as well as to "shade" these representations. Combining these two approaches, we demonstrate how different "tasks" (scenes with different visual effects or learning strategies) and types of networks (NeRFs and depth-wise implicit neural representations (INRs)) can be evaluated within this framework. Additionally, we propose a novel metric to measure task complexity of the framework which accounts for the visual parameters and the distribution of the spatial data. Our approach offers the potential to create a comparative objective evaluation framework for NeRF methods.

* 9 pages, 2 main experiments, 2 additional experiments
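
For readers unfamiliar with PSNR, the image-level metric that the abstract above describes as only an approximate indicator, a minimal sketch of the standard definition follows. The function name `psnr` and the assumption that pixel values lie in [0, 1] are illustrative choices, not taken from the paper.

```python
import numpy as np

def psnr(reference, rendered, max_value=1.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(max_value^2 / MSE)."""
    reference = np.asarray(reference, dtype=np.float64)
    rendered = np.asarray(rendered, dtype=np.float64)
    mse = np.mean((reference - rendered) ** 2)
    if mse == 0.0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_value ** 2 / mse)

# Toy comparison: a black image against one that is uniformly off by 0.1.
reference = np.zeros((4, 4, 3))
rendered = np.full((4, 4, 3), 0.1)
print(psnr(reference, rendered))
```

Because the score is an average over every pixel of a rendered view, it reflects the whole pipeline at once and cannot attribute an error to any single component.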
