Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Sara Fridovich-Keil

Geometric Algebra Planes: Convex Implicit Neural Volumes

Nov 21, 2024

Irmak Sivgin, Sara Fridovich-Keil, Gordon Wetzstein, Mert Pilanci

Abstract:Volume parameterizations abound in recent literature, from the classic voxel grid to the implicit neural representation and everything in between. While implicit representations have shown impressive capacity and better memory efficiency compared to voxel grids, to date they require training via nonconvex optimization. This nonconvex training process can be slow to converge and sensitive to initialization and hyperparameter choices that affect the final converged result. We introduce a family of models, GA-Planes, that is the first class of implicit neural volume representations that can be trained by convex optimization. GA-Planes models include any combination of features stored in tensor basis elements, followed by a neural feature decoder. They generalize many existing representations and can be adapted for convex, semiconvex, or nonconvex training as needed for different inverse problems. In the 2D setting, we prove that GA-Planes is equivalent to a low-rank plus low-resolution matrix factorization; we show that this approximation outperforms the classic low-rank plus sparse decomposition for fitting a natural image. In 3D, we demonstrate GA-Planes' competitive performance in terms of expressiveness, model size, and optimizability across three volume fitting tasks: radiance field reconstruction, 3D segmentation, and video segmentation.

* Code is available at https://github.com/sivginirmak/Geometric-Algebra-Planes

Via

Access Paper or Ask Questions

ThermalNeRF: Thermal Radiance Fields

Jul 22, 2024

Yvette Y. Lin, Xin-Yi Pan, Sara Fridovich-Keil, Gordon Wetzstein

Abstract:Thermal imaging has a variety of applications, from agricultural monitoring to building inspection to imaging under poor visibility, such as in low light, fog, and rain. However, reconstructing thermal scenes in 3D presents several challenges due to the comparatively lower resolution and limited features present in long-wave infrared (LWIR) images. To overcome these challenges, we propose a unified framework for scene reconstruction from a set of LWIR and RGB images, using a multispectral radiance field to represent a scene viewed by both visible and infrared cameras, thus leveraging information across both spectra. We calibrate the RGB and infrared cameras with respect to each other, as a preprocessing step using a simple calibration target. We demonstrate our method on real-world sets of RGB and LWIR photographs captured from a handheld thermal camera, showing the effectiveness of our method at scene representation across the visible and infrared spectra. We show that our method is capable of thermal super-resolution, as well as visually removing obstacles to reveal objects that are occluded in either the RGB or thermal channels. Please see https://yvette256.github.io/thermalnerf for video results as well as our code and dataset release.

* Presented at ICCP 2024; project page at https://yvette256.github.io/thermalnerf

Via

Access Paper or Ask Questions

Solving Inverse Problems in Protein Space Using Diffusion-Based Priors

Jun 06, 2024

Axel Levy, Eric R. Chan, Sara Fridovich-Keil, Frédéric Poitevin, Ellen D. Zhong, Gordon Wetzstein

Abstract:The interaction of a protein with its environment can be understood and controlled via its 3D structure. Experimental methods for protein structure determination, such as X-ray crystallography or cryogenic electron microscopy, shed light on biological processes but introduce challenging inverse problems. Learning-based approaches have emerged as accurate and efficient methods to solve these inverse problems for 3D structure determination, but are specialized for a predefined type of measurement. Here, we introduce a versatile framework to turn raw biophysical measurements of varying types into 3D atomic models. Our method combines a physics-based forward model of the measurement process with a pretrained generative model providing a task-agnostic, data-driven prior. Our method outperforms posterior sampling baselines on both linear and non-linear inverse problems. In particular, it is the first diffusion-based method for refining atomic models from cryo-EM density maps.

Via

Access Paper or Ask Questions

Volumetric Reconstruction Resolves Off-Resonance Artifacts in Static and Dynamic PROPELLER MRI

Nov 22, 2023

Annesha Ghosh, Gordon Wetzstein, Mert Pilanci, Sara Fridovich-Keil

Abstract:Off-resonance artifacts in magnetic resonance imaging (MRI) are visual distortions that occur when the actual resonant frequencies of spins within the imaging volume differ from the expected frequencies used to encode spatial information. These discrepancies can be caused by a variety of factors, including magnetic field inhomogeneities, chemical shifts, or susceptibility differences within the tissues. Such artifacts can manifest as blurring, ghosting, or misregistration of the reconstructed image, and they often compromise its diagnostic quality. We propose to resolve these artifacts by lifting the 2D MRI reconstruction problem to 3D, introducing an additional "spectral" dimension to model this off-resonance. Our approach is inspired by recent progress in modeling radiance fields, and is capable of reconstructing both static and dynamic MR images as well as separating fat and water, which is of independent clinical interest. We demonstrate our approach in the context of PROPELLER (Periodically Rotated Overlapping ParallEL Lines with Enhanced Reconstruction) MRI acquisitions, which are popular for their robustness to motion artifacts. Our method operates in a few minutes on a single GPU, and to our knowledge is the first to correct for chemical shift in gradient echo PROPELLER MRI reconstruction without additional measurements or pretraining data.

* Code is available at https://github.com/sarafridov/volumetric-propeller

Via

Access Paper or Ask Questions

Gradient Descent Provably Solves Nonlinear Tomographic Reconstruction

Oct 06, 2023

Sara Fridovich-Keil, Fabrizio Valdivia, Gordon Wetzstein, Benjamin Recht, Mahdi Soltanolkotabi

Abstract:In computed tomography (CT), the forward model consists of a linear Radon transform followed by an exponential nonlinearity based on the attenuation of light according to the Beer-Lambert Law. Conventional reconstruction often involves inverting this nonlinearity as a preprocessing step and then solving a convex inverse problem. However, this nonlinear measurement preprocessing required to use the Radon transform is poorly conditioned in the vicinity of high-density materials, such as metal. This preprocessing makes CT reconstruction methods numerically sensitive and susceptible to artifacts near high-density regions. In this paper, we study a technique where the signal is directly reconstructed from raw measurements through the nonlinear forward model. Though this optimization is nonconvex, we show that gradient descent provably converges to the global optimum at a geometric rate, perfectly reconstructing the underlying signal with a near minimal number of random measurements. We also prove similar results in the under-determined setting where the number of measurements is significantly smaller than the dimension of the signal. This is achieved by enforcing prior structural information about the signal through constraints on the optimization variables. We illustrate the benefits of direct nonlinear CT reconstruction with cone-beam CT experiments on synthetic and real 3D volumes. We show that this approach reduces metal artifacts compared to a commercial reconstruction of a human skull with metal dental crowns.

Via

Access Paper or Ask Questions

Neural Microfacet Fields for Inverse Rendering

Mar 31, 2023

Alexander Mai, Dor Verbin, Falko Kuester, Sara Fridovich-Keil

Abstract:We present Neural Microfacet Fields, a method for recovering materials, geometry, and environment illumination from images of a scene. Our method uses a microfacet reflectance model within a volumetric setting by treating each sample along the ray as a (potentially non-opaque) surface. Using surface-based Monte Carlo rendering in a volumetric setting enables our method to perform inverse rendering efficiently by combining decades of research in surface-based light transport with recent advances in volume rendering for view synthesis. Our approach outperforms prior work in inverse rendering, capturing high fidelity geometry and high frequency illumination details; its novel view synthesis results are on par with state-of-the-art methods that do not recover illumination or materials.

* Project page: https://half-potato.gitlab.io/posts/nmf/

Via

Access Paper or Ask Questions

K-Planes: Explicit Radiance Fields in Space, Time, and Appearance

Jan 24, 2023

Sara Fridovich-Keil, Giacomo Meanti, Frederik Warburg, Benjamin Recht, Angjoo Kanazawa

Abstract:We introduce k-planes, a white-box model for radiance fields in arbitrary dimensions. Our model uses d choose 2 planes to represent a d-dimensional scene, providing a seamless way to go from static (d=3) to dynamic (d=4) scenes. This planar factorization makes adding dimension-specific priors easy, e.g. temporal smoothness and multi-resolution spatial structure, and induces a natural decomposition of static and dynamic components of a scene. We use a linear feature decoder with a learned color basis that yields similar performance as a nonlinear black-box MLP decoder. Across a range of synthetic and real, static and dynamic, fixed and varying appearance scenes, k-planes yields competitive and often state-of-the-art reconstruction fidelity with low memory usage, achieving 1000x compression over a full 4D grid, and fast optimization with a pure PyTorch implementation. For video results and code, please see sarafridov.github.io/K-Planes.

* Project page https://sarafridov.github.io/K-Planes/

Via

Access Paper or Ask Questions

Models Out of Line: A Fourier Lens on Distribution Shift Robustness

Jul 08, 2022

Sara Fridovich-Keil, Brian R. Bartoldson, James Diffenderfer, Bhavya Kailkhura, Peer-Timo Bremer

Figure 1 for Models Out of Line: A Fourier Lens on Distribution Shift Robustness

Figure 2 for Models Out of Line: A Fourier Lens on Distribution Shift Robustness

Figure 3 for Models Out of Line: A Fourier Lens on Distribution Shift Robustness

Figure 4 for Models Out of Line: A Fourier Lens on Distribution Shift Robustness

Abstract:Improving the accuracy of deep neural networks (DNNs) on out-of-distribution (OOD) data is critical to an acceptance of deep learning (DL) in real world applications. It has been observed that accuracies on in-distribution (ID) versus OOD data follow a linear trend and models that outperform this baseline are exceptionally rare (and referred to as "effectively robust"). Recently, some promising approaches have been developed to improve OOD robustness: model pruning, data augmentation, and ensembling or zero-shot evaluating large pretrained models. However, there still is no clear understanding of the conditions on OOD data and model properties that are required to observe effective robustness. We approach this issue by conducting a comprehensive empirical study of diverse approaches that are known to impact OOD robustness on a broad range of natural and synthetic distribution shifts of CIFAR-10 and ImageNet. In particular, we view the "effective robustness puzzle" through a Fourier lens and ask how spectral properties of both models and OOD data influence the corresponding effective robustness. We find this Fourier lens offers some insight into why certain robust models, particularly those from the CLIP family, achieve OOD robustness. However, our analysis also makes clear that no known metric is consistently the best explanation (or even a strong explanation) of OOD robustness. Thus, to aid future research into the OOD puzzle, we address the gap in publicly-available models with effective robustness by introducing a set of pretrained models--RobustNets--with varying levels of OOD robustness.

Via

Access Paper or Ask Questions

When does dough become a bagel? Analyzing the remaining mistakes on ImageNet

May 09, 2022

Vijay Vasudevan, Benjamin Caine, Raphael Gontijo-Lopes, Sara Fridovich-Keil, Rebecca Roelofs

Figure 1 for When does dough become a bagel? Analyzing the remaining mistakes on ImageNet

Figure 2 for When does dough become a bagel? Analyzing the remaining mistakes on ImageNet

Figure 3 for When does dough become a bagel? Analyzing the remaining mistakes on ImageNet

Figure 4 for When does dough become a bagel? Analyzing the remaining mistakes on ImageNet

Abstract:Image classification accuracy on the ImageNet dataset has been a barometer for progress in computer vision over the last decade. Several recent papers have questioned the degree to which the benchmark remains useful to the community, yet innovations continue to contribute gains to performance, with today's largest models achieving 90%+ top-1 accuracy. To help contextualize progress on ImageNet and provide a more meaningful evaluation for today's state-of-the-art models, we manually review and categorize every remaining mistake that a few top models make in order to provide insight into the long-tail of errors on one of the most benchmarked datasets in computer vision. We focus on the multi-label subset evaluation of ImageNet, where today's best models achieve upwards of 97% top-1 accuracy. Our analysis reveals that nearly half of the supposed mistakes are not mistakes at all, and we uncover new valid multi-labels, demonstrating that, without careful review, we are significantly underestimating the performance of these models. On the other hand, we also find that today's best models still make a significant number of mistakes (40%) that are obviously wrong to human reviewers. To calibrate future progress on ImageNet, we provide an updated multi-label evaluation set, and we curate ImageNet-Major: a 68-example "major error" slice of the obvious mistakes made by today's top models -- a slice where models should achieve near perfection, but today are far from doing so.

Via

Access Paper or Ask Questions

Plenoxels: Radiance Fields without Neural Networks

Dec 09, 2021

Alex Yu, Sara Fridovich-Keil, Matthew Tancik, Qinhong Chen, Benjamin Recht, Angjoo Kanazawa

Figure 1 for Plenoxels: Radiance Fields without Neural Networks

Figure 2 for Plenoxels: Radiance Fields without Neural Networks

Figure 3 for Plenoxels: Radiance Fields without Neural Networks

Figure 4 for Plenoxels: Radiance Fields without Neural Networks

Abstract:We introduce Plenoxels (plenoptic voxels), a system for photorealistic view synthesis. Plenoxels represent a scene as a sparse 3D grid with spherical harmonics. This representation can be optimized from calibrated images via gradient methods and regularization without any neural components. On standard, benchmark tasks, Plenoxels are optimized two orders of magnitude faster than Neural Radiance Fields with no loss in visual quality.

* For video and code, please see https://alexyu.net/plenoxels

Via

Access Paper or Ask Questions