Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Yunhao Ba

All-day Depth Completion

May 27, 2024

Vadim Ezhov, Hyoungseob Park, Zhaoyang Zhang, Rishi Upadhyay, Howard Zhang, Chethan Chinder Chandrappa, Achuta Kadambi, Yunhao Ba, Julie Dorsey, Alex Wong

Abstract:We propose a method for depth estimation under different illumination conditions, i.e., day and night time. As photometry is uninformative in regions under low-illumination, we tackle the problem through a multi-sensor fusion approach, where we take as input an additional synchronized sparse point cloud (i.e., from a LiDAR) projected onto the image plane as a sparse depth map, along with a camera image. The crux of our method lies in the use of the abundantly available synthetic data to first approximate the 3D scene structure by learning a mapping from sparse to (coarse) dense depth maps along with their predictive uncertainty - we term this, SpaDe. In poorly illuminated regions where photometric intensities do not afford the inference of local shape, the coarse approximation of scene depth serves as a prior; the uncertainty map is then used with the image to guide refinement through an uncertainty-driven residual learning (URL) scheme. The resulting depth completion network leverages complementary strengths from both modalities - depth is sparse but insensitive to illumination and in metric scale, and image is dense but sensitive with scale ambiguity. SpaDe can be used in a plug-and-play fashion, which allows for 25% improvement when augmented onto existing methods to preprocess sparse depth. We demonstrate URL on the nuScenes dataset where we improve over all baselines by an average 11.65% in all-day scenarios, 11.23% when tested specifically for daytime, and 13.12% for nighttime scenes.

* 8 pages, 4 figures

Via

Access Paper or Ask Questions

GT-Rain Single Image Deraining Challenge Report

Mar 18, 2024

Howard Zhang, Yunhao Ba, Ethan Yang, Rishi Upadhyay, Alex Wong, Achuta Kadambi, Yun Guo, Xueyao Xiao, Xiaoxiong Wang, Yi Li(+13 more)

Figure 1 for GT-Rain Single Image Deraining Challenge Report

Figure 2 for GT-Rain Single Image Deraining Challenge Report

Figure 3 for GT-Rain Single Image Deraining Challenge Report

Abstract:This report reviews the results of the GT-Rain challenge on single image deraining at the UG2+ workshop at CVPR 2023. The aim of this competition is to study the rainy weather phenomenon in real world scenarios, provide a novel real world rainy image dataset, and to spark innovative ideas that will further the development of single image deraining methods on real images. Submissions were trained on the GT-Rain dataset and evaluated on an extension of the dataset consisting of 15 additional scenes. Scenes in GT-Rain are comprised of real rainy image and ground truth image captured moments after the rain had stopped. 275 participants were registered in the challenge and 55 competed in the final testing phase.

Via

Access Paper or Ask Questions

WeatherProof: A Paired-Dataset Approach to Semantic Segmentation in Adverse Weather

Dec 15, 2023

Blake Gella, Howard Zhang, Rishi Upadhyay, Tiffany Chang, Matthew Waliman, Yunhao Ba, Alex Wong, Achuta Kadambi

Figure 1 for WeatherProof: A Paired-Dataset Approach to Semantic Segmentation in Adverse Weather

Figure 2 for WeatherProof: A Paired-Dataset Approach to Semantic Segmentation in Adverse Weather

Figure 3 for WeatherProof: A Paired-Dataset Approach to Semantic Segmentation in Adverse Weather

Figure 4 for WeatherProof: A Paired-Dataset Approach to Semantic Segmentation in Adverse Weather

Abstract:The introduction of large, foundational models to computer vision has led to drastically improved performance on the task of semantic segmentation. However, these existing methods exhibit a large performance drop when testing on images degraded by weather conditions such as rain, fog, or snow. We introduce a general paired-training method that can be applied to all current foundational model architectures that leads to improved performance on images in adverse weather conditions. To this end, we create the WeatherProof Dataset, the first semantic segmentation dataset with accurate clear and adverse weather image pairs, which not only enables our new training paradigm, but also improves the evaluation of the performance gap between clear and degraded segmentation. We find that training on these paired clear and adverse weather frames which share an underlying scene results in improved performance on adverse weather data. With this knowledge, we propose a training pipeline which accentuates the advantages of paired-data training using consistency losses and language guidance, which leads to performance improvements by up to 18.4% as compared to standard training procedures.

Via

Access Paper or Ask Questions

Enhancing Diffusion Models with 3D Perspective Geometry Constraints

Dec 01, 2023

Rishi Upadhyay, Howard Zhang, Yunhao Ba, Ethan Yang, Blake Gella, Sicheng Jiang, Alex Wong, Achuta Kadambi

Abstract:While perspective is a well-studied topic in art, it is generally taken for granted in images. However, for the recent wave of high-quality image synthesis methods such as latent diffusion models, perspective accuracy is not an explicit requirement. Since these methods are capable of outputting a wide gamut of possible images, it is difficult for these synthesized images to adhere to the principles of linear perspective. We introduce a novel geometric constraint in the training process of generative models to enforce perspective accuracy. We show that outputs of models trained with this constraint both appear more realistic and improve performance of downstream models trained on generated images. Subjective human trials show that images generated with latent diffusion models trained with our constraint are preferred over images from the Stable Diffusion V2 model 70% of the time. SOTA monocular depth estimation models such as DPT and PixelFormer, fine-tuned on our images, outperform the original models trained on real images by up to 7.03% in RMSE and 19.3% in SqRel on the KITTI test set for zero-shot transfer.

* Project Webpage: http://visual.ee.ucla.edu/diffusionperspective.htm/

Via

Access Paper or Ask Questions

MIME: Minority Inclusion for Majority Group Enhancement of AI Performance

Sep 01, 2022

Pradyumna Chari, Yunhao Ba, Shreeram Athreya, Achuta Kadambi

Figure 1 for MIME: Minority Inclusion for Majority Group Enhancement of AI Performance

Figure 2 for MIME: Minority Inclusion for Majority Group Enhancement of AI Performance

Abstract:Several papers have rightly included minority groups in artificial intelligence (AI) training data to improve test inference for minority groups and/or society-at-large. A society-at-large consists of both minority and majority stakeholders. A common misconception is that minority inclusion does not increase performance for majority groups alone. In this paper, we make the surprising finding that including minority samples can improve test error for the majority group. In other words, minority group inclusion leads to majority group enhancements (MIME) in performance. A theoretical existence proof of the MIME effect is presented and found to be consistent with experimental results on six different datasets. Project webpage: https://visual.ee.ucla.edu/mime.htm/

Via

Access Paper or Ask Questions

Towards Ground Truth for Single Image Deraining

Jun 22, 2022

Yunhao Ba, Howard Zhang, Ethan Yang, Akira Suzuki, Arnold Pfahnl, Chethan Chinder Chandrappa, Celso de Melo, Suya You, Stefano Soatto, Alex Wong(+1 more)

Figure 1 for Towards Ground Truth for Single Image Deraining

Figure 2 for Towards Ground Truth for Single Image Deraining

Figure 3 for Towards Ground Truth for Single Image Deraining

Figure 4 for Towards Ground Truth for Single Image Deraining

Abstract:We propose a large-scale dataset of real-world rainy and clean image pairs and a method to remove degradations, induced by rain streaks and rain accumulation, from the image. As there exists no real-world dataset for deraining, current state-of-the-art methods rely on synthetic data and thus are limited by the sim2real domain gap; moreover, rigorous evaluation remains a challenge due to the absence of a real paired dataset. We fill this gap by collecting the first real paired deraining dataset through meticulous control of non-rain variations. Our dataset enables paired training and quantitative evaluation for diverse real-world rain phenomena (e.g. rain streaks and rain accumulation). To learn a representation invariant to rain phenomena, we propose a deep neural network that reconstructs the underlying scene by minimizing a rain-invariant loss between rainy and clean images. Extensive experiments demonstrate that the proposed dataset benefits existing derainers, and our model can outperform the state-of-the-art deraining methods on real rainy images under various conditions.

Via

Access Paper or Ask Questions

Overcoming Difficulty in Obtaining Dark-skinned Subjects for Remote-PPG by Synthetic Augmentation

Jun 10, 2021

Yunhao Ba, Zhen Wang, Kerim Doruk Karinca, Oyku Deniz Bozkurt, Achuta Kadambi

Figure 1 for Overcoming Difficulty in Obtaining Dark-skinned Subjects for Remote-PPG by Synthetic Augmentation

Figure 2 for Overcoming Difficulty in Obtaining Dark-skinned Subjects for Remote-PPG by Synthetic Augmentation

Figure 3 for Overcoming Difficulty in Obtaining Dark-skinned Subjects for Remote-PPG by Synthetic Augmentation

Figure 4 for Overcoming Difficulty in Obtaining Dark-skinned Subjects for Remote-PPG by Synthetic Augmentation

Abstract:Camera-based remote photoplethysmography (rPPG) provides a non-contact way to measure physiological signals (e.g., heart rate) using facial videos. Recent deep learning architectures have improved the accuracy of such physiological measurement significantly, yet they are restricted by the diversity of the annotated videos. The existing datasets MMSE-HR, AFRL, and UBFC-RPPG contain roughly 10%, 0%, and 5% of dark-skinned subjects respectively. The unbalanced training sets result in a poor generalization capability to unseen subjects and lead to unwanted bias toward different demographic groups. In Western academia, it is regrettably difficult in a university setting to collect data on these dark-skinned subjects. Here we show a first attempt to overcome the lack of dark-skinned subjects by synthetic augmentation. A joint optimization framework is utilized to translate real videos from light-skinned subjects to dark skin tones while retaining their pulsatile signals. In the experiment, our method exhibits around 31% reduction in mean absolute error for the dark-skinned group and 46% improvement on bias mitigation for all the groups, as compared with the previous work trained with just real samples.

Via

Access Paper or Ask Questions

Visual Physics: Discovering Physical Laws from Videos

Nov 27, 2019

Pradyumna Chari, Chinmay Talegaonkar, Yunhao Ba, Achuta Kadambi

Figure 1 for Visual Physics: Discovering Physical Laws from Videos

Figure 2 for Visual Physics: Discovering Physical Laws from Videos

Figure 3 for Visual Physics: Discovering Physical Laws from Videos

Figure 4 for Visual Physics: Discovering Physical Laws from Videos

Abstract:In this paper, we teach a machine to discover the laws of physics from video streams. We assume no prior knowledge of physics, beyond a temporal stream of bounding boxes. The problem is very difficult because a machine must learn not only a governing equation (e.g. projectile motion) but also the existence of governing parameters (e.g. velocities). We evaluate our ability to discover physical laws on videos of elementary physical phenomena, such as projectile motion or circular motion. These elementary tasks have textbook governing equations and enable ground truth verification of our approach.

Via

Access Paper or Ask Questions

Blending Diverse Physical Priors with Neural Networks

Oct 01, 2019

Yunhao Ba, Guangyuan Zhao, Achuta Kadambi

Figure 1 for Blending Diverse Physical Priors with Neural Networks

Figure 2 for Blending Diverse Physical Priors with Neural Networks

Figure 3 for Blending Diverse Physical Priors with Neural Networks

Figure 4 for Blending Diverse Physical Priors with Neural Networks

Abstract:Machine learning in context of physical systems merits a re-examination of the learning strategy. In addition to data, one can leverage a vast library of physical prior models (e.g. kinematics, fluid flow, etc) to perform more robust inference. The nascent sub-field of \emph{physics-based learning} (PBL) studies the blending of neural networks with physical priors. While previous PBL algorithms have been applied successfully to specific tasks, it is hard to generalize existing PBL methods to a wide range of physics-based problems. Such generalization would require an architecture that can adapt to variations in the correctness of the physics, or in the quality of training data. No such architecture exists. In this paper, we aim to generalize PBL, by making a first attempt to bring neural architecture search (NAS) to the realm of PBL. We introduce a new method known as physics-based neural architecture search (PhysicsNAS) that is a top-performer across a diverse range of quality in the physical model and the dataset.

Via

Access Paper or Ask Questions

Physics-based Neural Networks for Shape from Polarization

Mar 25, 2019

Yunhao Ba, Rui Chen, Yiqin Wang, Lei Yan, Boxin Shi, Achuta Kadambi

Figure 1 for Physics-based Neural Networks for Shape from Polarization

Figure 2 for Physics-based Neural Networks for Shape from Polarization

Figure 3 for Physics-based Neural Networks for Shape from Polarization

Figure 4 for Physics-based Neural Networks for Shape from Polarization

Abstract:How should prior knowledge from physics inform a neural network solution? We study the blending of physics and deep learning in the context of Shape from Polarization (SfP). The classic SfP problem recovers an object's shape from polarized photographs of the scene. The SfP problem is special because the physical models are only approximate. Previous attempts to solve SfP have been purely model-based, and are susceptible to errors when real-world conditions deviate from the idealized physics. In our solution, there is a subtlety to combining physics and neural networks. Our final solution blends deep learning with synthetic renderings (derived from physics) in the framework of a two-stage encoder. The lessons learned from this exemplary problem foreshadow the future impact of physics-based learning.

Via

Access Paper or Ask Questions