Zhuo Hui

UnSAMFlow: Unsupervised Optical Flow Guided by Segment Anything Model

May 04, 2024

Shuai Yuan, Lei Luo, Zhuo Hui, Can Pu, Xiaoyu Xiang, Rakesh Ranjan, Denis Demandolx

Abstract: Traditional unsupervised optical flow methods are vulnerable to occlusions and motion boundaries due to lack of object-level information. Therefore, we propose UnSAMFlow, an unsupervised flow network that also leverages object information from the latest foundation model Segment Anything Model (SAM). We first include a self-supervised semantic augmentation module tailored to SAM masks. We also analyze the poor gradient landscapes of traditional smoothness losses and propose a new smoothness definition based on homography instead. A simple yet effective mask feature module has also been added to further aggregate features on the object level. With all these adaptations, our method produces clear optical flow estimation with sharp boundaries around objects, which outperforms state-of-the-art methods on both KITTI and Sintel datasets. Our method also generalizes well across domains and runs very efficiently.

* Accepted by CVPR 2024. Code is available at https://github.com/facebookresearch/UnSAMFlow

AnyFlow: Arbitrary Scale Optical Flow with Implicit Neural Representation

Mar 29, 2023

Hyunyoung Jung, Zhuo Hui, Lei Luo, Haitao Yang, Feng Liu, Sungjoo Yoo, Rakesh Ranjan, Denis Demandolx

Abstract: To apply optical flow in practice, it is often necessary to resize the input to smaller dimensions in order to reduce computational costs. However, downsizing inputs makes the estimation more challenging because objects and motion ranges become smaller. Even though recent approaches have demonstrated high-quality flow estimation, they tend to fail to accurately model small objects and precise boundaries when the input resolution is lowered, restricting their applicability to high-resolution inputs. In this paper, we introduce AnyFlow, a robust network that estimates accurate flow from images of various resolutions. By representing optical flow as a continuous coordinate-based representation, AnyFlow generates outputs at arbitrary scales from low-resolution inputs, demonstrating superior performance over prior works in capturing tiny objects with detail preservation on a wide range of scenes. We establish a new state-of-the-art performance of cross-dataset generalization on the KITTI dataset, while achieving comparable accuracy on the online benchmarks to other SOTA methods.

* CVPR 2023 (Highlight)

Learning to Separate Multiple Illuminants in a Single Image

Nov 29, 2018

Zhuo Hui, Ayan Chakrabarti, Kalyan Sunkavalli, Aswin C. Sankaranarayanan

Abstract: We present a method to separate a single image captured under two illuminants, with different spectra, into the two images corresponding to the appearance of the scene under each individual illuminant. We do this by training a deep neural network to predict the per-pixel reflectance chromaticity of the scene, which we use in conjunction with a previous flash/no-flash image-based separation algorithm to produce the final two output images. We design our reflectance chromaticity network and loss functions by incorporating intuitions from the physics of image formation. We show that this leads to significantly better performance than other single-image techniques and even approaches the quality of the two-image separation method.

Illuminant Spectra-based Source Separation Using Flash Photography

Nov 27, 2017

Zhuo Hui, Kalyan Sunkavalli, Sunil Hadap, Aswin C. Sankaranarayanan

Abstract: Real-world lighting often consists of multiple illuminants with different spectra. Separating and manipulating these illuminants in post-process is a challenging problem that requires either significant manual input or calibrated scene geometry and lighting. In this work, we leverage a flash/no-flash image pair to analyze and edit scene illuminants based on their spectral differences. We derive a novel physics-based relationship between color variations in the observed flash/no-flash intensities and the spectra and surface shading corresponding to individual scene illuminants. Our technique uses this constraint to automatically separate an image into constituent images lit by each illuminant. This separation can be used to support applications like white balancing, lighting editing, and RGB photometric stereo, where we demonstrate results that outperform state-of-the-art techniques on a wide range of images.

Shape and Spatially-Varying Reflectance Estimation From Virtual Exemplars

Sep 21, 2016

Zhuo Hui, Aswin C. Sankaranarayanan

Abstract: This paper addresses the problem of estimating the shape of objects that exhibit spatially-varying reflectance. We assume that multiple images of the object are obtained under a fixed view-point and varying illumination, i.e., the setting of photometric stereo. At the core of our techniques is the assumption that the BRDF at each pixel lies in the non-negative span of a known BRDF dictionary. This assumption enables a per-pixel surface normal and BRDF estimation framework that is computationally tractable and requires no initialization in spite of the underlying problem being non-convex. Our estimation framework first solves for the surface normal at each pixel using a variant of example-based photometric stereo. We design an efficient multi-scale search strategy for estimating the surface normal and subsequently refine this estimate using a gradient descent procedure. Given the surface normal estimate, we solve for the spatially-varying BRDF by constraining the BRDF at each pixel to be in the span of the BRDF dictionary; here, we use additional priors to further regularize the solution. A hallmark of our approach is that it requires neither iterative optimization techniques nor careful initialization, both of which are endemic to most state-of-the-art techniques. We showcase the performance of our technique on a wide range of simulated and real scenes where we outperform competing methods.

* PAMI minor revision. arXiv admin note: substantial text overlap with arXiv:1503.04265
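
The dictionary assumption in the abstract above (a per-pixel BRDF lying in the non-negative span of known dictionary atoms) can be illustrated with a small non-negative least-squares fit. This is only a sketch under stated assumptions, not the authors' estimation framework: the dictionary is random synthetic data, the solver is a generic projected gradient descent, and the names `nnls_projected_gradient`, `dictionary`, and `observation` are hypothetical.

```python
import numpy as np

def nnls_projected_gradient(dictionary, observation, iterations=5000):
    """Minimize 0.5 * ||dictionary @ c - observation||^2 subject to c >= 0.

    Generic projected gradient descent; a stand-in solver chosen for
    illustration, not the procedure used in the paper.
    """
    gram = dictionary.T @ dictionary
    rhs = dictionary.T @ observation
    # A step of 1 / (largest eigenvalue of the Gram matrix) guarantees descent.
    step = 1.0 / np.linalg.eigvalsh(gram)[-1]
    coeffs = np.zeros(dictionary.shape[1])
    for _ in range(iterations):
        gradient = gram @ coeffs - rhs
        # Gradient step followed by projection onto the non-negative orthant.
        coeffs = np.maximum(coeffs - step * gradient, 0.0)
    return coeffs

# Synthetic example: 4 hypothetical dictionary atoms sampled at 60 measurements.
rng = np.random.default_rng(0)
dictionary = rng.random((60, 4))
true_coeffs = np.array([0.7, 0.0, 0.3, 0.0])  # assumed ground-truth mixture
observation = dictionary @ true_coeffs

estimated = nnls_projected_gradient(dictionary, observation)
print(np.round(estimated, 4))
```

In this noise-free toy setting the fit recovers the mixing weights; with real measurements, noise and the unknown surface normal make the problem considerably harder.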

An Empirical Study of Dimensional Reduction Techniques for Facial Action Units Detection

Mar 25, 2016

Zhuo Hui, Wen-Sheng Chu

Abstract: Biologically inspired features, such as Gabor filters, result in very high-dimensional measurements. Does reducing the dimensionality of the feature space afford advantages beyond computational efficiency? Do some approaches to dimensionality reduction (DR) yield improved action unit detection? To answer these questions, we compared DR approaches in two relatively large databases of spontaneous facial behavior (45 participants in total with over 2 minutes of FACS-coded video per participant). Facial features were tracked and aligned using active appearance models (AAM). SIFT and Gabor features were extracted from local facial regions. We compared linear (PCA and KPCA), manifold (LPP and LLE), supervised (LDA and KDA) and hybrid approaches (LSDA) to DR with respect to AU detection. For further comparison, a no-DR control condition was included as well. Linear support vector machine classifiers with independent train and test sets were used for AU detection. AU detection was quantified using area under the ROC curve and F1. Baseline results for PCA with Gabor features were comparable with previous research. With some notable exceptions, DR improved AU detection relative to no-DR. Locality embedding approaches proved vulnerable to out-of-sample problems. Gradient-based SIFT led to better AU detection than the filter-based Gabor features. For area under the curve, few differences were found between linear and other DR approaches. For F1, results were mixed. For both metrics, the pattern of results varied among action units. These findings suggest that action unit detection may be optimized by using specific DR for specific action units. PCA and LDA were the most efficient approaches; KDA was the least efficient.

* Report on DR
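
As a rough illustration of the PCA condition with independent train and test sets described in the abstract above, the sketch below fits PCA on a training split and applies the same projection to a held-out split. It is a minimal example on synthetic data, not the study's pipeline: the feature matrix, split sizes, and the function names `pca_fit` and `pca_transform` are assumptions made for illustration.

```python
import numpy as np

def pca_fit(features, n_components):
    """Fit PCA via SVD of the mean-centered feature matrix (rows are samples)."""
    mean = features.mean(axis=0)
    centered = features - mean
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]  # orthonormal principal axes
    explained = singular_values**2 / np.sum(singular_values**2)
    return mean, components, explained[:n_components]

def pca_transform(features, mean, components):
    """Project samples onto previously fitted principal axes."""
    return (features - mean) @ components.T

# Synthetic stand-in for high-dimensional features: 50 dimensions, 3 latent factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 3))
mixing = rng.normal(size=(3, 50))
features = latent @ mixing + 0.01 * rng.normal(size=(200, 50))

# Fit on the training split only, then reuse the projection on the test split.
train, test = features[:150], features[150:]
mean, components, explained = pca_fit(train, n_components=3)
train_reduced = pca_transform(train, mean, components)
test_reduced = pca_transform(test, mean, components)
print(train_reduced.shape, test_reduced.shape)
```

Fitting on the training split alone keeps test data out of the projection, which is the point of using independent sets; the reduced features would then go to a classifier such as a linear SVM.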

A Dictionary-based Approach for Estimating Shape and Spatially-Varying Reflectance

Mar 14, 2015

Zhuo Hui, Aswin C. Sankaranarayanan

Abstract: We present a technique for estimating the shape and reflectance of an object in terms of its surface normals and spatially-varying BRDF. We assume that multiple images of the object are obtained under fixed view-point and varying illumination, i.e., the setting of photometric stereo. Assuming that the BRDF at each pixel lies in the non-negative span of a known BRDF dictionary, we derive a per-pixel surface normal and BRDF estimation framework that requires neither iterative optimization techniques nor careful initialization, both of which are endemic to most state-of-the-art techniques. We showcase the performance of our technique on a wide range of simulated and real scenes where we outperform competing methods.

* IEEE Intl. Conf. Computational Photography, 2015
