Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Pedro V. Sander

LHPF: Look back the History and Plan for the Future in Autonomous Driving

Nov 26, 2024

Sheng Wang, Yao Tian, Xiaodong Mei, Ge Sun, Jie Cheng, Fulong Ma, Pedro V. Sander, Junwei Liang

Abstract:Decision-making and planning in autonomous driving critically reflect the safety of the system, making effective planning imperative. Current imitation learning-based planning algorithms often merge historical trajectories with present observations to predict future candidate paths. However, these algorithms typically assess the current and historical plans independently, leading to discontinuities in driving intentions and an accumulation of errors with each step in a discontinuous plan. To tackle this challenge, this paper introduces LHPF, an imitation learning planner that integrates historical planning information. Our approach employs a historical intention aggregation module that pools historical planning intentions, which are then combined with a spatial query vector to decode the final planning trajectory. Furthermore, we incorporate a comfort auxiliary task to enhance the human-like quality of the driving behavior. Extensive experiments using both real-world and synthetic data demonstrate that LHPF not only surpasses existing advanced learning-based planners in planning performance but also marks the first instance of a purely learning-based planner outperforming the expert. Additionally, the application of the historical intention aggregation module across various backbones highlights the considerable potential of the proposed method. The code will be made publicly available.

Via

Access Paper or Ask Questions

3D Video Loops from Asynchronous Input

Mar 21, 2023

Li Ma, Xiaoyu Li, Jing Liao, Pedro V. Sander

Abstract:Looping videos are short video clips that can be looped endlessly without visible seams or artifacts. They provide a very attractive way to capture the dynamism of natural scenes. Existing methods have been mostly limited to 2D representations. In this paper, we take a step forward and propose a practical solution that enables an immersive experience on dynamic 3D looping scenes. The key challenge is to consider the per-view looping conditions from asynchronous input while maintaining view consistency for the 3D representation. We propose a novel sparse 3D video representation, namely Multi-Tile Video (MTV), which not only provides a view-consistent prior, but also greatly reduces memory usage, making the optimization of a 4D volume tractable. Then, we introduce a two-stage pipeline to construct the 3D looping MTV from completely asynchronous multi-view videos with no time overlap. A novel looping loss based on video temporal retargeting algorithms is adopted during the optimization to loop the 3D scene. Experiments of our framework have shown promise in successfully generating and rendering photorealistic 3D looping videos in real time even on mobile devices. The code, dataset, and live demos are available in https://limacv.github.io/VideoLoop3D_web/.

* For more information, please visit the homepage at https://limacv.github.io/VideoLoop3D_web/

Via

Access Paper or Ask Questions

Water Simulation and Rendering from a Still Photograph

Oct 05, 2022

Ryusuke Sugimoto, Mingming He, Jing Liao, Pedro V. Sander

Figure 1 for Water Simulation and Rendering from a Still Photograph

Figure 2 for Water Simulation and Rendering from a Still Photograph

Figure 3 for Water Simulation and Rendering from a Still Photograph

Figure 4 for Water Simulation and Rendering from a Still Photograph

Abstract:We propose an approach to simulate and render realistic water animation from a single still input photograph. We first segment the water surface, estimate rendering parameters, and compute water reflection textures with a combination of neural networks and traditional optimization techniques. Then we propose an image-based screen space local reflection model to render the water surface overlaid on the input image and generate real-time water animation. Our approach creates realistic results with no user intervention for a wide variety of natural scenes containing large bodies of water with different lighting and water surface conditions. Since our method provides a 3D representation of the water surface, it naturally enables direct editing of water parameters and also supports interactive applications like adding synthetic objects to the scene.

* Accepted for publication at ACM SIGGRAPH Asia (Conference Papers). Videos, demos and updates will be on the project website: https://rsugimoto.net/WaterAnimationProject/

Via

Access Paper or Ask Questions

Deblur-NeRF: Neural Radiance Fields from Blurry Images

Nov 29, 2021

Li Ma, Xiaoyu Li, Jing Liao, Qi Zhang, Xuan Wang, Jue Wang, Pedro V. Sander

Figure 1 for Deblur-NeRF: Neural Radiance Fields from Blurry Images

Figure 2 for Deblur-NeRF: Neural Radiance Fields from Blurry Images

Figure 3 for Deblur-NeRF: Neural Radiance Fields from Blurry Images

Figure 4 for Deblur-NeRF: Neural Radiance Fields from Blurry Images

Abstract:Neural Radiance Field (NeRF) has gained considerable attention recently for 3D scene reconstruction and novel view synthesis due to its remarkable synthesis quality. However, image blurriness caused by defocus or motion, which often occurs when capturing scenes in the wild, significantly degrades its reconstruction quality. To address this problem, We propose Deblur-NeRF, the first method that can recover a sharp NeRF from blurry input. We adopt an analysis-by-synthesis approach that reconstructs blurry views by simulating the blurring process, thus making NeRF robust to blurry inputs. The core of this simulation is a novel Deformable Sparse Kernel (DSK) module that models spatially-varying blur kernels by deforming a canonical sparse kernel at each spatial location. The ray origin of each kernel point is jointly optimized, inspired by the physical blurring process. This module is parameterized as an MLP that has the ability to be generalized to various blur types. Jointly optimizing the NeRF and the DSK module allows us to restore a sharp NeRF. We demonstrate that our method can be used on both camera motion blur and defocus blur: the two most common types of blur in real scenes. Evaluation results on both synthetic and real-world data show that our method outperforms several baselines. The synthetic and real datasets along with the source code will be made publicly available to facilitate future research.

Via

Access Paper or Ask Questions

Microshift: An Efficient Image Compression Algorithm for Hardware

Apr 20, 2021

Bo Zhang, Pedro V. Sander, Chi-Ying Tsui, Amine Bermak

Figure 1 for Microshift: An Efficient Image Compression Algorithm for Hardware

Figure 2 for Microshift: An Efficient Image Compression Algorithm for Hardware

Figure 3 for Microshift: An Efficient Image Compression Algorithm for Hardware

Figure 4 for Microshift: An Efficient Image Compression Algorithm for Hardware

Abstract:In this paper, we propose an image compression algorithm called Microshift. We employ an algorithm hardware co-design methodology, yielding a hardware-friendly compression approach with low power consumption. In our method, the image is first micro-shifted, then the sub-quantized values are further compressed. Two methods, the FAST and MRF model, are proposed to recover the bit-depth by exploiting the spatial correlation of natural images. Both methods can decompress images progressively. Our compression algorithm compresses images to 1.25 bits per pixel on average with PSNR of 33.16 dB, outperforming other on-chip compression algorithms. Then, we propose a hardware architecture and implement the algorithm on an FPGA and ASIC. The results on the VLSI design further validate the low hardware complexity and high power efficiency, showing our method is promising, particularly for low-power wireless vision sensor networks.

* Accepted to IEEE Transactions on Circuits and Systems for Video Technology

Via

Access Paper or Ask Questions

Let's See Clearly: Contaminant Artifact Removal for Moving Cameras

Apr 18, 2021

Xiaoyu Li, Bo Zhang, Jing Liao, Pedro V. Sander

Figure 1 for Let's See Clearly: Contaminant Artifact Removal for Moving Cameras

Figure 2 for Let's See Clearly: Contaminant Artifact Removal for Moving Cameras

Figure 3 for Let's See Clearly: Contaminant Artifact Removal for Moving Cameras

Figure 4 for Let's See Clearly: Contaminant Artifact Removal for Moving Cameras

Abstract:Contaminants such as dust, dirt and moisture adhering to the camera lens can greatly affect the quality and clarity of the resulting image or video. In this paper, we propose a video restoration method to automatically remove these contaminants and produce a clean video. Our approach first seeks to detect attention maps that indicate the regions that need to be restored. In order to leverage the corresponding clean pixels from adjacent frames, we propose a flow completion module to hallucinate the flow of the background scene to the attention regions degraded by the contaminants. Guided by the attention maps and completed flows, we propose a recurrent technique to restore the input frame by fetching clean pixels from adjacent frames. Finally, a multi-frame processing stage is used to further process the entire video sequence in order to enforce temporal consistency. The entire network is trained on a synthetic dataset that approximates the physical lighting properties of contaminant artifacts. This new dataset and our novel framework lead to our method that is able to address different contaminants and outperforms competitive restoration approaches both qualitatively and quantitatively.

* 10 pages, 11 figures

Via

Access Paper or Ask Questions

Deep Sketch-guided Cartoon Video Synthesis

Aug 10, 2020

Xiaoyu Li, Bo Zhang, Jing Liao, Pedro V. Sander

Figure 1 for Deep Sketch-guided Cartoon Video Synthesis

Figure 2 for Deep Sketch-guided Cartoon Video Synthesis

Figure 3 for Deep Sketch-guided Cartoon Video Synthesis

Figure 4 for Deep Sketch-guided Cartoon Video Synthesis

Abstract:We propose a novel framework to produce cartoon videos by fetching the color information from two input keyframes while following the animated motion guided by a user sketch. The key idea of the proposed approach is to estimate the dense cross-domain correspondence between the sketch and cartoon video frames, following by a blending module with occlusion estimation to synthesize the middle frame guided by the sketch. After that, the inputs and the synthetic frame equipped with established correspondence are fed into an arbitrary-time frame interpolation pipeline to generate and refine additional inbetween frames. Finally, a video post-processing approach is used to further improve the result. Compared to common frame interpolation methods, our approach can address frames with relatively large motion and also has the flexibility to enable users to control the generated video sequences by editing the sketch guidance. By explicitly considering the correspondence between frames and the sketch, our methods can achieve high-quality synthetic results compared with image synthesis methods. Our results show that our system generalizes well to different movie frames, achieving better results than existing solutions.

* 14 pages, 16 figures

Via

Access Paper or Ask Questions

Document Rectification and Illumination Correction using a Patch-based CNN

Sep 20, 2019

Xiaoyu Li, Bo Zhang, Jing Liao, Pedro V. Sander

Figure 1 for Document Rectification and Illumination Correction using a Patch-based CNN

Figure 2 for Document Rectification and Illumination Correction using a Patch-based CNN

Figure 3 for Document Rectification and Illumination Correction using a Patch-based CNN

Figure 4 for Document Rectification and Illumination Correction using a Patch-based CNN

Abstract:We propose a novel learning method to rectify document images with various distortion types from a single input image. As opposed to previous learning-based methods, our approach seeks to first learn the distortion flow on input image patches rather than the entire image. We then present a robust technique to stitch the patch results into the rectified document by processing in the gradient domain. Furthermore, we propose a second network to correct the uneven illumination, further improving the readability and OCR accuracy. Due to the less complex distortion present on the smaller image patches, our patch-based approach followed by stitching and illumination correction can significantly improve the overall accuracy in both the synthetic and real datasets.

* 11 pages, 10 figures

Via

Access Paper or Ask Questions

Blind Geometric Distortion Correction on Images Through Deep Learning

Sep 08, 2019

Xiaoyu Li, Bo Zhang, Pedro V. Sander, Jing Liao

Figure 1 for Blind Geometric Distortion Correction on Images Through Deep Learning

Figure 2 for Blind Geometric Distortion Correction on Images Through Deep Learning

Figure 3 for Blind Geometric Distortion Correction on Images Through Deep Learning

Figure 4 for Blind Geometric Distortion Correction on Images Through Deep Learning

Abstract:We propose the first general framework to automatically correct different types of geometric distortion in a single input image. Our proposed method employs convolutional neural networks (CNNs) trained by using a large synthetic distortion dataset to predict the displacement field between distorted images and corrected images. A model fitting method uses the CNN output to estimate the distortion parameters, achieving a more accurate prediction. The final corrected image is generated based on the predicted flow using an efficient, high-quality resampling method. Experimental results demonstrate that our algorithm outperforms traditional correction methods, and allows for interesting applications such as distortion transfer, distortion exaggeration, and co-occurring distortion correction.

* 10 pages, 11 figures, published in CVPR 2019

Via

Access Paper or Ask Questions

Deep Exemplar-based Video Colorization

Jun 24, 2019

Bo Zhang, Mingming He, Jing Liao, Pedro V. Sander, Lu Yuan, Amine Bermak, Dong Chen

Figure 1 for Deep Exemplar-based Video Colorization

Figure 2 for Deep Exemplar-based Video Colorization

Figure 3 for Deep Exemplar-based Video Colorization

Figure 4 for Deep Exemplar-based Video Colorization

Abstract:This paper presents the first end-to-end network for exemplar-based video colorization. The main challenge is to achieve temporal consistency while remaining faithful to the reference style. To address this issue, we introduce a recurrent framework that unifies the semantic correspondence and color propagation steps. Both steps allow a provided reference image to guide the colorization of every frame, thus reducing accumulated propagation errors. Video frames are colorized in sequence based on the colorization history, and its coherency is further enforced by the temporal consistency loss. All of these components, learned end-to-end, help produce realistic videos with good temporal stability. Experiments show our result is superior to the state-of-the-art methods both quantitatively and qualitatively.

Via

Access Paper or Ask Questions