Chuong Nguyen

GS-2DGS: Geometrically Supervised 2DGS for Reflective Object Reconstruction

Jun 16, 2025

Jinguang Tong, Xuesong Li, Fahira Afzal Maken, Sundaram Muthu, Lars Petersson, Chuong Nguyen, Hongdong Li

Abstract:3D modeling of highly reflective objects remains challenging due to strong view-dependent appearances. While previous SDF-based methods can recover high-quality meshes, they are often time-consuming and tend to produce over-smoothed surfaces. In contrast, 3D Gaussian Splatting (3DGS) offers the advantage of high speed and detailed real-time rendering, but extracting surfaces from the Gaussians can be noisy due to the lack of geometric constraints. To bridge the gap between these approaches, we propose a novel reconstruction method called GS-2DGS for reflective objects based on 2D Gaussian Splatting (2DGS). Our approach combines the rapid rendering capabilities of Gaussian Splatting with additional geometric information from foundation models. Experimental results on synthetic and real datasets demonstrate that our method significantly outperforms Gaussian-based techniques in terms of reconstruction and relighting and achieves performance comparable to SDF-based methods while being an order of magnitude faster. Code is available at https://github.com/hirotong/GS2DGS

* Accepted by CVPR2025

Mastering Agile Jumping Skills from Simple Practices with Iterative Learning Control

Aug 05, 2024

Chuong Nguyen, Lingfan Bao, Quan Nguyen

Abstract:Achieving precise target jumping with legged robots poses a significant challenge due to the long flight phase and the uncertainties inherent in contact dynamics and hardware. Forcefully attempting these agile motions on hardware could result in severe failures and potential damage. Motivated by these challenging problems, we propose an Iterative Learning Control (ILC) approach that aims to learn and refine jumping skills from easy to difficult, instead of directly learning these challenging tasks. We verify that learning from simplicity can enhance safety and target jumping accuracy over trials. Compared to other ILC approaches for legged locomotion, our method can tackle the problem of a long flight phase where control input is not available. In addition, our approach allows the robot to apply what it learns from a simple jumping task to accomplish more challenging tasks within a few trials directly in hardware, instead of learning from scratch. We validate the method via extensive experiments in the A1 model and hardware for various jumping tasks. Starting from a small jump (e.g., a forward leap of 40cm), our learning approach empowers the robot to accomplish a variety of challenging targets, including jumping onto a 20cm high box, jumping to a greater distance of up to 60cm, as well as performing jumps while carrying an unknown payload of 2kg. Our framework can allow the robot to reach the desired position and orientation targets with approximate errors of 1cm and 1 degree within a few trials.

* Legged Robots, Dynamic Jumping, Iterative Learning
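
The abstract above describes refining a jump over repeated trials and reusing what was learned on an easy jump for a harder one. As a rough illustration only, the sketch below uses a generic proportional-type ILC update on a made-up one-dimensional plant. The function names, the `toy_jump` model, and the gain are all assumptions for illustration and are not taken from the paper.

```python
def ilc_update(u, error, gain):
    """P-type ILC: shift the next trial's input by a fraction of this trial's error."""
    return u + gain * error


def toy_jump(u):
    """Hypothetical plant: landing distance (m) as an unknown affine function of the input."""
    return 0.8 * u + 0.05


def learn(target, u0=0.0, gain=0.5, trials=20):
    """Run repeated trials, updating the input after each one; return input and error history."""
    u = u0
    history = []
    for _ in range(trials):
        error = target - toy_jump(u)  # landing error observed after the trial
        history.append(error)
        u = ilc_update(u, error, gain)
    return u, history


# Learn an easy 0.4 m jump first, then warm-start the harder 0.6 m jump from it.
u_easy, easy_errors = learn(0.4)
u_hard, hard_errors = learn(0.6, u0=u_easy, trials=5)
```

In this toy setting the warm-started run begins with a smaller error than a run started from zero input, which is the intuition behind learning from easy to difficult.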

Adaptive-Frequency Model Learning and Predictive Control for Dynamic Maneuvers on Legged Robots

Jul 20, 2024

Chuong Nguyen, Abdullah Altawaitan, Thai Duong, Nikolay Atanasov, Quan Nguyen

Abstract:Achieving both target accuracy and robustness in dynamic maneuvers with long flight phases, such as high or long jumps, has been a significant challenge for legged robots. To address this challenge, we propose a novel learning-based control approach consisting of model learning and model predictive control (MPC) utilizing an adaptive frequency scheme. Compared to existing MPC techniques, we learn a model directly from experiments, accounting not only for leg dynamics but also for modeling errors and unknown dynamics mismatch in hardware and during contact. Additionally, learning the model with adaptive frequency allows us to cover the entire flight phase and final jumping target, enhancing the prediction accuracy of the jumping trajectory. Using the learned model, we also design an adaptive-frequency MPC to effectively leverage different jumping phases and track the target accurately. In hardware experiments with a Unitree A1 robot, we demonstrate that our approach outperforms baseline MPC using a nominal model, reducing the jumping distance error up to 8 times. We achieve jumping distance errors of less than 3 percent during continuous jumping on uneven terrain with randomly-placed perturbations of random heights (up to 4 cm or 27 percent of the robot's standing height). Our approach obtains distance errors of 1-2 cm on 34 single and continuous jumps with different jumping targets and model uncertainties.

* 8 pages, submitted to the IEEE for possible publication

SOAF: Scene Occlusion-aware Neural Acoustic Field

Jul 02, 2024

Huiyu Gao, Jiahao Ma, David Ahmedt-Aristizabal, Chuong Nguyen, Miaomiao Liu

Abstract:This paper tackles the problem of novel view audio-visual synthesis along an arbitrary trajectory in an indoor scene, given the audio-video recordings from other known trajectories of the scene. Existing methods often overlook the effect of room geometry, particularly wall occlusion to sound propagation, making them less accurate in multi-room environments. In this work, we propose a new approach called Scene Occlusion-aware Acoustic Field (SOAF) for accurate sound generation. Our approach derives a prior for sound energy field using distance-aware parametric sound-propagation modelling and then transforms it based on scene transmittance learned from the input video. We extract features from the local acoustic field centred around the receiver using a Fibonacci Sphere to generate binaural audio for novel views with a direction-aware attention mechanism. Extensive experiments on the real dataset RWAVS and the synthetic dataset SoundSpaces demonstrate that our method outperforms previous state-of-the-art techniques in audio generation. Project page: https://github.com/huiyu-gao/SOAF/.
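
The abstract above mentions sampling directions around the receiver with a Fibonacci Sphere. For readers unfamiliar with the term, here is a minimal sketch of the standard Fibonacci lattice on a unit sphere. It is a generic construction and an assumption about what is meant, not code from the SOAF project, and the function name is hypothetical.

```python
import math


def fibonacci_sphere(n):
    """Return n roughly evenly spread unit vectors using the Fibonacci (golden-angle) lattice."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for i in range(n):
        z = 1.0 - 2.0 * (i + 0.5) / n     # heights evenly spaced in (-1, 1)
        r = math.sqrt(1.0 - z * z)        # radius of the horizontal circle at height z
        theta = golden_angle * i          # rotate by the golden angle at each step
        points.append((r * math.cos(theta), r * math.sin(theta), z))
    return points


directions = fibonacci_sphere(64)
```

Each returned tuple is a unit direction, so it can be scaled by a chosen radius and offset by the receiver position to obtain sample locations.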

HashPoint: Accelerated Point Searching and Sampling for Neural Rendering

Apr 22, 2024

Jiahao Ma, Miaomiao Liu, David Ahmedt-Aristizabal, Chuong Nguyen

Abstract:In this paper, we address the problem of efficient point searching and sampling for volume neural rendering. Within this realm, two typical approaches are employed: rasterization and ray tracing. The rasterization-based methods enable real-time rendering at the cost of increased memory and lower fidelity. In contrast, the ray-tracing-based methods yield superior quality but demand longer rendering time. We solve this problem by our HashPoint method combining these two strategies, leveraging rasterization for efficient point searching and sampling, and ray marching for rendering. Our method optimizes point searching by rasterizing points within the camera's view, organizing them in a hash table, and facilitating rapid searches. Notably, we accelerate the rendering process by adaptive sampling on the primary surface encountered by the ray. Our approach yields substantial speed-up for a range of state-of-the-art ray-tracing-based methods, maintaining equivalent or superior accuracy across synthetic and real test datasets. The code will be available at https://jiahao-ma.github.io/hashpoint/.

* The IEEE/CVF Conference on Computer Vision and Pattern Recognition 2024
* CVPR2024 Highlight

Homography Guided Temporal Fusion for Road Line and Marking Segmentation

Apr 11, 2024

Shan Wang, Chuong Nguyen, Jiawei Liu, Kaihao Zhang, Wenhan Luo, Yanhao Zhang, Sundaram Muthu, Fahira Afzal Maken, Hongdong Li

Abstract:Reliable segmentation of road lines and markings is critical to autonomous driving. Our work is motivated by the observations that road lines and markings are (1) frequently occluded in the presence of moving vehicles, shadow, and glare and (2) highly structured with low intra-class shape variance and overall high appearance consistency. To solve these issues, we propose a Homography Guided Fusion (HomoFusion) module to exploit temporally-adjacent video frames for complementary cues facilitating the correct classification of the partially occluded road lines or markings. To reduce computational complexity, a novel surface normal estimator is proposed to establish spatial correspondences between the sampled frames, allowing the HomoFusion module to perform a pixel-to-pixel attention mechanism in updating the representation of the occluded road lines or markings. Experiments on ApolloScape, a large-scale lane mark segmentation dataset, and ApolloScape Night with artificial simulated night-time road conditions, demonstrate that our method outperforms other existing SOTA lane mark segmentation models with less than 9% of their parameters and computational complexity. We show that exploiting available camera intrinsic data and ground plane assumption for cross-frame correspondence can lead to a light-weight network with significantly improved performances in speed and accuracy. We also prove the versatility of our HomoFusion approach by applying it to the problem of water puddle segmentation and achieving SOTA performance.

* Accepted by ICCV 2023
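
The abstract above relies on camera intrinsics and a ground-plane assumption to relate pixels across frames. As a hedged illustration, the sketch below builds the textbook plane-induced homography with NumPy. The sign convention, the function names, and the example camera values are assumptions chosen for this sketch and are not taken from the HomoFusion code.

```python
import numpy as np


def plane_homography(K, R, t, n, d):
    """Homography taking frame-1 pixels to frame-2 pixels for points on the plane n . X = d.

    Assumed convention: X2 = R @ X1 + t, with the unit normal n and distance d given in frame 1.
    """
    H = K @ (R + np.outer(t, n) / d) @ np.linalg.inv(K)
    return H / H[2, 2]  # fix the arbitrary projective scale


def project(K, X):
    """Pinhole projection of a 3D point in camera coordinates to pixel coordinates."""
    x = K @ X
    return x[:2] / x[2]


def warp_pixel(H, uv):
    """Apply a homography to a single pixel."""
    p = H @ np.array([uv[0], uv[1], 1.0])
    return p[:2] / p[2]


# Illustrative values: camera 1.5 m above the ground (y axis pointing down), moving 1 m forward.
K = np.array([[1000.0, 0.0, 640.0], [0.0, 1000.0, 360.0], [0.0, 0.0, 1.0]])
R = np.eye(3)
t = np.array([0.0, 0.0, -1.0])
n = np.array([0.0, 1.0, 0.0])
d = 1.5

H = plane_homography(K, R, t, n, d)
X1 = np.array([0.5, 1.5, 10.0])   # a ground point seen from frame 1
X2 = R @ X1 + t                   # the same point in frame 2
```

For any point that truly lies on the assumed plane, warping its frame-1 pixel with `H` lands on its frame-2 pixel, which is what makes pixel-to-pixel correspondence possible without a dense search.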

Syn3DWound: A Synthetic Dataset for 3D Wound Bed Analysis

Nov 27, 2023

Léo Lebrat, Rodrigo Santa Cruz, Remi Chierchia, Yulia Arzhaeva, Mohammad Ali Armin, Joshua Goldsmith, Jeremy Oorloff, Prithvi Reddy, Chuong Nguyen, Lars Petersson (+5 more)

Abstract:Wound management poses a significant challenge, particularly for bedridden patients and the elderly. Accurate diagnostic and healing monitoring can significantly benefit from modern image analysis, providing accurate and precise measurements of wounds. Despite several existing techniques, the shortage of expansive and diverse training datasets remains a significant obstacle to constructing machine learning-based frameworks. This paper introduces Syn3DWound, an open-source dataset of high-fidelity simulated wounds with 2D and 3D annotations. We propose baseline methods and a benchmarking framework for automated 3D morphometry analysis and 2D/3D wound segmentation.

Seeing Through the Glass: Neural 3D Reconstruction of Object Inside a Transparent Container

Mar 24, 2023

Jinguang Tong, Sundaram Muthu, Fahira Afzal Maken, Chuong Nguyen, Hongdong Li

Abstract:In this paper, we define a new problem of recovering the 3D geometry of an object confined in a transparent enclosure. We also propose a novel method for solving this challenging problem. Transparent enclosures pose challenges of multiple light reflections and refractions at the interface between different propagation media e.g. air or glass. These multiple reflections and refractions cause serious image distortions which invalidate the single viewpoint assumption. Hence the 3D geometry of such objects cannot be reliably reconstructed using existing methods, such as traditional structure from motion or modern neural reconstruction methods. We solve this problem by explicitly modeling the scene as two distinct sub-spaces, inside and outside the transparent enclosure. We use an existing neural reconstruction method (NeuS) that implicitly represents the geometry and appearance of the inner subspace. In order to account for complex light interactions, we develop a hybrid rendering strategy that combines volume rendering with ray tracing. We then recover the underlying geometry and appearance of the model by minimizing the difference between the real and hybrid rendered images. We evaluate our method on both synthetic and real data. Experiment results show that our method outperforms the state-of-the-art (SOTA) methods. Codes and data will be available at https://github.com/hirotong/ReNeuS

* Accepted to CVPR2023

Smart Headset, Computer Vision and Machine Learning for Efficient Prawn Farm Management

Oct 14, 2022

Mingze Xi, Ashfaqur Rahman, Chuong Nguyen, Stuart Arnold, John McCulloch

Abstract:Understanding the growth and distribution of the prawns is critical for optimising the feed and harvest strategies. An inadequate understanding of prawn growth can lead to reduced financial gain, for example, crops are harvested too early. The key to maintaining a good understanding of prawn growth is frequent sampling. However, the most commonly adopted sampling practice, the cast net approach, is unable to sample the prawns at a high frequency as it is expensive and laborious. An alternative approach is to sample prawns from feed trays that farm workers inspect each day. This will allow growth data collection at a high frequency (each day). But measuring prawns manually each day is a laborious task. In this article, we propose a new approach that utilises smart glasses, depth camera, computer vision and machine learning to detect prawn distribution and growth from feed trays. A smart headset was built to allow farmers to collect prawn data while performing daily feed tray checks. A computer vision + machine learning pipeline was developed and demonstrated to detect the growth trends of prawns in 4 prawn ponds over a growing season.

* Submitted to Elsevier Aquacultural Engineering

Multiview Detection with Cardboard Human Modeling

Jul 10, 2022

Jiahao Ma, Zicheng Duan, Yunzhong Hou, Liang Zheng, Chuong Nguyen

Abstract:Multiview detection uses multiple calibrated cameras with overlapping fields of views to locate occluded pedestrians. In this field, existing methods typically adopt a "human modeling - aggregation" strategy. To find robust pedestrian representations, some intuitively use locations of detected 2D bounding boxes, while others use entire frame features projected to the ground plane. However, the former does not consider human appearance and leads to many ambiguities, and the latter suffers from projection errors due to the lack of accurate height of the human torso and head. In this paper, we propose a new pedestrian representation scheme based on human point clouds modeling. Specifically, using ray tracing for holistic human depth estimation, we model pedestrians as upright, thin cardboard point clouds on the ground. Then, we aggregate the point clouds of the pedestrian cardboard across multiple views for a final decision. Compared with existing representations, the proposed method explicitly leverages human appearance and reduces projection errors significantly by relatively accurate height estimation. On two standard evaluation benchmarks, the proposed method achieves very competitive results.

* The thesis is not perfect enough
