Meida Chen

Deformable Beta Splatting

Jan 27, 2025

Rong Liu, Dylan Sun, Meida Chen, Yue Wang, Andrew Feng

Abstract:3D Gaussian Splatting (3DGS) has advanced radiance field reconstruction by enabling real-time rendering. However, its reliance on Gaussian kernels for geometry and low-order Spherical Harmonics (SH) for color encoding limits its ability to capture complex geometries and diverse colors. We introduce Deformable Beta Splatting (DBS), a deformable and compact approach that enhances both geometry and color representation. DBS replaces Gaussian kernels with deformable Beta Kernels, which offer bounded support and adaptive frequency control to capture fine geometric details with higher fidelity while achieving better memory efficiency. In addition, we extend the Beta Kernel to color encoding, which facilitates improved representation of diffuse and specular components, yielding superior results compared to SH-based methods. Furthermore, unlike prior densification techniques that depend on Gaussian properties, we mathematically prove that adjusting regularized opacity alone ensures distribution-preserved Markov chain Monte Carlo (MCMC), independent of the splatting kernel type. Experimental results demonstrate that DBS achieves state-of-the-art visual quality while utilizing only 45% of the parameters and rendering 1.5x faster than 3DGS-based methods. Notably, for the first time, splatting-based methods outperform state-of-the-art Neural Radiance Fields, highlighting the superior performance and efficiency of DBS for real-time radiance field rendering.

SplatMAP: Online Dense Monocular SLAM with 3D Gaussian Splatting

Jan 14, 2025

Yue Hu, Rong Liu, Meida Chen, Peter Beerel, Andrew Feng

Abstract:Achieving high-fidelity 3D reconstruction from monocular video remains challenging due to the inherent limitations of traditional methods like Structure-from-Motion (SfM) and monocular SLAM in accurately capturing scene details. While differentiable rendering techniques such as Neural Radiance Fields (NeRF) address some of these challenges, their high computational costs make them unsuitable for real-time applications. Additionally, existing 3D Gaussian Splatting (3DGS) methods often focus on photometric consistency, neglecting geometric accuracy and failing to exploit SLAM's dynamic depth and pose updates for scene refinement. We propose a framework integrating dense SLAM with 3DGS for real-time, high-fidelity dense reconstruction. Our approach introduces SLAM-Informed Adaptive Densification, which dynamically updates and densifies the Gaussian model by leveraging dense point clouds from SLAM. Additionally, we incorporate Geometry-Guided Optimization, which combines edge-aware geometric constraints and photometric consistency to jointly optimize the appearance and geometry of the 3DGS scene representation, enabling detailed and accurate SLAM mapping reconstruction. Experiments on the Replica and TUM-RGBD datasets demonstrate the effectiveness of our approach, achieving state-of-the-art results among monocular systems. Specifically, our method achieves a PSNR of 36.864, SSIM of 0.985, and LPIPS of 0.040 on Replica, representing improvements of 10.7%, 6.4%, and 49.4%, respectively, over the previous SOTA. On TUM-RGBD, our method outperforms the closest baseline by 10.2%, 6.6%, and 34.7% in the same metrics. These results highlight the potential of our framework in bridging the gap between photometric and geometric dense 3D scene representations, paving the way for practical and efficient monocular dense reconstruction.
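
The PSNR figures quoted in the abstract above follow the standard peak signal-to-noise ratio definition. As a minimal sketch of how such a score is typically computed between a reference image and a rendered image (the function name `psnr`, the `max_value` parameter, and the assumption that pixel values lie in [0, 1] are illustrative and not taken from the paper):

```python
import numpy as np


def psnr(reference, rendered, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two images of equal shape.

    Assumes pixel intensities are scaled to [0, max_value]; this scaling is
    an assumption of the sketch, not something stated in the abstract.
    """
    reference = np.asarray(reference, dtype=np.float64)
    rendered = np.asarray(rendered, dtype=np.float64)
    # Mean squared error over all pixels and channels.
    mse = np.mean((reference - rendered) ** 2)
    if mse == 0.0:
        # Identical images: PSNR is unbounded.
        return float("inf")
    return float(10.0 * np.log10(max_value ** 2 / mse))
```

Higher values mean the rendering is closer to the reference. SSIM and LPIPS, the other two metrics in the abstract, need a structural comparison and a learned network respectively, so they are not sketched here.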

Open-Vocabulary High-Resolution 3D (OVHR3D) Data Segmentation and Annotation Framework

Dec 09, 2024

Jiuyi Xu, Meida Chen, Andrew Feng, Yangming Shi, Zifan Yu

Abstract:In the domain of U.S. Army modeling and simulation, the availability of high-quality annotated 3D data is pivotal to creating virtual environments for training and simulations. Traditional methodologies for 3D semantic and instance segmentation, such as KpConv, RandLA, Mask3D, etc., are designed to train on extensive labeled datasets to obtain satisfactory performance in practical tasks. This requirement presents a significant challenge, given the inherent scarcity of manually annotated 3D datasets, particularly for military use cases. Recognizing this gap, our previous research leveraged the manually annotated databases of the One World Terrain data repository, as showcased at IITSEC 2019 and 2021, to enrich the training dataset for deep learning models. However, collecting and annotating large-scale 3D data for specific tasks remains costly and inefficient. To this end, the objective of this research is to design and develop a comprehensive and efficient framework for 3D segmentation tasks to assist in 3D data annotation. This framework integrates Grounding DINO and the Segment Anything Model, augmented by an enhancement in 2D image rendering via 3D mesh. Furthermore, the authors have also developed a user-friendly interface that facilitates the 3D annotation process, offering intuitive visualization of rendered images and the 3D point cloud.

* Interservice/Industry Training, Simulation and Education Conference (2024)

AtomGS: Atomizing Gaussian Splatting for High-Fidelity Radiance Field

May 22, 2024

Rong Liu, Rui Xu, Yue Hu, Meida Chen, Andrew Feng

Abstract:3D Gaussian Splatting (3DGS) has recently advanced radiance field reconstruction by offering superior capabilities for novel view synthesis and real-time rendering speed. However, its strategy of blending optimization and adaptive density control might lead to sub-optimal results; it can sometimes yield noisy geometry and blurry artifacts due to prioritizing optimizing large Gaussians at the cost of adequately densifying smaller ones. To address this, we introduce AtomGS, consisting of Atomized Proliferation and Geometry-Guided Optimization. The Atomized Proliferation constrains ellipsoid Gaussians of various sizes into more uniform-sized Atom Gaussians. The strategy enhances the representation of areas with fine features by placing greater emphasis on densification in accordance with scene details. In addition, we propose a Geometry-Guided Optimization approach that incorporates an Edge-Aware Normal Loss. This optimization method effectively smooths flat surfaces while preserving intricate details. Our evaluation shows that AtomGS outperforms existing state-of-the-art methods in rendering quality. Additionally, it achieves competitive accuracy in geometry reconstruction and offers a significant improvement in training speed over other SDF-based methods. More interactive demos can be found on our website (https://rongliu-leo.github.io/AtomGS/).

TokenMotion: Motion-Guided Vision Transformer for Video Camouflaged Object Detection Via Learnable Token Selection

Nov 05, 2023

Zifan Yu, Erfan Bank Tavakoli, Meida Chen, Suya You, Raghuveer Rao, Sanjeev Agarwal, Fengbo Ren

Abstract:The area of Video Camouflaged Object Detection (VCOD) presents unique challenges in the field of computer vision due to texture similarities between target objects and their surroundings, as well as irregular motion patterns caused by both objects and camera movement. In this paper, we introduce TokenMotion (TMNet), which employs a transformer-based model to enhance VCOD by extracting motion-guided features using a learnable token selection. Evaluated on the challenging MoCA-Mask dataset, TMNet achieves state-of-the-art performance in VCOD. It outperforms the existing state-of-the-art method by a 12.8% improvement in weighted F-measure, an 8.4% enhancement in S-measure, and a 10.7% boost in mean IoU. The results demonstrate the benefits of utilizing motion-guided features via learnable token selection within a transformer-based framework to tackle the intricate task of VCOD.

* 5 pages

TransUPR: A Transformer-based Uncertain Point Refiner for LiDAR Point Cloud Semantic Segmentation

Feb 20, 2023

Zifan Yu, Meida Chen, Zhikang Zhang, Suya You, Fengbo Ren

Abstract:In this work, we target the problem of uncertain point refinement for image-based LiDAR point cloud semantic segmentation (LiDAR PCSS). This problem mainly results from the boundary-blurring problem of convolutional neural networks (CNNs) and the quantization loss of spherical projection, which are often hard to avoid for common image-based LiDAR PCSS approaches. We propose a plug-and-play transformer-based uncertain point refiner (TransUPR) to address the problem. Through local feature aggregation, uncertain point localization, and a self-attention-based transformer design, TransUPR, integrated into an existing range image-based LiDAR PCSS approach (e.g., CENet), achieves state-of-the-art performance (68.2% mIoU) on the Semantic-KITTI benchmark, which is an improvement of 0.6% in mIoU.
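
The mIoU score reported in the abstract above is the mean, over classes, of intersection over union between predicted and ground-truth labels. A minimal sketch of that computation from per-point labels follows; the function name `mean_iou` and its arguments are hypothetical, and the sketch does not reproduce benchmark-specific rules such as ignored classes.

```python
import numpy as np


def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union from integer class labels.

    pred and target are integer arrays of the same length with values in
    [0, num_classes). Classes absent from both arrays are skipped.
    """
    pred = np.asarray(pred).ravel()
    target = np.asarray(target).ravel()
    # confusion[i, j] counts points whose true class is i and predicted class is j.
    confusion = np.bincount(
        target * num_classes + pred, minlength=num_classes ** 2
    ).reshape(num_classes, num_classes)
    intersection = np.diag(confusion)
    # Union = predicted-as-class + truly-in-class - correctly predicted.
    union = confusion.sum(axis=0) + confusion.sum(axis=1) - intersection
    valid = union > 0
    return float(np.mean(intersection[valid] / union[valid]))
```

For example, with predictions `[0, 0, 1, 1]` and ground truth `[0, 1, 1, 1]`, class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean is 7/12.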

* 5 pages

STPLS3D: A Large-Scale Synthetic and Real Aerial Photogrammetry 3D Point Cloud Dataset

Mar 17, 2022

Meida Chen, Qingyong Hu, Thomas Hugues, Andrew Feng, Yu Hou, Kyle McCullough, Lucio Soibelman

Abstract:Although various 3D datasets with different functions and scales have been proposed recently, it remains challenging for individuals to complete the whole pipeline of large-scale data collection, sanitization, and annotation. Moreover, the created datasets usually suffer from extremely imbalanced class distribution or partial low-quality data samples. Motivated by this, we explore the procedurally synthetic 3D data generation paradigm to equip individuals with the full capability of creating large-scale annotated photogrammetry point clouds. Specifically, we introduce a synthetic aerial photogrammetry point cloud generation pipeline that takes full advantage of open geospatial data sources and off-the-shelf commercial packages. Unlike generating synthetic data in virtual games, where the simulated data usually have limited gaming environments created by artists, the proposed pipeline simulates the reconstruction process of the real environment by following the same UAV flight pattern on different synthetic terrain shapes and building densities, which ensures similar quality, noise pattern, and diversity with real data. In addition, the precise semantic and instance annotations can be generated fully automatically, avoiding the expensive and time-consuming manual annotation. Based on the proposed pipeline, we present a richly-annotated synthetic 3D aerial photogrammetry point cloud dataset, termed STPLS3D, with more than 16 km$^2$ of landscapes and up to 18 fine-grained semantic categories. For verification purposes, we also provide a parallel dataset collected from four areas in the real environment. Extensive experiments conducted on our datasets demonstrate the effectiveness and quality of the proposed synthetic dataset.

Ground material classification and for UAV-based photogrammetric 3D data A 2D-3D Hybrid Approach

Sep 24, 2021

Meida Chen, Andrew Feng, Yu Hou, Kyle McCullough, Pratusha Bhuvana Prasad, Lucio Soibelman

Abstract:In recent years, photogrammetry has been widely used in many areas to create photorealistic 3D virtual data representing the physical environment. The innovation of small unmanned aerial vehicles (sUAVs) has provided additional high-resolution imaging capabilities with low cost for mapping a relatively large area of interest. These cutting-edge technologies have caught the US Army and Navy's attention for the purpose of rapid 3D battlefield reconstruction, virtual training, and simulations. Our previous works have demonstrated the importance of information extraction from the derived photogrammetric data to create semantic-rich virtual environments (Chen et al., 2019). For example, an increase of simulation realism and fidelity was achieved by segmenting and replacing photogrammetric trees with game-ready tree models. In this work, we further investigated the semantic information extraction problem and focused on the ground material segmentation and object detection tasks. The main innovation of this work was that we leveraged both the original 2D images and the derived 3D photogrammetric data to overcome the challenges faced when using each individual data source. For ground material segmentation, we utilized an existing convolutional neural network architecture (i.e., 3DMV) which was originally designed for segmenting RGB-D sensed indoor data. We improved its performance for outdoor photogrammetric data by introducing a depth pooling layer in the architecture to take into consideration the distance between the source images and the reconstructed terrain model. To test the performance of our improved 3DMV, a ground truth ground material database was created using data from the One World Terrain (OWT) data repository. Finally, a workflow for importing the segmented ground materials into a virtual simulation scene was introduced, and visual results are reported in this paper.

* Interservice/Industry Training, Simulation, and Education Conference (I/ITSEC) 2021

Utilizing Satellite Imagery Datasets and Machine Learning Data Models to Evaluate Infrastructure Change in Undeveloped Regions

Sep 01, 2020

Kyle McCullough, Andrew Feng, Meida Chen, Ryan McAlinden

Abstract:In the globalized economic world, it has become important to understand the purpose behind infrastructural and construction initiatives occurring within developing regions of the earth. This is critical when the financing for such projects must be coming from external sources, as is occurring throughout major portions of the African continent. When it comes to imagery analysis to research these regions, ground and aerial coverage is either non-existent or not commonly acquired. However, imagery from a large number of commercial, private, and government satellites has produced enormous datasets with global coverage, compiling geospatial resources that can be mined and processed using machine learning algorithms and neural networks. The downside is that a majority of these geospatial data resources are in a state of technical stasis, as it is difficult to quickly parse and determine a plan for request and processing when acquiring satellite image data. A goal of this research is to allow automated monitoring for large-scale infrastructure projects, such as railways, to determine reliable metrics that define and predict the direction construction initiatives could take, allowing for directed monitoring via narrowed and targeted satellite imagery requests. By utilizing photogrammetric techniques on available satellite data to create 3D Meshes and Digital Surface Models (DSM), we hope to effectively predict transport routes. In understanding the potential directions that large-scale transport infrastructure will take through predictive modeling, it becomes much easier to track, understand, and monitor progress, especially in areas with limited imagery coverage.

* Interservice/Industry Training, Simulation, and Education Conference (I/ITSEC) 2020

Semantic Segmentation and Data Fusion of Microsoft Bing 3D Cities and Small UAV-based Photogrammetric Data

Aug 21, 2020

Meida Chen, Andrew Feng, Kyle McCullough, Pratusha Bhuvana Prasad, Ryan McAlinden, Lucio Soibelman

Abstract:With state-of-the-art sensing and photogrammetric techniques, the Microsoft Bing Maps team has created over 125 highly detailed 3D cities from 11 different countries that cover hundreds of thousands of square kilometers. The 3D city models were created using the photogrammetric technique with high-resolution images that were captured from aircraft-mounted cameras. Such a large 3D city database has caught the attention of the US Army for creating virtual simulation environments to support military operations. However, the 3D city models do not have semantic information such as buildings, vegetation, and ground and cannot allow sophisticated user-level and system-level interaction. At I/ITSEC 2019, the authors presented a fully automated data segmentation and object information extraction framework for creating simulation terrain using UAV-based photogrammetric data. This paper discusses the next steps in extending our designed data segmentation framework for segmenting 3D city data. In this study, the authors first investigated the strengths and limitations of the existing framework when applied to the Bing data. The main differences between UAV-based and aircraft-based photogrammetric data are highlighted. The data quality issues in the aircraft-based photogrammetric data, which can negatively affect the segmentation performance, are identified. Based on the findings, a workflow was designed specifically for segmenting Bing data while considering its characteristics. In addition, since the ultimate goal is to combine the use of both small unmanned aerial vehicle (UAV) collected data and the Bing data in a virtual simulation environment, data from these two sources needed to be aligned and registered together. To this end, the authors also proposed a data registration workflow that utilized the traditional iterative closest point (ICP) with the extracted semantic information.
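
For readers unfamiliar with the iterative closest point (ICP) method named in the abstract above, the following is a minimal sketch of plain point-to-point ICP. It is not the authors' registration workflow: it ignores the semantic information they use, matches points by brute-force nearest neighbor, and the names `best_rigid_transform` and `icp` are hypothetical.

```python
import numpy as np


def best_rigid_transform(src, dst):
    """Least-squares rotation R and translation t mapping src onto dst.

    src and dst are (N, 3) arrays with row i of src corresponding to row i
    of dst. Uses the SVD-based (Kabsch) solution.
    """
    src_c = src.mean(axis=0)
    dst_c = dst.mean(axis=0)
    # Cross-covariance of the centered point sets.
    H = (src - src_c).T @ (dst - dst_c)
    U, _, Vt = np.linalg.svd(H)
    # Guard against a reflection so that det(R) = +1.
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t


def icp(src, dst, iterations=20):
    """Align src to dst by alternating matching and rigid fitting.

    Returns the accumulated rotation and translation. Brute-force matching
    costs O(N * M) memory, so this is only suitable for small point sets.
    """
    R_total = np.eye(3)
    t_total = np.zeros(3)
    current = np.asarray(src, dtype=np.float64).copy()
    dst = np.asarray(dst, dtype=np.float64)
    for _ in range(iterations):
        # Squared distances between every current point and every target point.
        d2 = ((current[:, None, :] - dst[None, :, :]) ** 2).sum(axis=2)
        matches = dst[d2.argmin(axis=1)]
        R, t = best_rigid_transform(current, matches)
        current = current @ R.T + t
        # Compose the new step with the transform accumulated so far.
        R_total = R @ R_total
        t_total = R @ t_total + t
    return R_total, t_total
```

Plain ICP only converges to a good alignment when the two point sets start reasonably close. One way semantic labels could be used, purely as an assumption here, is to restrict nearest-neighbor matches to points of the same class.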

* Interservice/Industry Training, Simulation, and Education Conference (I/ITSEC) 2020
