Yisong Chen

HUG: Hierarchical Urban Gaussian Splatting with Block-Based Reconstruction

Apr 23, 2025

Zhongtao Wang, Mai Su, Huishan Au, Yilong Li, Xizhe Cao, Chengwei Pan, Yisong Chen, Guoping Wang

Abstract: As urban 3D scenes become increasingly complex and the demand for high-quality rendering grows, efficient scene reconstruction and rendering techniques become crucial. We present HUG, a novel approach based on 3D Gaussian splatting that addresses inefficiencies in handling large-scale urban environments and intricate details. Our method optimizes data partitioning and the reconstruction pipeline by incorporating a hierarchical neural Gaussian representation. We employ an enhanced block-based reconstruction pipeline that focuses on improving reconstruction quality within each block and reducing the need for redundant training regions around block boundaries. By integrating the neural Gaussian representation with a hierarchical architecture, we achieve high-quality scene rendering at a low computational cost. This is demonstrated by our state-of-the-art results on public benchmarks, which show the method's effectiveness and advantages in large-scale urban scene representation.

SAIP-Net: Enhancing Remote Sensing Image Segmentation via Spectral Adaptive Information Propagation

Apr 23, 2025

Zhongtao Wang, Xizhe Cao, Yisong Chen, Guoping Wang

Abstract: Semantic segmentation of remote sensing imagery demands precise spatial boundaries and robust intra-class consistency, challenging conventional hierarchical models. To address limitations arising from spatial domain feature fusion and insufficient receptive fields, this paper introduces SAIP-Net, a novel frequency-aware segmentation framework that leverages Spectral Adaptive Information Propagation. SAIP-Net employs adaptive frequency filtering and multi-scale receptive field enhancement to effectively suppress intra-class feature inconsistencies and sharpen boundary lines. Comprehensive experiments demonstrate significant performance improvements over state-of-the-art methods, highlighting the effectiveness of spectral-adaptive strategies combined with expanded receptive fields for remote sensing image segmentation.

The Role of Machine Learning in Reducing Healthcare Costs: The Impact of Medication Adherence and Preventive Care on Hospitalization Expenses

Apr 10, 2025

Yixin Zhang, Yisong Chen

Abstract: This study reveals the important role of preventive care and medication adherence in reducing hospitalizations. Using a structured dataset of 1,171 patients, four machine learning models (Logistic Regression, Gradient Boosting, Random Forest, and Artificial Neural Networks) are applied to predict five-year hospitalization risk, with the Gradient Boosting model achieving the highest accuracy of 81.2%. The results demonstrate that patients with high medication adherence and consistent preventive care show reductions of 38.3% and 37.7% in hospitalization risk, respectively. The findings also suggest that targeted preventive care can have a positive Return on Investment (ROI), and therefore ML models can effectively direct personalized interventions and contribute to long-term medical savings.

AA-RMVSNet: Adaptive Aggregation Recurrent Multi-view Stereo Network

Aug 09, 2021

Zizhuang Wei, Qingtian Zhu, Chen Min, Yisong Chen, Guoping Wang

Abstract: In this paper, we present a novel recurrent multi-view stereo network based on long short-term memory (LSTM) with adaptive aggregation, namely AA-RMVSNet. We first introduce an intra-view aggregation module to adaptively extract image features by using context-aware convolution and multi-scale aggregation, which efficiently improves the performance on challenging regions, such as thin objects and large low-textured surfaces. To overcome the difficulty of varying occlusion in complex scenes, we propose an inter-view cost volume aggregation module for adaptive pixel-wise view aggregation, which is able to preserve better-matched pairs among all views. The two proposed adaptive aggregation modules are lightweight, effective, and complementary in improving the accuracy and completeness of 3D reconstruction. Instead of conventional 3D CNNs, we utilize a hybrid network with recurrent structure for cost volume regularization, which allows high-resolution reconstruction and a finer hypothetical plane sweep. The proposed network is trained end-to-end and achieves excellent performance on various datasets. It ranks 1st among all submissions on the Tanks and Temples benchmark and achieves competitive results on the DTU dataset, exhibiting strong generalizability and robustness. An implementation of our method is available at https://github.com/QT-Zhu/AA-RMVSNet.

Dense Hybrid Recurrent Multi-view Stereo Net with Dynamic Consistency Checking

Jul 21, 2020

Jianfeng Yan, Zizhuang Wei, Hongwei Yi, Mingyu Ding, Runze Zhang, Yisong Chen, Guoping Wang, Yu-Wing Tai

Abstract: In this paper, we propose an efficient and effective dense hybrid recurrent multi-view stereo net with dynamic consistency checking, namely $D^{2}$HC-RMVSNet, for accurate dense point cloud reconstruction. Our novel hybrid recurrent multi-view stereo net consists of two core modules: 1) a light DRENet (Dense Reception Expanded) module to extract dense feature maps of original size with multi-scale context information, and 2) a HU-LSTM (Hybrid U-LSTM) to regularize the 3D matching volume into a predicted depth map, which efficiently aggregates information at different scales by coupling LSTM and U-Net architectures. To further improve the accuracy and completeness of reconstructed point clouds, we leverage a dynamic consistency checking strategy instead of the prefixed parameters and strategies widely adopted in existing methods for dense point cloud reconstruction. In doing so, we dynamically aggregate the geometric consistency matching error among all the views. Our method ranks 1st on the complex outdoor Tanks and Temples benchmark over all the methods. Extensive experiments on the indoor DTU dataset show that our method exhibits performance competitive with the state-of-the-art method while dramatically reducing memory consumption, using only 19.4% of the memory consumed by R-MVSNet. The codebase is available at https://github.com/yhw-yhw/D2HC-RMVSNet.

* ECCV2020
* Accepted by ECCV2020 as Spotlight

Graph-Based Parallel Large Scale Structure from Motion

Dec 23, 2019

Yu Chen, Shuhan Shen, Yisong Chen, Guoping Wang

Abstract: While Structure from Motion (SfM) achieves great success in 3D reconstruction, it still faces challenges on large-scale scenes. In this work, large-scale SfM is treated as a graph problem, and we tackle it in a divide-and-conquer manner. First, an image clustering algorithm divides images into clusters with strong connectivity, leading to robust local reconstructions. Next, an image expansion step enhances the connectivity and completeness of scenes by expanding along a maximum spanning tree. After local reconstructions, we construct a minimum spanning tree (MinST) to find accurate similarity transformations. The MinST is then transformed into a Minimum Height Tree (MHT) to find a proper anchor node, which is further used to prevent error accumulation. When evaluated on different kinds of datasets, our approach shows superiority over the state-of-the-art in accuracy and efficiency. Our algorithm is open-sourced at https://github.com/AIBluefisher/GraphSfM.
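
The MinST-to-MHT step can be illustrated with a minimal sketch. It is not taken from the GraphSfM codebase: the function names, the integer cluster ids, and the edge weights are hypothetical, and the graph is assumed to be connected. It builds a minimum spanning tree with Kruskal's algorithm, then finds the root that minimizes tree height by repeatedly peeling leaves, which yields a candidate anchor node.

```python
from collections import defaultdict


def minimum_spanning_tree(num_nodes, edges):
    """Kruskal's algorithm. `edges` is a list of (weight, u, v) tuples.

    Returns the tree as a list of (u, v) pairs. Assumes a connected graph.
    """
    parent = list(range(num_nodes))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for _, u, v in sorted(edges):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:  # the edge joins two components, so keep it
            parent[root_u] = root_v
            tree.append((u, v))
    return tree


def minimum_height_roots(num_nodes, tree_edges):
    """Return the one or two nodes that minimize tree height when used as root."""
    if num_nodes <= 2:
        return list(range(num_nodes))
    adjacency = defaultdict(set)
    for u, v in tree_edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    leaves = [n for n in range(num_nodes) if len(adjacency[n]) == 1]
    remaining = num_nodes
    while remaining > 2:
        # Strip the current layer of leaves; the tree center survives last.
        remaining -= len(leaves)
        next_leaves = []
        for leaf in leaves:
            neighbor = adjacency[leaf].pop()
            adjacency[neighbor].remove(leaf)
            if len(adjacency[neighbor]) == 1:
                next_leaves.append(neighbor)
        leaves = next_leaves
    return sorted(leaves)


# Hypothetical cluster graph: five local reconstructions, weighted edges.
cluster_edges = [(1, 0, 1), (1, 1, 2), (1, 2, 3), (1, 3, 4), (5, 0, 2), (10, 0, 4)]
tree = minimum_spanning_tree(5, cluster_edges)
anchors = minimum_height_roots(5, tree)
```

Rooting the tree at its center keeps the longest chain of similarity transformations as short as possible, which is one plausible reading of why an anchor chosen this way limits error accumulation.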

Bundle Adjustment Revisited

Dec 09, 2019

Yu Chen, Yisong Chen, Guoping Wang

Abstract: 3D reconstruction has developed over the past two decades, from moderate to medium size and on to large scale. It is well known that bundle adjustment plays an important role in 3D reconstruction, mainly in Structure from Motion (SfM) and Simultaneous Localization and Mapping (SLAM). While bundle adjustment optimizes camera parameters and 3D points as a non-negligible final step, it suffers from memory and efficiency limitations in very large-scale reconstruction. In this paper, we study the development of bundle adjustment in detail, covering both conventional and distributed approaches. The detailed derivation and pseudocode are also given in this paper.
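
For orientation, the standard textbook form of the bundle adjustment objective is sketched below. The notation is an assumption and is not taken from the paper: $C_i$ are the parameters of camera $i$, $X_j$ is 3D point $j$, $x_{ij}$ is the observed image location of point $j$ in camera $i$, $\pi$ is the projection function, and $v_{ij} \in \{0, 1\}$ marks whether point $j$ is visible in camera $i$.

```latex
\min_{\{C_i\},\,\{X_j\}} \;
  \sum_{i=1}^{m} \sum_{j=1}^{n}
  v_{ij} \, \bigl\| x_{ij} - \pi(C_i, X_j) \bigr\|^2

% One commonly used solver is Levenberg-Marquardt. With r the stacked
% residuals, J the Jacobian of r with respect to all parameters, and
% \lambda a damping factor, each iteration solves for the update \delta:
(J^\top J + \lambda I)\,\delta = -J^\top r
```

The linear system grows with the number of cameras and points, which is consistent with the memory and efficiency concerns the abstract raises for very large reconstructions.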

* 9 pages, 9 figures

Pyramid Multi-view Stereo Net with Self-adaptive View Aggregation

Dec 06, 2019

Hongwei Yi, Zizhuang Wei, Mingyu Ding, Runze Zhang, Yisong Chen, Guoping Wang, Yu-Wing Tai

Abstract: In this paper, we propose an effective and efficient pyramid multi-view stereo (MVS) net for accurate and complete dense point cloud reconstruction. Unlike existing deep-learning based MVS methods, our VA-MVSNet incorporates the cost variance between different views by introducing two novel self-adaptive view aggregation schemes: pixel-wise view aggregation and voxel-wise view aggregation. Moreover, to enhance point cloud reconstruction in texture-less regions, we extend VA-MVSNet with pyramid multi-scale image input as PVA-MVSNet, where multi-metric constraints are leveraged to aggregate the reliable depth estimation at the coarser scale to fill in the mismatched regions at the finer scale. Experimental results show that our approach establishes a new state-of-the-art on the DTU dataset with significant improvements in the completeness and overall quality of 3D reconstruction, and ranks 1st on the Tanks and Temples benchmark among all published deep-learning based methods. Our codebase is available at https://github.com/yhw-yhw/PVAMVSNet.

X-GANs: Image Reconstruction Made Easy for Extreme Cases

Aug 06, 2018

Longfei Liu, Sheng Li, Yisong Chen, Guoping Wang

Abstract: Image reconstruction, including image restoration and denoising, is a challenging problem in the field of image computing. We present a new method, called X-GANs, for reconstruction of arbitrarily corrupted resources based on a variant of conditional generative adversarial networks (conditional GANs). In our method, a novel generator and multi-scale discriminators are proposed, along with combined adversarial losses that integrate a VGG perceptual loss, an adversarial perceptual loss, and an elaborate corresponding point loss based on the analysis of image features. Our conditional GANs enable a variety of applications in image reconstruction, including image denoising, image restoration from very sparse sampling, image inpainting, and image recovery from severely polluted blocks or even color-noise dominated images, which are extreme cases that have not been addressed by existing work. We significantly improve the accuracy and quality of image reconstruction. Extensive perceptual experiments on datasets ranging from human faces to natural scenes demonstrate that images reconstructed by the presented approach are considerably more realistic than those produced by alternative work. Our method can also be extended to handle high-ratio image compression.

* 9 pages, 12 figures

Foreground segmentation based on multi-resolution and matting

Feb 10, 2014

Xintong Yu, Xiaohan Liu, Yisong Chen

Abstract: We propose a foreground segmentation algorithm that performs foreground extraction at different scales and refines the result by matting. First, the input image is filtered and resampled to 5 different resolutions. Then each of them is segmented by adaptive figure-ground classification, and the best segmentation is automatically selected by an evaluation score that maximizes the difference between foreground and background. This segmentation is upsampled to the original size, and a corresponding trimap is built. Closed-form matting is employed to label the boundary region, and the result is refined by a final figure-ground classification. Experiments show the success of our method in treating challenging images with cluttered backgrounds and adapting to loose initial bounding boxes.
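
The resample, segment, and select steps can be sketched roughly as follows. This is an illustration under strong assumptions and not the paper's method: a mean-intensity threshold stands in for adaptive figure-ground classification, the score is simply the absolute difference between mean foreground and mean background intensity, and all function names are hypothetical. Upsampling, trimap construction, and matting are omitted.

```python
def downsample(image):
    """Halve the resolution by averaging 2x2 blocks (filter and resample in one step)."""
    height, width = len(image) // 2, len(image[0]) // 2
    return [
        [
            (image[2 * r][2 * c] + image[2 * r][2 * c + 1]
             + image[2 * r + 1][2 * c] + image[2 * r + 1][2 * c + 1]) / 4.0
            for c in range(width)
        ]
        for r in range(height)
    ]


def build_pyramid(image, levels=5):
    """Return up to `levels` images, from full resolution down to the coarsest."""
    pyramid = [image]
    while len(pyramid) < levels and len(pyramid[-1]) >= 2 and len(pyramid[-1][0]) >= 2:
        pyramid.append(downsample(pyramid[-1]))
    return pyramid


def segment(image):
    """Stand-in segmenter (assumption): foreground is brighter than the mean."""
    pixels = [p for row in image for p in row]
    threshold = sum(pixels) / len(pixels)
    return [[p > threshold for p in row] for row in image]


def separation_score(image, mask):
    """Absolute difference between mean foreground and mean background intensity."""
    fg = [p for row, mrow in zip(image, mask) for p, m in zip(row, mrow) if m]
    bg = [p for row, mrow in zip(image, mask) for p, m in zip(row, mrow) if not m]
    if not fg or not bg:
        return 0.0  # degenerate segmentation: everything in one class
    return abs(sum(fg) / len(fg) - sum(bg) / len(bg))


def best_segmentation(image, levels=5):
    """Segment every pyramid level and return (score, level, mask) of the best one."""
    best = None
    for level, scaled in enumerate(build_pyramid(image, levels)):
        mask = segment(scaled)
        score = separation_score(scaled, mask)
        if best is None or score > best[0]:  # ties keep the finer level
            best = (score, level, mask)
    return best


# Synthetic 16x16 image: a bright 8x8 square on a dark background.
image = [[1.0 if 4 <= r < 12 and 4 <= c < 12 else 0.0 for c in range(16)]
         for r in range(16)]
score, level, mask = best_segmentation(image)
```

In the pipeline the abstract describes, the selected mask would then be upsampled to the original size and used to build the trimap for closed-form matting.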

* 5 pages, 7 figures
