Hongtao Zhang

The University of Western Australia

GarmentGS: Point-Cloud Guided Gaussian Splatting for High-Fidelity Non-Watertight 3D Garment Reconstruction

May 04, 2025

Zhihao Tang, Shenghao Yang, Hongtao Zhang, Mingbo Zhao

Abstract: Traditional 3D garment creation requires extensive manual operations, resulting in significant time and labor costs. Recently, 3D Gaussian Splatting has achieved breakthrough progress in 3D scene reconstruction and rendering, attracting widespread attention and opening new pathways for 3D garment reconstruction. However, due to the unstructured and irregular nature of Gaussian primitives, it is difficult to reconstruct high-fidelity, non-watertight 3D garments. In this paper, we present GarmentGS, a dense point cloud-guided method that can reconstruct high-fidelity garment surfaces with high geometric accuracy and generate non-watertight, single-layer meshes. Our method introduces a fast dense point cloud reconstruction module that can complete garment point cloud reconstruction in 10 minutes, compared to traditional methods that require several hours. Furthermore, we use dense point clouds to guide the movement, flattening, and rotation of Gaussian primitives, enabling better distribution on the garment surface to achieve superior rendering effects and geometric accuracy. Numerical and visual comparisons show that our method achieves fast training and real-time rendering while maintaining competitive quality.

Cooperative Hybrid Multi-Agent Pathfinding Based on Shared Exploration Maps

Mar 28, 2025

Ning Liu, Sen Shen, Xiangrui Kong, Hongtao Zhang, Thomas Bräunl

Abstract: Multi-Agent Pathfinding is used in areas including multi-robot formations, warehouse logistics, and intelligent vehicles. However, many environments are incomplete or frequently change, making it difficult for standard centralized planning or pure reinforcement learning to maintain both global solution quality and local flexibility. This paper introduces a hybrid framework that integrates D* Lite global search with multi-agent reinforcement learning, using a switching mechanism and a freeze-prevention strategy to handle dynamic conditions and crowded settings. We evaluate the framework in the discrete POGEMA environment and compare it with baseline methods. Experimental outcomes indicate that the proposed framework substantially improves success rate, collision rate, and path efficiency. The model is further tested on the EyeSim platform, where it maintains feasible pathfinding under frequent changes and large-scale robot deployments.

* 22 pages, 7 figures

Dynamic Decoupling of Placid Terminal Attractor-based Gradient Descent Algorithm

Sep 10, 2024

Jinwei Zhao, Marco Gori, Alessandro Betti, Stefano Melacci, Hongtao Zhang, Jiedong Liu, Xinhong Hei

Abstract: Gradient descent (GD) and stochastic gradient descent (SGD) have been widely used in a large number of application domains. Therefore, understanding the dynamics of GD and improving its convergence speed are still of great importance. This paper carefully analyzes the dynamics of GD based on the terminal attractor at different stages of its gradient flow. On the basis of the terminal sliding mode theory and the terminal attractor theory, four adaptive learning rates are designed. Their performance is examined through a detailed theoretical analysis, and the running times of the learning procedures are evaluated and compared. The total times of their learning processes are also studied in detail. To evaluate their effectiveness, simulations are conducted on a function approximation problem and an image classification problem.

* 8 pages, 4 figures

IVLMap: Instance-Aware Visual Language Grounding for Consumer Robot Navigation

Mar 28, 2024

Jiacui Huang, Hongtao Zhang, Mingbo Zhao, Zhou Wu

Abstract: Vision-and-Language Navigation (VLN) is a challenging task that requires a robot to navigate in photo-realistic environments following natural language prompts from humans. Recent studies aim to handle this task by constructing a semantic spatial map representation of the environment, and then leveraging the strong reasoning ability of large language models to generate code that guides the robot's navigation. However, these methods face limitations in instance-level and attribute-level navigation tasks, as they cannot distinguish different instances of the same object. To address this challenge, we propose a new method, namely, Instance-aware Visual Language Map (IVLMap), to empower the robot with instance-level and attribute-level semantic mapping. The map is autonomously constructed by fusing the RGBD video data collected from the robot agent with specially designed natural language map indexing in the bird's-eye view. Such indexing is instance-level and attribute-level. In particular, when integrated with a large language model, IVLMap demonstrates the capability to i) transform natural language into navigation targets with instance and attribute information, enabling precise localization, and ii) accomplish zero-shot end-to-end navigation tasks based on natural language commands. Extensive navigation experiments are conducted. Simulation results illustrate that our method can achieve an average improvement of 14.4% in navigation accuracy. Code and demo are released at https://ivlmap.github.io/.

Analysis of Intelligent Reflecting Surface-Enhanced Mobility Through a Line-of-Sight State Transition Model

Mar 12, 2024

Hongtao Zhang, Haoyan Wei

Abstract: Rapid signal fluctuations due to blockage effects cause excessive handovers (HOs) and degrade mobility performance. By reconfiguring line-of-sight (LoS) links through passive reflections, the intelligent reflecting surface (IRS) has the potential to address this issue. Because they do not model blocking effects, existing HO analyses cannot capture excessive HOs or exploit enhancements via IRSs. This paper proposes an LoS state transition model enabling analysis of mobility enhancement achieved by IRS-reconfigured LoS links, where LoS link blocking and reconfiguration utilizing IRS during user movement are explicitly modeled as stochastic processes. Specifically, the condition for blocking LoS links is characterized as a set of possible blockage locations, and the distribution of available IRSs is thinned by the criteria for reconfiguring LoS links. In addition, BSs potentially handed over are categorized by probabilities of LoS states to enable HO decision analysis. By projecting distinct gains of LoS states onto a uniform equivalent distance criterion, mobility enhanced by IRS is quantified through the compact expression of HO probability. Results show that the probability of falling into a non-LoS state decreases by 70% when deploying IRSs with a density of 93/km$^2$, and HOs decrease by 67% under the optimal IRS distributed deployment parameter.

* 13 pages, 11 figures, submitted to IEEE

Discrete-Time Modeling and Handover Analysis of Intelligent Reflecting Surface-Assisted Networks

Mar 12, 2024

Hongtao Zhang, Haoyan Wei

Abstract: Owing to the reflection gain and double path loss featured by intelligent reflecting surface (IRS) channels, handover (HO) locations become irregular and the signal strength fluctuates sharply with variations in IRS connections during HO. The risk of HO failures (HOFs) is therefore exacerbated, and HO parameters require reconfiguration. However, existing HO models only assume monotonic negative exponential path loss and cannot obtain sound HO parameters. This paper proposes a discrete-time model to explicitly track the HO process with variations in IRS connections, where IRS connections and HO process are discretized as finite states by measurement intervals, and transitions between states are modeled as stochastic processes. Specifically, to capture signal fluctuations during HO, IRS connection state-dependent distributions of the user-IRS distance are modified by the correlation between measurement intervals. In addition, states of the HO process are formed with Time-to-Trigger and HO margin, whose transition probabilities are integrated over all IRS connection states. Trigger location distributions and probabilities of HO, HOF, and ping-pong (PP) are obtained by tracing user HO states. Results show that IRSs mitigate PPs by 48% but exacerbate HOFs by 90% under regular parameters. Optimal parameters are found that keep the probabilities of HOF and PP both below 0.1%.

* 13 pages, 12 figures, submitted to IEEE
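
The Time-to-Trigger and HO margin rule named in the abstract above can be illustrated with a minimal sketch. It assumes the common convention that a handover is triggered once the target cell's measured signal exceeds the serving cell's by at least the margin for a run of consecutive measurement intervals. The function name, the dB values, and the sample traces are hypothetical and not taken from the paper, whose stochastic model is not reproduced here.

```python
from typing import Optional, Sequence


def first_handover_index(
    serving_dbm: Sequence[float],
    target_dbm: Sequence[float],
    margin_db: float,
    ttt_intervals: int,
) -> Optional[int]:
    """Return the measurement index at which a handover is triggered, or None.

    Assumed rule (illustrative, not the paper's model): the condition
    target >= serving + margin must hold for `ttt_intervals` consecutive
    measurement intervals.
    """
    if ttt_intervals < 1:
        raise ValueError("ttt_intervals must be at least 1")
    consecutive = 0
    for i, (s, t) in enumerate(zip(serving_dbm, target_dbm)):
        if t >= s + margin_db:
            consecutive += 1
            if consecutive == ttt_intervals:
                return i
        else:
            consecutive = 0  # condition broken: the TTT count restarts
    return None


# Hypothetical traces: the target spikes briefly at index 1, then stays stronger.
serving = [-80.0, -82.0, -81.0, -85.0, -86.0, -87.0]
target = [-90.0, -78.0, -84.0, -81.0, -80.0, -79.0]
print(first_handover_index(serving, target, margin_db=3.0, ttt_intervals=3))
```

With `ttt_intervals=1`, the brief spike at index 1 would trigger a handover immediately, which is the kind of premature trigger a longer Time-to-Trigger is meant to suppress.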

PLCNet: Patch-wise Lane Correction Network for Automatic Lane Correction in High-definition Maps

Jan 25, 2024

Haiyang Peng, Yi Zhan, Benkang Wang, Hongtao Zhang

Abstract: In high-definition (HD) maps, lane elements constitute the majority of components and are subject to stringent localization requirements to ensure safe vehicle navigation. Vision lane detection with LiDAR position assignment is a prevalent method to acquire initial lanes for HD maps. However, due to incorrect vision detection and coarse camera-LiDAR calibration, initial lanes may deviate from their true positions within an uncertain range. To mitigate the need for manual lane correction, we propose a patch-wise lane correction network (PLCNet) to automatically correct the positions of initial lane points in local LiDAR images that are transformed from point clouds. PLCNet first extracts multi-scale image features and crops patch (ROI) features centered at each initial lane point. By applying ROIAlign, the fixed-size ROI features are flattened into 1D features. Then, a 1D lane attention module is devised to compute instance-level lane features with adaptive weights. Finally, lane correction offsets are inferred by a multi-layer perceptron and used to correct the initial lane positions. Considering practical applications, our automatic method supports merging local corrected lanes into global corrected lanes. Through extensive experiments on a self-built dataset, we demonstrate that PLCNet achieves fast and effective initial lane correction.

AI in Pharma for Personalized Sequential Decision-Making: Methods, Applications and Opportunities

Nov 30, 2023

Yuhan Li, Hongtao Zhang, Keaven Anderson, Songzi Li, Ruoqing Zhu

Abstract: In the pharmaceutical industry, the use of artificial intelligence (AI) has seen consistent growth over the past decade. This rise is attributed to major advancements in statistical machine learning methodologies, computational capabilities and the increased availability of large datasets. AI techniques are applied throughout different stages of drug development, ranging from drug discovery to post-marketing benefit-risk assessment. Kolluri et al. provided a review of several case studies that span these stages, featuring key applications such as protein structure prediction, success probability estimation, subgroup identification, and AI-assisted clinical trial monitoring. From a regulatory standpoint, there was a notable uptick in submissions incorporating AI components in 2021. The most prevalent therapeutic areas leveraging AI were oncology (27%), psychiatry (15%), gastroenterology (12%), and neurology (11%). The paradigm of personalized or precision medicine has gained significant traction in recent research, partly due to advancements in AI techniques \cite{hamburg2010path}. This shift has had a transformative impact on the pharmaceutical industry. Departing from the traditional "one-size-fits-all" model, personalized medicine incorporates various individual factors, such as environmental conditions, lifestyle choices, and health histories, to formulate customized treatment plans. By utilizing sophisticated machine learning algorithms, clinicians and researchers are better equipped to make informed decisions in areas such as disease prevention, diagnosis, and treatment selection, thereby optimizing health outcomes for each individual.

A Scalable Arrangement Method for Aperiodic Array Antennas to Reduce Peak Sidelobe Level

Jul 04, 2023

Jiao Zhang, Hongtao Zhang, Xuelei Chen, Fengquan Wu, Yufeng Liu, Wenmei Zhang

Abstract: Peak sidelobe level reduction (PSLR) is crucial in the application of large-scale array antennas, as it directly determines the radiation performance of the array. We study the PSLR of subarray-level aperiodic arrays and propose three array structures: dislocated subarrays with uniform elements (DSUE), uniform subarrays with random elements (USRE), and dislocated subarrays with random elements (DSRE). To optimize the dislocation position of subarrays and the random position of elements, the improved Bat algorithm (IBA) is applied. To compare the PSLR effect of these three array structures, we take three sizes of array antennas, from small to large, as examples and simulate and calculate their redundancy and peak sidelobe level (PSLL). By analyzing the dislocation distance of subarrays, the scanning angle, and the applicable frequency, the results show that DSRE is the optimal array structure. The proposed design method is universal and scalable, and is of great application value to the design of large-scale aperiodic array antennas.

Machine Learning Guided 3D Image Recognition for Carbonate Pore and Mineral Volumes Determination

Nov 08, 2021

Omar Alfarisi, Aikifa Raza, Hongtao Zhang, Djamel Ozzane, Mohamed Sassi, Tiejun Zhang

Abstract: Automated image processing algorithms can improve the quality, efficiency, and consistency of classifying the morphology of heterogeneous carbonate rock and can deal with a massive amount of data and images seamlessly. Geoscientists face difficulties in setting the direction of the optimum method for determining petrophysical properties from rock images, Micro-Computed Tomography (µCT), or Magnetic Resonance Imaging (MRI). Most successful work has addressed homogeneous rocks, focusing on 2D images with less attention to 3D, and has required numerical simulation. Currently, image analysis methods converge to three approaches: image processing, artificial intelligence, and combined image processing with artificial intelligence. In this work, we propose two methods to determine the porosity from 3D µCT and MRI images: an image processing method with Image Resolution Optimized Gaussian Algorithm (IROGA), and an advanced image recognition method enabled by Machine Learning Difference of Gaussian Random Forest (MLDGRF). We have built reference 3D micro models and collected images for calibration of the IROGA and MLDGRF methods. To evaluate the predictive capability of these calibrated approaches, we ran them on 3D µCT and MRI images of natural heterogeneous carbonate rock. We measured the porosity and lithology of the carbonate rock using three and two industry-standard ways, respectively, as reference values. Notably, IROGA and MLDGRF have produced porosity results with an accuracy of 96.2% and 97.1% on the training set and 91.7% and 94.4% on blind test validation, respectively, in comparison with the three experimental measurements. We measured limestone and pyrite reference values using two methods: X-ray powder diffraction and grain density measurements. MLDGRF has produced lithology (limestone and pyrite) volumes with 97.7% accuracy.
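
For orientation, porosity in a segmented 3D image is conventionally the fraction of voxels labelled as pore. The sketch below computes that fraction with a plain intensity threshold. It is a stand-in for illustration only and is neither IROGA nor MLDGRF; the function name, the threshold, the assumption that pores are darker than mineral, and the toy volume are all hypothetical.

```python
from typing import Sequence

# A volume is indexed as [slice][row][column]; values are grey levels.
Volume = Sequence[Sequence[Sequence[float]]]


def porosity_by_threshold(volume: Volume, pore_threshold: float) -> float:
    """Fraction of voxels whose intensity is below `pore_threshold`.

    Assumes (for this sketch only) that pore voxels are darker than
    mineral voxels, so a single global threshold separates them.
    """
    total = 0
    pores = 0
    for image_slice in volume:
        for row in image_slice:
            for value in row:
                total += 1
                if value < pore_threshold:
                    pores += 1
    if total == 0:
        raise ValueError("volume contains no voxels")
    return pores / total


# Toy 2x2x2 volume: three dark (pore) voxels out of eight.
toy = [
    [[10.0, 200.0], [220.0, 15.0]],
    [[210.0, 205.0], [12.0, 230.0]],
]
print(porosity_by_threshold(toy, pore_threshold=50.0))  # prints 0.375
```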
