Zhuo He

A Convolutional Neural Deferred Shader for Physics Based Rendering

Dec 22, 2025

Zhuo He, Yingdong Ru, Qianying Liu, Paul Henderson, Nicolas Pugeault

Abstract: Recent advances in neural rendering have achieved impressive results on photorealistic shading and relighting by using a multilayer perceptron (MLP) as a regression model to learn the rendering equation from a real-world dataset. Such methods show promise for photorealistically relighting real-world objects, which is difficult for classical rendering because material ground truth is not easily obtained. However, significant challenges remain: the dense connections in MLPs result in a large number of parameters, which demands substantial computational resources, complicates training, and reduces rendering performance. Data-driven approaches also require large amounts of training data to generalize, and unbalanced data might bias the model to ignore unusual illumination conditions, e.g. dark scenes. This paper introduces pbnds+, a novel physics-based neural deferred shading pipeline that uses convolutional neural networks to reduce the parameter count and improve performance in shading and relighting tasks. Energy regularization is also proposed to restrict the model's reflection under dark illumination. Extensive experiments demonstrate that our approach outperforms classical baselines, a state-of-the-art neural shading model, and a diffusion-based method.
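For reference, the rendering equation that the abstract says the MLP regresses is usually written in the standard form below. The notation is the conventional textbook one and is an assumption here, not taken from the paper: L_o is outgoing radiance, L_e emitted radiance, f_r the BRDF, L_i incoming radiance, n the surface normal, and the integral runs over the hemisphere Ω around n.

```latex
L_o(\mathbf{x}, \omega_o) = L_e(\mathbf{x}, \omega_o)
  + \int_{\Omega} f_r(\mathbf{x}, \omega_i, \omega_o)\,
    L_i(\mathbf{x}, \omega_i)\,
    (\mathbf{n} \cdot \omega_i)\, \mathrm{d}\omega_i
```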

Generative Fields: Uncovering Hierarchical Feature Control for StyleGAN via Inverted Receptive Fields

Apr 24, 2025

Zhuo He, Paul Henderson, Nicolas Pugeault

Abstract: StyleGAN has demonstrated the ability of GANs to synthesize highly realistic faces of imaginary people from random noise. One limitation of GAN-based image generation is the difficulty of controlling the features of the generated image, due to the strong entanglement of the low-dimensional latent space. Previous work that aimed to control StyleGAN with image or text prompts modulated sampling in W latent space, which is more expressive than Z latent space. However, W space still has restricted expressivity since it does not control the feature synthesis directly; also, the feature embedding in W space requires a pre-training process to reconstruct the style signal, limiting its application. This paper introduces the concept of "generative fields" to explain the hierarchical feature synthesis in StyleGAN, inspired by the receptive fields of convolutional neural networks (CNNs). Additionally, we propose a new image editing pipeline for StyleGAN using generative field theory and the channel-wise style latent space S, utilizing the intrinsic structural feature of CNNs to achieve disentangled control of feature synthesis at synthesis time.

Beyond Reconstruction: A Physics Based Neural Deferred Shader for Photo-realistic Rendering

Apr 16, 2025

Zhuo He, Paul Henderson, Nicolas Pugeault

Abstract: Deep learning based rendering has demonstrated major improvements in photo-realistic image synthesis, with applications including visual effects in movies and photo-realistic scene building in video games. However, a significant limitation is the difficulty of decomposing the illumination and material parameters, which limits such methods to reconstructing an input scene without any possibility of controlling these parameters. This paper introduces a novel physics based neural deferred shading pipeline that decomposes the data-driven rendering process and learns a generalizable shading function to produce photo-realistic results for shading and relighting tasks. We also provide a shadow estimator to efficiently mimic shadowing effects. Our model achieves improved performance compared to classical models and a state-of-the-art neural shading model, and enables generalizable photo-realistic shading from arbitrary illumination input.

Can Real-to-Sim Approaches Capture Dynamic Fabric Behavior for Robotic Fabric Manipulation?

Mar 20, 2025

Yingdong Ru, Lipeng Zhuang, Zhuo He, Florent P. Audonnet, Gerardo Aragon-Camarasa

Abstract: This paper presents a rigorous evaluation of Real-to-Sim parameter estimation approaches for fabric manipulation in robotics. The study systematically assesses three state-of-the-art approaches, namely two differential pipelines and a data-driven approach. We also devise a novel physics-informed neural network (PINN) approach for physics parameter estimation. These approaches are interfaced with two simulations across multiple Real-to-Sim scenarios (lifting, wind blowing, and stretching) for five different fabric types and evaluated on three unseen scenarios (folding, fling, and shaking). We found that the simulation engines and the choice of Real-to-Sim approaches significantly impact fabric manipulation performance in our evaluation scenarios. Moreover, the PINN achieves superior performance in quasi-static tasks but shows limitations in dynamic scenarios.

Generative AI for RF Sensing in IoT systems

Jul 10, 2024

Li Wang, Chao Zhang, Qiyang Zhao, Hang Zou, Samson Lasaulce, Giuseppe Valenzise, Zhuo He, Merouane Debbah

Abstract: The development of wireless sensing technologies, using signals such as Wi-Fi, infrared, and RF to gather environmental data, has significantly advanced within Internet of Things (IoT) systems. Among these, Radio Frequency (RF) sensing stands out for its cost-effective and non-intrusive monitoring of human activities and environmental changes. However, traditional RF sensing methods face significant challenges, including noise, interference, incomplete data, and high deployment costs, which limit their effectiveness and scalability. This paper investigates the potential of Generative AI (GenAI) to overcome these limitations within the IoT ecosystem. We provide a comprehensive review of state-of-the-art GenAI techniques, focusing on their application to RF sensing problems. By generating high-quality synthetic data, enhancing signal quality, and integrating multi-modal data, GenAI offers robust solutions for RF environment reconstruction, localization, and imaging. Additionally, GenAI's ability to generalize enables IoT devices to adapt to new environments and unseen tasks, improving their efficiency and performance. The main contributions of this article include a detailed analysis of the challenges in RF sensing, the presentation of innovative GenAI-based solutions, and the proposal of a unified framework for diverse RF sensing tasks. Through case studies, we demonstrate the effectiveness of integrating GenAI models, leading to advanced, scalable, and intelligent IoT systems.

Few Clicks Suffice: Active Test-Time Adaptation for Semantic Segmentation

Dec 04, 2023

Longhui Yuan, Shuang Li, Zhuo He, Binhui Xie

Abstract: Test-time adaptation (TTA) adapts pre-trained models during inference using unlabeled test data and has received a lot of research attention due to its potential practical value. Unfortunately, without any label supervision, existing TTA methods rely heavily on heuristic or empirical studies. Deciding where to update the model often leads to suboptimal results or higher computational cost. Meanwhile, there is still a significant performance gap between TTA approaches and their supervised counterparts. Motivated by active learning, in this work we propose the active test-time adaptation setup for semantic segmentation. Specifically, we introduce a human-in-the-loop pattern during the testing phase, which queries very few labels to facilitate predictions and model updates in an online manner. To do so, we propose a simple but effective ATASeg framework, which consists of two parts, i.e., a model adapter and a label annotator. Extensive experiments demonstrate that ATASeg bridges the performance gap between TTA methods and their supervised counterparts with only extremely few annotations; even one click for labeling surpasses known SOTA TTA methods by 2.6% average mIoU on the ACDC benchmark. Empirical results imply that progress in either the model adapter or the label annotator will bring improvements to the ATASeg framework, giving it large research and practical potential.

* 15 pages, 10 figures

A new method using deep transfer learning on ECG to predict the response to cardiac resynchronization therapy

Jun 02, 2023

Zhuo He, Hongjin Si, Xinwei Zhang, Qing-Hui Chen, Jiangang Zou, Weihua Zhou

Abstract: Background: Cardiac resynchronization therapy (CRT) has emerged as an effective treatment for heart failure patients with electrical dyssynchrony. However, accurately predicting which patients will respond to CRT remains a challenge. This study explores the application of deep transfer learning techniques to train a predictive model for CRT response. Methods: In this study, the short-time Fourier transform (STFT) technique was employed to transform ECG signals into two-dimensional images. A transfer learning approach was then applied on the MIT-BIH ECG database to pre-train a convolutional neural network (CNN) model. The model was fine-tuned to extract relevant features from the ECG images, and then tested on our dataset of CRT patients to predict their response. Results: Seventy-one CRT patients were enrolled in this study. The transfer learning model achieved an accuracy of 72% in distinguishing responders from non-responders in the local dataset. Furthermore, the model showed good sensitivity (0.78) and specificity (0.79) in identifying CRT responders. Our model outperformed clinical guidelines and traditional machine learning approaches. Conclusion: Using ECG images as input and leveraging transfer learning allows for improved accuracy in identifying CRT responders. This approach offers potential for enhancing patient selection and improving outcomes of CRT.
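The STFT step described in the Methods can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `stft_magnitude`, the Hann window, the frame length, the hop size, the 360 Hz sampling rate, and the synthetic sine input are all assumptions chosen only to show how a 1-D signal becomes a 2-D time-frequency image.

```python
import numpy as np


def stft_magnitude(signal, frame_len=64, hop=16):
    """Return a (freq_bins, frames) magnitude spectrogram.

    Hypothetical helper: frame_len, hop and the Hann window are
    illustrative choices, not values reported in the abstract.
    """
    signal = np.asarray(signal, dtype=float)
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping frames and taper each one.
    frames = np.stack([
        signal[i * hop: i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # One real FFT per frame: shape (frames, frame_len // 2 + 1).
    spectrum = np.fft.rfft(frames, axis=1)
    # Transpose so rows are frequency bins and columns are time frames.
    return np.abs(spectrum).T


# Synthetic stand-in for an ECG trace (assumed 360 Hz, 2 seconds, 10 Hz tone).
fs = 360
t = np.arange(720) / fs
synthetic = np.sin(2 * np.pi * 10 * t)

image = stft_magnitude(synthetic)
print(image.shape)  # (33, 42)
```

The resulting array can be treated like a single-channel image, which is what makes an image-based CNN applicable to the signal.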

A new method using deep learning to predict the response to cardiac resynchronization therapy

May 04, 2023

Kristoffer Larsen, Zhuo He, Chen Zhao, Xinwei Zhang, Qiuying Sha, Claudio T Mesquita, Diana Paez, Ernest V. Garcia, Jiangang Zou, Amalia Peix (+1 more)

Abstract: Background. Clinical parameters measured from gated single-photon emission computed tomography myocardial perfusion imaging (SPECT MPI) have value in predicting cardiac resynchronization therapy (CRT) patient outcomes, but still show limitations. The purpose of this study is to combine clinical variables, features from electrocardiogram (ECG), and parameters from assessment of cardiac function with polarmaps from gated SPECT MPI through deep learning (DL) to predict CRT response. Methods. 218 patients who underwent rest gated SPECT MPI were enrolled in this study. CRT response was defined as an increase in left ventricular ejection fraction (LVEF) > 5% at a 6-month follow-up. A DL model was constructed by combining a pre-trained VGG16 module and a multilayer perceptron. Two modalities of data were input to the model: polarmap images from SPECT MPI and tabular data from clinical features and ECG parameters. Gradient-weighted Class Activation Mapping (Grad-CAM) was applied to the VGG16 module to provide explainability for the polarmaps. For comparison, four machine learning (ML) models were trained using only the tabular features. Results. Modeling was performed on 218 patients who underwent CRT implantation with a response rate of 55.5% (n = 121). The DL model demonstrated average AUC (0.83), accuracy (0.73), sensitivity (0.76), and specificity (0.69) surpassing the ML models and guideline criteria. Guideline recommendations presented accuracy (0.53), sensitivity (0.75), and specificity (0.26). Conclusions. The DL model outperformed the ML models, showcasing the additional predictive benefit of utilizing SPECT MPI polarmaps. Incorporating additional patient data directly in the form of medical imagery can improve CRT response prediction.
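The accuracy, sensitivity, and specificity figures quoted above follow the usual confusion-matrix definitions. The sketch below shows those definitions on made-up labels; the function name `binary_metrics` and the toy data are hypothetical and are not the study's data. The last line only re-derives the reported response rate from the reported counts (121 of 218).

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity and specificity for 0/1 labels.

    Assumes both classes occur in y_true, otherwise a division by
    zero would occur. Hypothetical helper for illustration only.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # responders found
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # responders missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-responders found
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Toy labels (1 = responder, 0 = non-responder), invented for the example.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)

# Response rate implied by the counts in the abstract.
response_rate = round(121 / 218, 3)
print(response_rate)  # 0.555
```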

A new method using machine learning to integrate ECG and gated SPECT MPI for Cardiac Resynchronization Therapy Decision Support on behalf of the VISION-CRT

Nov 06, 2022

Fernando de A. Fernandes, Kristoffer Larsen, Zhuo He, Erivelton Nascimento, Amalia Peix, Qiuying Sha, Diana Paez, Ernest V. Garcia, Weihua Zhou, Claudio T Mesquita

Abstract: Cardiac resynchronization therapy (CRT) has been established as an important therapy for heart failure. Mechanical dyssynchrony has the potential to predict responders to CRT. The aim of this study was to report the development and the validation of machine learning (ML) models that integrate ECG, gated SPECT MPI (GMPS) and clinical variables to predict patients' response to CRT. This analysis included 153 patients who met criteria for CRT from a prospective cohort study. The variables were used to build predictive models for CRT. Patients were classified as responders for an increase of LVEF>=5% at follow-up. In a second analysis, patients were classified as super-responders for an increase of LVEF>=15%. For ML, variable selection was applied, and the Prediction Analysis of Microarrays (PAM) approach was used for response modeling while Naive Bayes (NB) was used for super-response. They were compared to models obtained with guideline variables. PAM had an AUC of 0.80 against 0.71 for logistic regression with guideline variables (p = 0.47). The sensitivity (0.86) and specificity (0.75) were better than for guideline alone, sensitivity (0.72) and specificity (0.22). Neural network with guideline variables outperformed NB (AUC = 0.87 vs 0.86; p = 0.88). Its sensitivity and specificity (1.0 and 0.75, respectively) were better than guideline alone (0.40 and 0.06, respectively). Compared to guideline criteria, ML methods trended towards improved CRT response and super-response prediction. GMPS had a central role in the acquisition of most parameters. Further studies are needed to validate the models.
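The two labelling rules in this abstract (LVEF increase >= 5% for responders, >= 15% for super-responders) can be written as a small function. This is a sketch under one explicit assumption: the thresholds are read as absolute percentage points of LVEF, which the abstract does not state. The function name and the example values are hypothetical.

```python
def crt_response_label(lvef_baseline, lvef_followup):
    """Label a patient from baseline and follow-up LVEF (in percent).

    Assumption: thresholds are absolute percentage points. In the
    study the two analyses are separate, so a super-responder here
    would also count as a responder in the first analysis.
    """
    delta = lvef_followup - lvef_baseline
    if delta >= 15:
        return "super-responder"
    if delta >= 5:
        return "responder"
    return "non-responder"


# Invented example values, not patient data.
labels = [
    crt_response_label(30, 34),
    crt_response_label(30, 35),
    crt_response_label(30, 45),
]
print(labels)  # ['non-responder', 'responder', 'super-responder']
```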

Automatic reorientation by deep learning to generate short axis SPECT myocardial perfusion images

Aug 07, 2022

Fubao Zhu, Guojie Wang, Chen Zhao, Saurabh Malhotra, Min Zhao, Zhuo He, Jianzhou Shi, Zhixin Jiang, Weihua Zhou

Abstract: Single photon emission computed tomography (SPECT) myocardial perfusion images (MPI) can be displayed both in traditional short-axis (SA) cardiac planes and polar maps for interpretation and quantification. It is essential to reorient the reconstructed transaxial SPECT MPI into standard SA slices. This study aimed to develop a deep-learning-based approach for automatic reorientation of MPI. Methods: A total of 254 patients were enrolled, including 228 stress SPECT MPIs and 248 rest SPECT MPIs. Five-fold cross-validation with 180 stress and 201 rest MPIs was used for training and internal validation; the remaining images were used for testing. The rigid transformation parameters (translation and rotation) from manual reorientation were annotated by an experienced operator and used as the ground truth. A convolutional neural network (CNN) was designed to predict the transformation parameters. Then, the derived transform was applied to the grid generator and sampler in a spatial transformer network (STN) to generate the reoriented image. A loss function containing mean absolute errors for translation and mean square errors for rotation was employed. A three-stage optimization strategy was adopted for model optimization: 1) optimize the translation parameters while fixing the rotation parameters; 2) optimize rotation parameters while fixing the translation parameters; 3) optimize both translation and rotation parameters together.
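The loss described above (mean absolute error on translation plus mean squared error on rotation) can be sketched in plain Python. Everything beyond those two error terms is an assumption: the function name, the per-term weights, and in particular the idea of mimicking the three stages by switching a weight to zero, which is only one possible reading of "fixing" a parameter group. The numbers are invented.

```python
def reorientation_loss(pred_t, true_t, pred_r, true_r,
                       w_translation=1.0, w_rotation=1.0):
    """MAE on translation plus MSE on rotation (hypothetical helper).

    The weights are an illustrative assumption, not part of the
    abstract; they let one term be switched off per training stage.
    """
    mae = sum(abs(p - t) for p, t in zip(pred_t, true_t)) / len(true_t)
    mse = sum((p - t) ** 2 for p, t in zip(pred_r, true_r)) / len(true_r)
    return w_translation * mae + w_rotation * mse


# Assumed stage schedule: translation only, rotation only, then both.
STAGES = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]

# Invented predictions and ground truth for three translation and
# three rotation parameters.
pred_t, true_t = (1.0, 2.0, 3.0), (0.0, 2.0, 5.0)
pred_r, true_r = (3.0, 0.0, 0.0), (0.0, 0.0, 0.0)

stage_losses = [
    reorientation_loss(pred_t, true_t, pred_r, true_r, wt, wr)
    for wt, wr in STAGES
]
print(stage_losses)  # [1.0, 3.0, 4.0]
```

In a real training loop the same staging could instead be done by freezing the corresponding network outputs; the weighting shown here is simply the shortest way to express the idea.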

* 27 pages, 7 figures
