Chengxi Zang

Federated Causal Inference in Healthcare: Methods, Challenges, and Applications

May 04, 2025

Haoyang Li, Jie Xu, Kyra Gan, Fei Wang, Chengxi Zang

Abstract: Federated causal inference enables multi-site treatment effect estimation without sharing individual-level data, offering a privacy-preserving solution for real-world evidence generation. However, data heterogeneity across sites, manifested in differences in covariates, treatments, and outcomes, poses significant challenges for unbiased and efficient estimation. In this paper, we present a comprehensive review and theoretical analysis of federated causal effect estimation across both binary/continuous and time-to-event outcomes. We classify existing methods into weight-based strategies and optimization-based frameworks and further discuss extensions including personalized models, peer-to-peer communication, and model decomposition. For time-to-event outcomes, we examine federated Cox and Aalen-Johansen models, deriving asymptotic bias and variance under heterogeneity. Our analysis reveals that FedProx-style regularization achieves near-optimal bias-variance trade-offs compared to naive averaging and meta-analysis. We review related software tools and conclude by outlining opportunities, challenges, and future directions for scalable, fair, and trustworthy federated causal inference in distributed healthcare systems.
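As a rough illustration of the FedProx-style regularization mentioned in the abstract, the sketch below fits a linear outcome model across three simulated sites. Each site minimizes its local squared-error loss plus a proximal penalty (mu/2)*||w - w_global||^2 and sends back only its parameter vector; the server then takes a sample-size-weighted average. The simulated data, the linear model, the function names (`local_fedprox_step`, `fedprox_round`), and all hyperparameters are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def local_fedprox_step(w_global, X, y, mu=0.1, lr=0.05, n_steps=200):
    """Gradient descent on the local least-squares loss plus the
    proximal term (mu/2) * ||w - w_global||^2 (hypothetical helper)."""
    w = w_global.copy()
    n = len(y)
    for _ in range(n_steps):
        grad_loss = X.T @ (X @ w - y) / n   # gradient of the local squared-error loss
        grad_prox = mu * (w - w_global)     # pulls the local fit toward the global model
        w -= lr * (grad_loss + grad_prox)
    return w

def fedprox_round(w_global, sites, mu=0.1):
    """One communication round: sites share parameters only, never rows of data."""
    updates = [local_fedprox_step(w_global, X, y, mu=mu) for X, y in sites]
    sizes = np.array([len(y) for _, y in sites], dtype=float)
    return np.average(updates, axis=0, weights=sizes)

# Simulated heterogeneous sites (assumed setup): each site has its own
# treatment propensity and covariate shift, but treatment is randomized.
rng = np.random.default_rng(0)
true_w = np.array([1.0, -2.0])  # [treatment effect, covariate coefficient]
sites = []
for propensity, shift in ((0.3, 0.0), (0.5, 0.5), (0.7, -0.5)):
    t = rng.binomial(1, propensity, size=200).astype(float)
    x = rng.normal(loc=shift, size=200)
    X = np.column_stack([t, x])
    y = X @ true_w + rng.normal(scale=0.1, size=200)
    sites.append((X, y))

w = np.zeros(2)
for _ in range(20):
    w = fedprox_round(w, sites, mu=0.1)
print("estimated [treatment effect, covariate coefficient]:", w)
```

Setting `mu=0` in this sketch reduces each round to plain local fitting followed by averaging, which is one way to compare the proximal variant against naive averaging on the same simulated data.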

Emerging Opportunities of Using Large Language Models for Translation Between Drug Molecules and Indications

Feb 16, 2024

David Oniani, Jordan Hilsman, Chengxi Zang, Junmei Wang, Lianjin Cai, Jan Zawala, Yanshan Wang

Abstract: A drug molecule is a substance that changes the organism's mental or physical state. Every approved drug has an indication, which refers to the therapeutic use of that drug for treating a particular medical condition. While Large Language Models (LLMs), a generative Artificial Intelligence (AI) technique, have recently demonstrated effectiveness in translating between molecules and their textual descriptions, there remains a gap in research regarding their application to translation between drug molecules and indications, or vice versa, which could greatly benefit the drug discovery process. The capability of generating a drug from a given indication would allow for the discovery of drugs targeting specific diseases or targets and ultimately provide patients with better treatments. In this paper, we first propose a new task, which is the translation between drug molecules and corresponding indications, and then test existing LLMs on this new task. Specifically, we consider nine variations of the T5 LLM and evaluate them on two public datasets obtained from ChEMBL and DrugBank. Our experiments show the early results of using LLMs for this task and provide a perspective on the state-of-the-art. We also emphasize the current limitations and discuss future work that has the potential to improve the performance on this task. The creation of molecules from indications, or vice versa, will allow for more efficient targeting of diseases and significantly reduce the cost of drug discovery, with the potential to revolutionize the field of drug discovery in the era of generative AI.

SCEHR: Supervised Contrastive Learning for Clinical Risk Prediction using Electronic Health Records

Oct 11, 2021

Chengxi Zang, Fei Wang

Abstract: Contrastive learning has demonstrated promising performance in image and text domains in both self-supervised and supervised settings. In this work, we extend the supervised contrastive learning framework to clinical risk prediction problems based on longitudinal electronic health records (EHR). We propose a general supervised contrastive loss $\mathcal{L}_{\text{Contrastive Cross Entropy}} + \lambda \mathcal{L}_{\text{Supervised Contrastive Regularizer}}$ for learning both binary classification (e.g., in-hospital mortality prediction) and multi-label classification (e.g., phenotyping) in a unified framework. Our supervised contrastive loss implements the key idea of contrastive learning, namely, pulling similar samples closer and pushing dissimilar ones apart from each other, simultaneously through its two components: $\mathcal{L}_{\text{Contrastive Cross Entropy}}$ tries to contrast samples with learned anchors which represent positive and negative clusters, and $\mathcal{L}_{\text{Supervised Contrastive Regularizer}}$ tries to contrast samples with each other according to their supervised labels. We propose two versions of the above supervised contrastive loss, and our experiments on real-world EHR data demonstrate that our proposed loss functions improve the performance of strong baselines and even state-of-the-art models on benchmarking tasks for clinical risk predictions. Our loss functions work well with extremely imbalanced data, which are common in clinical risk prediction problems. Our loss functions can be easily used to replace the (binary or multi-label) cross-entropy loss adopted in existing clinical predictive models. The PyTorch code is released at https://github.com/calvin-zcx/SCEHR.
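To make the idea of contrasting samples with each other according to their labels more concrete, here is a minimal NumPy sketch of a generic supervised contrastive regularizer in the commonly used SupCon form. It is not the paper's exact formulation and does not reproduce the contrastive cross-entropy term or the released PyTorch code; the function name, the temperature value, the toy embeddings, and the single-label assumption are all illustrative choices.

```python
import numpy as np

def supervised_contrastive_regularizer(z, labels, temperature=0.1):
    """Generic SupCon-style loss (assumed form): for each anchor, raise the
    softmax probability of same-label samples among all other samples."""
    z = z / np.linalg.norm(z, axis=1, keepdims=True)   # unit-normalize embeddings
    sim = z @ z.T / temperature                        # scaled cosine similarities
    n = len(labels)
    not_self = ~np.eye(n, dtype=bool)

    # Numerically stable log-sum-exp over all samples except the anchor itself.
    sim_masked = np.where(not_self, sim, -np.inf)
    row_max = sim_masked.max(axis=1, keepdims=True)
    log_denom = row_max + np.log(np.exp(sim_masked - row_max).sum(axis=1, keepdims=True))
    log_prob = sim - log_denom

    # Positives are other samples that share the anchor's label.
    positives = (labels[:, None] == labels[None, :]) & not_self
    n_pos = positives.sum(axis=1)
    has_pos = n_pos > 0
    if not has_pos.any():
        return 0.0  # no anchor has a positive partner in this batch
    mean_log_prob_pos = (log_prob * positives).sum(axis=1)[has_pos] / n_pos[has_pos]
    return float(-mean_log_prob_pos.mean())

# Toy embeddings forming two tight clusters (illustrative values).
z = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
loss_matched = supervised_contrastive_regularizer(z, np.array([0, 0, 1, 1]))
loss_mismatched = supervised_contrastive_regularizer(z, np.array([0, 1, 0, 1]))
print(loss_matched, loss_mismatched)
```

When the labels agree with the cluster structure the loss is small, and it grows when same-label samples sit far apart, which is the pull-together, push-apart behavior the abstract describes.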

Contrastive Learning Improves Critical Event Prediction in COVID-19 Patients

Jan 11, 2021

Tingyi Wanyan, Hossein Honarvar, Suraj K. Jaladanki, Chengxi Zang, Nidhi Naik, Sulaiman Somani, Jessica K. De Freitas, Ishan Paranjpe, Akhil Vaid, Riccardo Miotto(+6 more)

Abstract: Machine Learning (ML) models typically require large-scale, balanced training data to be robust, generalizable, and effective in the context of healthcare. This has been a major issue for developing ML models for the coronavirus disease 2019 (COVID-19) pandemic, where data is highly imbalanced, particularly within electronic health records (EHR) research. Conventional approaches in ML use cross-entropy loss (CEL), which often suffers from poor margin classification. For the first time, we show that contrastive loss (CL) improves the performance of CEL, especially for imbalanced EHR data and the related COVID-19 analyses. This study has been approved by the Institutional Review Board at the Icahn School of Medicine at Mount Sinai. We use EHR data from five hospitals within the Mount Sinai Health System (MSHS) to predict mortality, intubation, and intensive care unit (ICU) transfer in hospitalized COVID-19 patients over 24 and 48 hour time windows. We train two sequential architectures (RNN and RETAIN) using two loss functions (CEL and CL). Models are tested on a full-sample data set, which contains all available data, and on a restricted data set that emulates higher class imbalance. CL models consistently outperform CEL models with the restricted data set on these tasks, with differences ranging from 0.04 to 0.15 for AUPRC and 0.05 to 0.1 for AUROC. For the restricted sample, only the CL model maintains proper clustering and is able to identify important features, such as pulse oximetry. CL outperforms CEL in instances of severe class imbalance, on three EHR outcomes with respect to three performance metrics: predictive power, clustering, and feature importance. We believe that the developed CL framework can be expanded and used for EHR ML work in general.

Visualizing Deep Graph Generative Models for Drug Discovery

Jul 20, 2020

Karan Yang, Chengxi Zang, Fei Wang

Abstract: Drug discovery aims at designing novel molecules with specific desired properties for clinical trials. Over the past decades, drug discovery and development have been a costly and time-consuming process. Driven by big chemical data and AI, deep generative models show great potential to accelerate the drug discovery process. Existing works investigate different deep generative frameworks for molecular generation; however, less attention has been paid to visualization tools for quickly demonstrating and evaluating a model's results. Here, we propose a visualization framework which provides interactive tools to visualize molecules generated during the encoding and decoding process of deep graph generative models, and provides real-time molecular optimization functionalities. Our work tries to equip black-box, AI-driven drug discovery models with some visual interpretability.

* 4 pages, 2020 KDD Workshop on Applied Data Science for Healthcare

MoFlow: An Invertible Flow Model for Generating Molecular Graphs

Jun 17, 2020

Chengxi Zang, Fei Wang

Abstract: Generating molecular graphs with desired chemical properties driven by deep graph generative models provides a very promising way to accelerate the drug discovery process. Such graph generative models usually consist of two steps: learning latent representations and generation of molecular graphs. However, generating novel and chemically valid molecular graphs from latent representations is very challenging because of the chemical constraints and combinatorial complexity of molecular graphs. In this paper, we propose MoFlow, a flow-based graph generative model to learn invertible mappings between molecular graphs and their latent representations. To generate molecular graphs, our MoFlow first generates bonds (edges) through a Glow-based model, then generates atoms (nodes) given bonds by a novel graph conditional flow, and finally assembles them into a chemically valid molecular graph with a post hoc validity correction. Our MoFlow has merits including exact and tractable likelihood training, efficient one-pass embedding and generation, chemical validity guarantees, 100% reconstruction of training data, and good generalization ability. We validate our model by four tasks: molecular graph generation and reconstruction, visualization of the continuous latent space, property optimization, and constrained property optimization. Our MoFlow achieves state-of-the-art performance, which implies its potential efficiency and effectiveness to explore large chemical space for drug discovery.
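The invertibility and tractable likelihood that the abstract attributes to flow-based models can be illustrated with a single affine coupling layer, the basic building block of Glow-style flows. The sketch below is a generic NumPy toy on plain vectors, not MoFlow's bond or atom flows; the linear scale and shift "networks", the `tanh` bound on the log-scale, and the names `coupling_forward` and `coupling_inverse` are assumptions for illustration.

```python
import numpy as np

def coupling_forward(x, w_s, w_t):
    """Affine coupling: the first half passes through unchanged and
    parameterizes a scale and shift applied to the second half."""
    d = x.shape[1] // 2
    x1, x2 = x[:, :d], x[:, d:]
    log_s = np.tanh(x1 @ w_s)          # bounded log-scale (toy linear "network")
    t = x1 @ w_t                       # shift
    z = np.concatenate([x1, x2 * np.exp(log_s) + t], axis=1)
    log_det = log_s.sum(axis=1)        # exact log|det Jacobian| per sample
    return z, log_det

def coupling_inverse(z, w_s, w_t):
    """Exact inverse: recompute scale and shift from the untouched half."""
    d = z.shape[1] // 2
    z1, z2 = z[:, :d], z[:, d:]
    log_s = np.tanh(z1 @ w_s)
    t = z1 @ w_t
    return np.concatenate([z1, (z2 - t) * np.exp(-log_s)], axis=1)

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 4))            # five toy 4-dimensional inputs
w_s = rng.normal(size=(2, 2))
w_t = rng.normal(size=(2, 2))

z, log_det = coupling_forward(x, w_s, w_t)
x_rec = coupling_inverse(z, w_s, w_t)
print("max reconstruction error:", np.abs(x - x_rec).max())
```

Because the Jacobian of this layer is triangular, its log-determinant is just the sum of the log-scales, which is what makes exact likelihood training tractable, and the closed-form inverse is what allows one-pass generation from latent codes.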

Neural Dynamics on Complex Networks

Aug 18, 2019

Chengxi Zang, Fei Wang

Abstract: We introduce a deep learning model to learn continuous-time dynamics on complex networks and infer the semantic labels of nodes in the network at terminal time. We formulate the problem as an optimal control problem by minimizing a loss function consisting of a running loss of network dynamics, a terminal loss of nodes' labels, and a neural-differential-equation-system constraint. We solve the problem with a differential deep learning framework: in the forward process, rather than forwarding through a discrete number of hidden layers, we integrate the ordinary differential equation systems on graphs over continuous time; in the backward learning process, we learn the optimal control parameters by back-propagation while solving the initial value problem. We validate our model by learning complex dynamics on various real-world complex networks, and then apply our model to graph semi-supervised classification tasks. The promising experimental results demonstrate our model's capability of jointly capturing the structure, dynamics and semantics of complex systems.

* Department of Healthcare Policy and Research, Weill Cornell Medicine; chz4001@med.cornell.edu, few2001@med.cornell.edu
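The forward process described in the abstract above, integrating an ordinary differential equation system on a graph over continuous time, can be sketched with a fixed, hand-written dynamics function in place of a learned one. The example below uses heat diffusion dx/dt = -k L x on a four-node path graph with a classical Runge-Kutta (RK4) integrator. The choice of dynamics, the graph, the step count, and the function names are assumptions for illustration; the paper's model learns the dynamics with a neural network, which this sketch does not do.

```python
import numpy as np

def graph_laplacian(adj):
    """Combinatorial Laplacian L = D - A of an undirected graph."""
    return np.diag(adj.sum(axis=1)) - adj

def heat_dynamics(x, laplacian, k=1.0):
    """Right-hand side dx/dt = -k L x (stand-in for a learned dynamics function)."""
    return -k * laplacian @ x

def integrate_rk4(f, x0, t_end, n_steps):
    """Fixed-step fourth-order Runge-Kutta solver for dx/dt = f(x)."""
    x = np.asarray(x0, dtype=float).copy()
    dt = t_end / n_steps
    for _ in range(n_steps):
        k1 = f(x)
        k2 = f(x + 0.5 * dt * k1)
        k3 = f(x + 0.5 * dt * k2)
        k4 = f(x + dt * k3)
        x = x + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)
    return x

# Path graph 0 - 1 - 2 - 3 with all initial "heat" on node 0.
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
lap = graph_laplacian(adj)
x0 = np.array([1.0, 0.0, 0.0, 0.0])

x_T = integrate_rk4(lambda x: heat_dynamics(x, lap), x0, t_end=20.0, n_steps=2000)
print("node states at terminal time:", x_T)
```

On a connected graph, diffusion conserves the total node state and drives every node toward the average, so the terminal states here approach 0.25 each. In a learned setting, `heat_dynamics` would be replaced by a parameterized function and the terminal states fed to a label loss.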
