Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Jiaxin Xu

POINT$^{2}$: A Polymer Informatics Training and Testing Database

Mar 30, 2025

Jiaxin Xu, Gang Liu, Ruilan Guo, Meng Jiang, Tengfei Luo

Abstract:The advancement of polymer informatics has been significantly propelled by the integration of machine learning (ML) techniques, enabling the rapid prediction of polymer properties and expediting the discovery of high-performance polymeric materials. However, the field lacks a standardized workflow that encompasses prediction accuracy, uncertainty quantification, ML interpretability, and polymer synthesizability. In this study, we introduce POINT$^{2}$ (POlymer INformatics Training and Testing), a comprehensive benchmark database and protocol designed to address these critical challenges. Leveraging the existing labeled datasets and the unlabeled PI1M dataset, a collection of approximately one million virtual polymers generated via a recurrent neural network trained on the realistic polymers, we develop an ensemble of ML models, including Quantile Random Forests, Multilayer Perceptrons with dropout, Graph Neural Networks, and pretrained large language models. These models are coupled with diverse polymer representations such as Morgan, MACCS, RDKit, Topological, Atom Pair fingerprints, and graph-based descriptors to achieve property predictions, uncertainty estimations, model interpretability, and template-based polymerization synthesizability across a spectrum of properties, including gas permeability, thermal conductivity, glass transition temperature, melting temperature, fractional free volume, and density. The POINT$^{2}$ database can serve as a valuable resource for the polymer informatics community for polymer discovery and optimization.

Via

Access Paper or Ask Questions

A Criterion for Extending Continuous-Mixture Identifiability Results

Mar 05, 2025

Michael R. Powers, Jiaxin Xu

Abstract:For continuous mixtures of random variables, we provide a simple criterion -- generating-function accessibility -- to extend previously known kernel-based identifiability (or unidentifiability) results to new kernel distributions. This criterion, based on functional relationships between the relevant kernels' moment-generating functions or Laplace transforms, may be applied to continuous mixtures of both discrete and continuous random variables. To illustrate the proposed approach, we present results for several specific kernels.

Via

Access Paper or Ask Questions

From Abstract to Actionable: Pairwise Shapley Values for Explainable AI

Feb 18, 2025

Jiaxin Xu, Hung Chau, Angela Burden

Abstract:Explainable AI (XAI) is critical for ensuring transparency, accountability, and trust in machine learning systems as black-box models are increasingly deployed within high-stakes domains. Among XAI methods, Shapley values are widely used for their fairness and consistency axioms. However, prevalent Shapley value approximation methods commonly rely on abstract baselines or computationally intensive calculations, which can limit their interpretability and scalability. To address such challenges, we propose Pairwise Shapley Values, a novel framework that grounds feature attributions in explicit, human-relatable comparisons between pairs of data instances proximal in feature space. Our method introduces pairwise reference selection combined with single-value imputation to deliver intuitive, model-agnostic explanations while significantly reducing computational overhead. Here, we demonstrate that Pairwise Shapley Values enhance interpretability across diverse regression and classification scenarios--including real estate pricing, polymer property prediction, and drug discovery datasets. We conclude that the proposed methods enable more transparent AI systems and advance the real-world applicability of XAI.

Via

Access Paper or Ask Questions

Robot-Initiated Social Control of Sedentary Behavior: Comparing the Impact of Relationship- and Target-Focused Strategies

Feb 12, 2025

Jiaxin Xu, Sterre Anna Mariam van der Horst, Chao Zhang, Raymond H. Cuijpers, Wijnand A. IJsselsteijn

Abstract:To design social robots to effectively promote health behavior change, it is essential to understand how people respond to various health communication strategies employed by these robots. This study examines the effectiveness of two types of social control strategies from a social robot, relationship-focused strategies (emphasizing relational consequences) and target-focused strategies (emphasizing health consequences), in encouraging people to reduce sedentary behavior. A two-session lab experiment was conducted (n = 135), where participants first played a game with a robot, followed by the robot persuading them to stand up and move using one of the strategies. Half of the participants joined a second session to have a repeated interaction with the robot. Results showed that relationship-focused strategies motivated participants to stay active longer. Repeated sessions did not strengthen participants' relationship with the robot, but those who felt more attached to the robot responded more actively to the target-focused strategies. These findings offer valuable insights for designing persuasive strategies for social robots in health communication contexts.

* Accepted to HRI 2025

Via

Access Paper or Ask Questions

Benchmarking Complex Instruction-Following with Multiple Constraints Composition

Jul 04, 2024

Bosi Wen, Pei Ke, Xiaotao Gu, Lindong Wu, Hao Huang, Jinfeng Zhou, Wenchuang Li, Binxin Hu, Wendy Gao, Jiaxin Xu(+4 more)

Figure 1 for Benchmarking Complex Instruction-Following with Multiple Constraints Composition

Figure 2 for Benchmarking Complex Instruction-Following with Multiple Constraints Composition

Figure 3 for Benchmarking Complex Instruction-Following with Multiple Constraints Composition

Figure 4 for Benchmarking Complex Instruction-Following with Multiple Constraints Composition

Abstract:Instruction following is one of the fundamental capabilities of large language models (LLMs). As the ability of LLMs is constantly improving, they have been increasingly applied to deal with complex human instructions in real-world scenarios. Therefore, how to evaluate the ability of complex instruction-following of LLMs has become a critical research problem. Existing benchmarks mainly focus on modeling different types of constraints in human instructions while neglecting the composition of different constraints, which is an indispensable constituent in complex instructions. To this end, we propose ComplexBench, a benchmark for comprehensively evaluating the ability of LLMs to follow complex instructions composed of multiple constraints. We propose a hierarchical taxonomy for complex instructions, including 4 constraint types, 19 constraint dimensions, and 4 composition types, and manually collect a high-quality dataset accordingly. To make the evaluation reliable, we augment LLM-based evaluators with rules to effectively verify whether generated texts can satisfy each constraint and composition. Furthermore, we obtain the final evaluation score based on the dependency structure determined by different composition types. ComplexBench identifies significant deficiencies in existing LLMs when dealing with complex instructions with multiple constraints composition.

* 20 pages, 7 figures

Via

Access Paper or Ask Questions

Demonstrating HumanTHOR: A Simulation Platform and Benchmark for Human-Robot Collaboration in a Shared Workspace

Jun 10, 2024

Chenxu Wang, Boyuan Du, Jiaxin Xu, Peiyan Li, Di Guo, Huaping Liu

Abstract:Human-robot collaboration (HRC) in a shared workspace has become a common pattern in real-world robot applications and has garnered significant research interest. However, most existing studies for human-in-the-loop (HITL) collaboration with robots in a shared workspace evaluate in either simplified game environments or physical platforms, falling short in limited realistic significance or limited scalability. To support future studies, we build an embodied framework named HumanTHOR, which enables humans to act in the simulation environment through VR devices to support HITL collaborations in a shared workspace. To validate our system, we build a benchmark of everyday tasks and conduct a preliminary user study with two baseline algorithms. The results show that the robot can effectively assist humans in collaboration, demonstrating the significance of HRC. The comparison among different levels of baselines affirms that our system can adequately evaluate robot capabilities and serve as a benchmark for different robot algorithms. The experimental results also indicate that there is still much room in the area and our system can provide a preliminary foundation for future HRC research in a shared workspace. More information about the simulation environment, experiment videos, benchmark descriptions, and additional supplementary materials can be found on the website: https://sites.google.com/view/humanthor/.

* In RSS 2024

Via

Access Paper or Ask Questions

Inverse Molecular Design with Multi-Conditional Diffusion Guidance

Jan 24, 2024

Gang Liu, Jiaxin Xu, Tengfei Luo, Meng Jiang

Abstract:Inverse molecular design with diffusion models holds great potential for advancements in material and drug discovery. Despite success in unconditional molecule generation, integrating multiple properties such as synthetic score and gas permeability as condition constraints into diffusion models remains unexplored. We introduce multi-conditional diffusion guidance. The proposed Transformer-based denoising model has a condition encoder that learns the representations of numerical and categorical conditions. The denoising model, consisting of a structure encoder-decoder, is trained for denoising under the representation of conditions. The diffusion process becomes graph-dependent to accurately estimate graph-related noise in molecules, unlike the previous models that focus solely on the marginal distributions of atoms or bonds. We extensively validate our model for multi-conditional polymer and small molecule generation. Results demonstrate our superiority across metrics from distribution learning to condition control for molecular properties. An inverse polymer design task for gas separation with feedback from domain experts further demonstrates its practical utility.

* 20 pages, 8 figures, 7 tables

Via

Access Paper or Ask Questions

Data-Centric Learning from Unlabeled Graphs with Diffusion Model

Mar 17, 2023

Gang Liu, Eric Inae, Tong Zhao, Jiaxin Xu, Tengfei Luo, Meng Jiang

Abstract:Graph property prediction tasks are important and numerous. While each task offers a small size of labeled examples, unlabeled graphs have been collected from various sources and at a large scale. A conventional approach is training a model with the unlabeled graphs on self-supervised tasks and then fine-tuning the model on the prediction tasks. However, the self-supervised task knowledge could not be aligned or sometimes conflicted with what the predictions needed. In this paper, we propose to extract the knowledge underlying the large set of unlabeled graphs as a specific set of useful data points to augment each property prediction model. We use a diffusion model to fully utilize the unlabeled graphs and design two new objectives to guide the model's denoising process with each task's labeled data to generate task-specific graph examples and their labels. Experiments demonstrate that our data-centric approach performs significantly better than fourteen existing various methods on fifteen tasks. The performance improvement brought by unlabeled data is visible as the generated labeled examples unlike self-supervised learning.

* Preprint. 18 pages, 6 figures

Via

Access Paper or Ask Questions

Graph Rationalization with Environment-based Augmentations

Jun 06, 2022

Gang Liu, Tong Zhao, Jiaxin Xu, Tengfei Luo, Meng Jiang

Figure 1 for Graph Rationalization with Environment-based Augmentations

Figure 2 for Graph Rationalization with Environment-based Augmentations

Figure 3 for Graph Rationalization with Environment-based Augmentations

Figure 4 for Graph Rationalization with Environment-based Augmentations

Abstract:Rationale is defined as a subset of input features that best explains or supports the prediction by machine learning models. Rationale identification has improved the generalizability and interpretability of neural networks on vision and language data. In graph applications such as molecule and polymer property prediction, identifying representative subgraph structures named as graph rationales plays an essential role in the performance of graph neural networks. Existing graph pooling and/or distribution intervention methods suffer from lack of examples to learn to identify optimal graph rationales. In this work, we introduce a new augmentation operation called environment replacement that automatically creates virtual data examples to improve rationale identification. We propose an efficient framework that performs rationale-environment separation and representation learning on the real and augmented examples in latent spaces to avoid the high complexity of explicit graph decoding and encoding. Comparing against recent techniques, experiments on seven molecular and four polymer real datasets demonstrate the effectiveness and efficiency of the proposed augmentation-based graph rationalization framework.

* Accepted by KDD 2022

Via

Access Paper or Ask Questions

Literature Review: Human Segmentation with Static Camera

Oct 28, 2019

Jiaxin Xu, Rui Wang, Vaibhav Rakheja

Figure 1 for Literature Review: Human Segmentation with Static Camera

Figure 2 for Literature Review: Human Segmentation with Static Camera

Figure 3 for Literature Review: Human Segmentation with Static Camera

Figure 4 for Literature Review: Human Segmentation with Static Camera

Abstract:Our research topic is Human segmentation with static camera. This topic can be divided into three sub-tasks, which are object detection, instance identification and segmentation. These sub-tasks are three closely related subjects. The development of each subject has great impact on the other two fields. In this literature review, we will first introduce the background of human segmentation and then talk about issues related to the above three fields as well as how they interact with each other.

Via

Access Paper or Ask Questions