Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Ling Wang

LTCI, Telecom ParisTech, Paris

PDCNet: a benchmark and general deep learning framework for activity prediction of peptide-drug conjugates

Jun 15, 2025

Yun Liu, Jintu Huang, Yingying Zhu, Congrui Wen, Yu Pang, Ji-Quan Zhang, Ling Wang

Abstract:Peptide-drug conjugates (PDCs) represent a promising therapeutic avenue for human diseases, particularly in cancer treatment. Systematic elucidation of structure-activity relationships (SARs) and accurate prediction of the activity of PDCs are critical for the rational design and optimization of these conjugates. To this end, we carefully design and construct a benchmark PDCs dataset compiled from literature-derived collections and PDCdb database, and then develop PDCNet, the first unified deep learning framework for forecasting the activity of PDCs. The architecture systematically captures the complex factors underlying anticancer decisions of PDCs in real-word scenarios through a multi-level feature fusion framework that collaboratively characterizes and learns the features of peptides, linkers, and payloads. Leveraging a curated PDCs benchmark dataset, comprehensive evaluation results show that PDCNet demonstrates superior predictive capability, with the highest AUC, F1, MCC and BA scores of 0.9213, 0.7656, 0.7071 and 0.8388 for the test set, outperforming eight established traditional machine learning models. Multi-level validations, including 5-fold cross-validation, threshold testing, ablation studies, model interpretability analysis and external independent testing, further confirm the superiority, robustness, and usability of the PDCNet architecture. We anticipate that PDCNet represents a novel paradigm, incorporating both a benchmark dataset and advanced models, which can accelerate the design and discovery of new PDC-based therapeutic agents.

Via

Access Paper or Ask Questions

Towards provable probabilistic safety for scalable embodied AI systems

Jun 05, 2025

Linxuan He, Qing-Shan Jia, Ang Li, Hongyan Sang, Ling Wang, Jiwen Lu, Tao Zhang, Jie Zhou, Yi Zhang, Yisen Wang(+4 more)

Abstract:Embodied AI systems, comprising AI models and physical plants, are increasingly prevalent across various applications. Due to the rarity of system failures, ensuring their safety in complex operating environments remains a major challenge, which severely hinders their large-scale deployment in safety-critical domains, such as autonomous vehicles, medical devices, and robotics. While achieving provable deterministic safety--verifying system safety across all possible scenarios--remains theoretically ideal, the rarity and complexity of corner cases make this approach impractical for scalable embodied AI systems. To address this challenge, we introduce provable probabilistic safety, which aims to ensure that the residual risk of large-scale deployment remains below a predefined threshold. Instead of attempting exhaustive safety proof across all corner cases, this paradigm establishes a probabilistic safety boundary on overall system performance, leveraging statistical methods to enhance feasibility and scalability. A well-defined probabilistic safety boundary enables embodied AI systems to be deployed at scale while allowing for continuous refinement of safety guarantees. Our work focuses on three core questions: what is provable probabilistic safety, how to prove the probabilistic safety, and how to achieve the provable probabilistic safety. By bridging the gap between theoretical safety assurance and practical deployment, our work offers a pathway toward safer, large-scale adoption of embodied AI systems in safety-critical applications.

Via

Access Paper or Ask Questions

High-order Graph Neural Networks with Common Neighbor Awareness for Link Prediction

Apr 26, 2025

Ling Wang, Minglian Han

Abstract:Link prediction is a fundamental task in dynamic graph learning (DGL), inherently shaped by the topology of the DG. Recent advancements in dynamic graph neural networks (DGNN), primarily by modeling the relationships among nodes via a message passing scheme, have significantly improved link prediction performance. However, DGNNs heavily rely on the pairwise node interactions, which neglect the common neighbor interaction in DGL. To address this limitation, we propose a High-order Graph Neural Networks with Common Neighbor Awareness (HGNN-CNA) for link prediction with two-fold ideas: a) estimating correlation score by considering multi-hop common neighbors for capturing the complex interaction between nodes; b) fusing the correlation into the message-passing process to consider common neighbor interaction directly in DGL. Experimental results on three real DGs demonstrate that the proposed HGNN-CNA acquires a significant accuracy gain over several state-of-the-art models on the link prediction task.

* Accepted By ICAIS&ISAS 2025

Via

Access Paper or Ask Questions

Composite Indicator-Guided Infilling Sampling for Expensive Multi-Objective Optimization

Mar 28, 2025

Huixiang Zhen, Xiaotong Li, Wenyin Gong, Ling Wang, Xiangyun Hu

Abstract:In expensive multi-objective optimization, where the evaluation budget is strictly limited, selecting promising candidate solutions for expensive fitness evaluations is critical for accelerating convergence and improving algorithmic performance. However, designing an optimization strategy that effectively balances convergence, diversity, and distribution remains a challenge. To tackle this issue, we propose a composite indicator-based evolutionary algorithm (CI-EMO) for expensive multi-objective optimization. In each generation of the optimization process, CI-EMO first employs NSGA-III to explore the solution space based on fitness values predicted by surrogate models, generating a candidate population. Subsequently, we design a novel composite performance indicator to guide the selection of candidates for real fitness evaluation. This indicator simultaneously considers convergence, diversity, and distribution to improve the efficiency of identifying promising candidate solutions, which significantly improves algorithm performance. The composite indicator-based candidate selection strategy is easy to achieve and computes efficiency. Component analysis experiments confirm the effectiveness of each element in the composite performance indicator. Comparative experiments on benchmark problems demonstrate that the proposed algorithm outperforms five state-of-the-art expensive multi-objective optimization algorithms.

Via

Access Paper or Ask Questions

Medical Multimodal Foundation Models in Clinical Diagnosis and Treatment: Applications, Challenges, and Future Directions

Dec 03, 2024

Kai Sun, Siyan Xue, Fuchun Sun, Haoran Sun, Yu Luo, Ling Wang, Siyuan Wang, Na Guo, Lei Liu, Tian Zhao(+5 more)

Figure 1 for Medical Multimodal Foundation Models in Clinical Diagnosis and Treatment: Applications, Challenges, and Future Directions

Figure 2 for Medical Multimodal Foundation Models in Clinical Diagnosis and Treatment: Applications, Challenges, and Future Directions

Figure 3 for Medical Multimodal Foundation Models in Clinical Diagnosis and Treatment: Applications, Challenges, and Future Directions

Figure 4 for Medical Multimodal Foundation Models in Clinical Diagnosis and Treatment: Applications, Challenges, and Future Directions

Abstract:Recent advancements in deep learning have significantly revolutionized the field of clinical diagnosis and treatment, offering novel approaches to improve diagnostic precision and treatment efficacy across diverse clinical domains, thus driving the pursuit of precision medicine. The growing availability of multi-organ and multimodal datasets has accelerated the development of large-scale Medical Multimodal Foundation Models (MMFMs). These models, known for their strong generalization capabilities and rich representational power, are increasingly being adapted to address a wide range of clinical tasks, from early diagnosis to personalized treatment strategies. This review offers a comprehensive analysis of recent developments in MMFMs, focusing on three key aspects: datasets, model architectures, and clinical applications. We also explore the challenges and opportunities in optimizing multimodal representations and discuss how these advancements are shaping the future of healthcare by enabling improved patient outcomes and more efficient clinical workflows.

Via

Access Paper or Ask Questions

Evolutionary Algorithm with Detection Region Method for Constrained Multi-Objective Problems with Binary Constraints

Nov 13, 2024

Weixiong Huang, Rui Wang, Tao Zhang, Sheng Qi, Ling Wang

Figure 1 for Evolutionary Algorithm with Detection Region Method for Constrained Multi-Objective Problems with Binary Constraints

Figure 2 for Evolutionary Algorithm with Detection Region Method for Constrained Multi-Objective Problems with Binary Constraints

Figure 3 for Evolutionary Algorithm with Detection Region Method for Constrained Multi-Objective Problems with Binary Constraints

Figure 4 for Evolutionary Algorithm with Detection Region Method for Constrained Multi-Objective Problems with Binary Constraints

Abstract:Solving constrained multi-objective optimization problems (CMOPs) is a challenging task. While many practical algorithms have been developed to tackle CMOPs, real-world scenarios often present cases where the constraint functions are unknown or unquantifiable, resulting in only binary outcomes (feasible or infeasible). This limitation reduces the effectiveness of constraint violation guidance, which can negatively impact the performance of existing algorithms that rely on this approach. Such challenges are particularly detrimental for algorithms employing the epsilon-based method, as they hinder effective relaxation of the feasible region. To address these challenges, this paper proposes a novel algorithm called DRMCMO based on the detection region method. In DRMCMO, detection regions dynamic monitor feasible solutions to enhance convergence, helping the population escape local optima. Additionally, these regions collaborate with the neighbor pairing strategy to improve population diversity within narrow feasible areas. We have modified three existing test suites to serve as benchmark test problems for CMOPs with binary constraints(CMOP/BC) and conducted comprehensive comparative experiments with state-of-the-art algorithms on these test suites and real-world problems. The results demonstrate the strong competitiveness of DRMCMO against state-of-the-art algorithms. Given the limited research on CMOP/BC, our study offers a new perspective for advancing this field.

Via

Access Paper or Ask Questions

Dropout the High-rate Downsampling: A Novel Design Paradigm for UHD Image Restoration

Nov 10, 2024

Chen Wu, Ling Wang, Long Peng, Dianjie Lu, Zhuoran Zheng

Figure 1 for Dropout the High-rate Downsampling: A Novel Design Paradigm for UHD Image Restoration

Figure 2 for Dropout the High-rate Downsampling: A Novel Design Paradigm for UHD Image Restoration

Figure 3 for Dropout the High-rate Downsampling: A Novel Design Paradigm for UHD Image Restoration

Figure 4 for Dropout the High-rate Downsampling: A Novel Design Paradigm for UHD Image Restoration

Abstract:With the popularization of high-end mobile devices, Ultra-high-definition (UHD) images have become ubiquitous in our lives. The restoration of UHD images is a highly challenging problem due to the exaggerated pixel count, which often leads to memory overflow during processing. Existing methods either downsample UHD images at a high rate before processing or split them into multiple patches for separate processing. However, high-rate downsampling leads to significant information loss, while patch-based approaches inevitably introduce boundary artifacts. In this paper, we propose a novel design paradigm to solve the UHD image restoration problem, called D2Net. D2Net enables direct full-resolution inference on UHD images without the need for high-rate downsampling or dividing the images into several patches. Specifically, we ingeniously utilize the characteristics of the frequency domain to establish long-range dependencies of features. Taking into account the richer local patterns in UHD images, we also design a multi-scale convolutional group to capture local features. Additionally, during the decoding stage, we dynamically incorporate features from the encoding stage to reduce the flow of irrelevant information. Extensive experiments on three UHD image restoration tasks, including low-light image enhancement, image dehazing, and image deblurring, show that our model achieves better quantitative and qualitative results than state-of-the-art methods.

* WACV2025

Via

Access Paper or Ask Questions

BrainDreamer: Reasoning-Coherent and Controllable Image Generation from EEG Brain Signals via Language Guidance

Sep 21, 2024

Ling Wang, Chen Wu, Lin Wang

Figure 1 for BrainDreamer: Reasoning-Coherent and Controllable Image Generation from EEG Brain Signals via Language Guidance

Figure 2 for BrainDreamer: Reasoning-Coherent and Controllable Image Generation from EEG Brain Signals via Language Guidance

Figure 3 for BrainDreamer: Reasoning-Coherent and Controllable Image Generation from EEG Brain Signals via Language Guidance

Figure 4 for BrainDreamer: Reasoning-Coherent and Controllable Image Generation from EEG Brain Signals via Language Guidance

Abstract:Can we directly visualize what we imagine in our brain together with what we describe? The inherent nature of human perception reveals that, when we think, our body can combine language description and build a vivid picture in our brain. Intuitively, generative models should also hold such versatility. In this paper, we introduce BrainDreamer, a novel end-to-end language-guided generative framework that can mimic human reasoning and generate high-quality images from electroencephalogram (EEG) brain signals. Our method is superior in its capacity to eliminate the noise introduced by non-invasive EEG data acquisition and meanwhile achieve a more precise mapping between the EEG and image modality, thus leading to significantly better-generated images. Specifically, BrainDreamer consists of two key learning stages: 1) modality alignment and 2) image generation. In the alignment stage, we propose a novel mask-based triple contrastive learning strategy to effectively align EEG, text, and image embeddings to learn a unified representation. In the generation stage, we inject the EEG embeddings into the pre-trained Stable Diffusion model by designing a learnable EEG adapter to generate high-quality reasoning-coherent images. Moreover, BrainDreamer can accept textual descriptions (e.g., color, position, etc.) to achieve controllable image generation. Extensive experiments show that our method significantly outperforms prior arts in terms of generating quality and quantitative performance.

Via

Access Paper or Ask Questions

Subequivariant Reinforcement Learning in 3D Multi-Entity Physical Environments

Jul 17, 2024

Runfa Chen, Ling Wang, Yu Du, Tianrui Xue, Fuchun Sun, Jianwei Zhang, Wenbing Huang

Figure 1 for Subequivariant Reinforcement Learning in 3D Multi-Entity Physical Environments

Figure 2 for Subequivariant Reinforcement Learning in 3D Multi-Entity Physical Environments

Figure 3 for Subequivariant Reinforcement Learning in 3D Multi-Entity Physical Environments

Figure 4 for Subequivariant Reinforcement Learning in 3D Multi-Entity Physical Environments

Abstract:Learning policies for multi-entity systems in 3D environments is far more complicated against single-entity scenarios, due to the exponential expansion of the global state space as the number of entities increases. One potential solution of alleviating the exponential complexity is dividing the global space into independent local views that are invariant to transformations including translations and rotations. To this end, this paper proposes Subequivariant Hierarchical Neural Networks (SHNN) to facilitate multi-entity policy learning. In particular, SHNN first dynamically decouples the global space into local entity-level graphs via task assignment. Second, it leverages subequivariant message passing over the local entity-level graphs to devise local reference frames, remarkably compressing the representation redundancy, particularly in gravity-affected environments. Furthermore, to overcome the limitations of existing benchmarks in capturing the subtleties of multi-entity systems under the Euclidean symmetry, we propose the Multi-entity Benchmark (MEBEN), a new suite of environments tailored for exploring a wide range of multi-entity reinforcement learning. Extensive experiments demonstrate significant advancements of SHNN on the proposed benchmarks compared to existing methods. Comprehensive ablations are conducted to verify the indispensability of task assignment and subequivariance.

* ICML 2024

Via

Access Paper or Ask Questions

EIT-1M: One Million EEG-Image-Text Pairs for Human Visual-textual Recognition and More

Jul 02, 2024

Xu Zheng, Ling Wang, Kanghao Chen, Yuanhuiyi Lyu, Jiazhou Zhou, Lin Wang

Figure 1 for EIT-1M: One Million EEG-Image-Text Pairs for Human Visual-textual Recognition and More

Figure 2 for EIT-1M: One Million EEG-Image-Text Pairs for Human Visual-textual Recognition and More

Figure 3 for EIT-1M: One Million EEG-Image-Text Pairs for Human Visual-textual Recognition and More

Figure 4 for EIT-1M: One Million EEG-Image-Text Pairs for Human Visual-textual Recognition and More

Abstract:Recently, electroencephalography (EEG) signals have been actively incorporated to decode brain activity to visual or textual stimuli and achieve object recognition in multi-modal AI. Accordingly, endeavors have been focused on building EEG-based datasets from visual or textual single-modal stimuli. However, these datasets offer limited EEG epochs per category, and the complex semantics of stimuli presented to participants compromise their quality and fidelity in capturing precise brain activity. The study in neuroscience unveils that the relationship between visual and textual stimulus in EEG recordings provides valuable insights into the brain's ability to process and integrate multi-modal information simultaneously. Inspired by this, we propose a novel large-scale multi-modal dataset, named EIT-1M, with over 1 million EEG-image-text pairs. Our dataset is superior in its capacity of reflecting brain activities in simultaneously processing multi-modal information. To achieve this, we collected data pairs while participants viewed alternating sequences of visual-textual stimuli from 60K natural images and category-specific texts. Common semantic categories are also included to elicit better reactions from participants' brains. Meanwhile, response-based stimulus timing and repetition across blocks and sessions are included to ensure data diversity. To verify the effectiveness of EIT-1M, we provide an in-depth analysis of EEG data captured from multi-modal stimuli across different categories and participants, along with data quality scores for transparency. We demonstrate its validity on two tasks: 1) EEG recognition from visual or textual stimuli or both and 2) EEG-to-visual generation.

Via

Access Paper or Ask Questions