Yiping Song

MetaMolGen: A Neural Graph Motif Generation Model for De Novo Molecular Design

Apr 22, 2025

Zimo Yan, Jie Zhang, Zheng Xie, Chang Liu, Yizhen Liu, Yiping Song

Abstract: Molecular generation plays an important role in drug discovery and materials science, especially in data-scarce scenarios where traditional generative models often struggle to achieve satisfactory conditional generalization. To address this challenge, we propose MetaMolGen, a first-order meta-learning-based molecular generator designed for few-shot and property-conditioned molecular generation. MetaMolGen standardizes the distribution of graph motifs by mapping them to a normalized latent space, and employs a lightweight autoregressive sequence model to generate SMILES sequences that faithfully reflect the underlying molecular structure. In addition, it supports conditional generation of molecules with target properties through a learnable property projector integrated into the generative process. Experimental results demonstrate that MetaMolGen consistently generates valid and diverse SMILES sequences under low-data regimes, outperforming conventional baselines. This highlights its advantage in fast adaptation and efficient conditional generation for practical molecular design.

Learn to Disguise: Avoid Refusal Responses in LLM's Defense via a Multi-agent Attacker-Disguiser Game

Apr 03, 2024

Qianqiao Xu, Zhiliang Tian, Hongyan Wu, Zhen Huang, Yiping Song, Feng Liu, Dongsheng Li

Abstract: With the enhanced performance of large models on natural language processing tasks, potential moral and ethical issues of large models arise. Malicious attackers induce large models to jailbreak and generate illegal or privacy-invasive information through techniques such as prompt engineering. In response, large models counter these attacks using techniques such as safety alignment. However, the strong defense mechanism of the large model through rejection replies is easily identified by attackers and used to strengthen their capabilities. In this paper, we propose a multi-agent attacker-disguiser game approach to achieve a weak defense mechanism that allows the large model to both reply safely to the attacker and hide the defense intent. First, we construct a multi-agent framework to simulate attack and defense scenarios, in which agents play different roles responsible for attack, disguise, safety evaluation, and disguise evaluation tasks. After that, we design attack and disguise game algorithms to optimize the game strategies of the attacker and the disguiser and use a curriculum learning process to strengthen the capabilities of the agents. The experiments verify that our method is more effective than other methods in strengthening the model's ability to disguise the defense intent. Moreover, our approach can be applied to any black-box large model to assist it in defense and is not affected by model version iterations.

* 13 pages, 2 figures

LLM-based Privacy Data Augmentation Guided by Knowledge Distillation with a Distribution Tutor for Medical Text Classification

Feb 26, 2024

Yiping Song, Juhua Zhang, Zhiliang Tian, Yuxin Yang, Minlie Huang, Dongsheng Li

Abstract: As sufficient data are not always publicly accessible for model training, researchers exploit limited data with advanced learning algorithms or expand the dataset via data augmentation (DA). Conducting DA in private domains requires privacy protection approaches (i.e., anonymization and perturbation), but those methods cannot provide protection guarantees. Differential privacy (DP) learning methods theoretically bound the protection but are not skilled at generating pseudo text samples with large models. In this paper, we transform the DP-based pseudo-sample generation task into a DP-based generated-sample discrimination task, where we propose a DP-based DA method with an LLM and a DP-based discriminator for text classification on private domains. We construct a knowledge distillation model as the DP-based discriminator: teacher models, accessing private data, teach students how to select private samples with calibrated noise to achieve DP. To constrain the distribution of DA's generation, we propose a DP-based tutor that models the noised private distribution and controls sample generation with a low privacy cost. We theoretically analyze our model's privacy protection and empirically verify our model.
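
The phrase "calibrated noise to achieve DP" can be illustrated with the classic Gaussian mechanism. The following is only a generic sketch, not the discriminator or tutor described in the abstract: the function names `gaussian_sigma` and `privatize`, the example score, and the choice of the Gaussian mechanism itself are assumptions made for illustration. The calibration used here is the standard one stated for 0 < epsilon < 1, with `sensitivity` taken to be the L2 sensitivity of the released quantity.

```python
import math
import random


def gaussian_sigma(sensitivity: float, epsilon: float, delta: float) -> float:
    """Noise scale for the classic Gaussian mechanism.

    sigma = sensitivity * sqrt(2 * ln(1.25 / delta)) / epsilon,
    a calibration that is stated for 0 < epsilon < 1.
    """
    if not 0.0 < epsilon < 1.0:
        raise ValueError("this calibration assumes 0 < epsilon < 1")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie in (0, 1)")
    if sensitivity <= 0.0:
        raise ValueError("sensitivity must be positive")
    return sensitivity * math.sqrt(2.0 * math.log(1.25 / delta)) / epsilon


def privatize(value: float, sensitivity: float, epsilon: float,
              delta: float, rng: random.Random) -> float:
    """Release `value` with zero-mean Gaussian noise of the calibrated scale."""
    sigma = gaussian_sigma(sensitivity, epsilon, delta)
    return value + rng.gauss(0.0, sigma)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical teacher score for one candidate sample (illustrative only).
    score = 0.8
    print(privatize(score, sensitivity=1.0, epsilon=0.5, delta=1e-5, rng=rng))
```

A smaller epsilon or delta gives a larger noise scale, which is the usual trade-off between privacy cost and the usefulness of the released value.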

Improving Meta-learning for Low-resource Text Classification and Generation via Memory Imitation

Mar 22, 2022

Yingxiu Zhao, Zhiliang Tian, Huaxiu Yao, Yinhe Zheng, Dongkyu Lee, Yiping Song, Jian Sun, Nevin L. Zhang

Abstract: Building models of natural language processing (NLP) is challenging in low-resource scenarios where only limited data are available. Optimization-based meta-learning algorithms achieve promising results in low-resource scenarios by adapting a well-generalized model initialization to handle new tasks. Nonetheless, these approaches suffer from the memorization overfitting issue, where the model tends to memorize the meta-training tasks while ignoring support sets when adapting to new tasks. To address this issue, we propose a memory imitation meta-learning (MemIML) method that enhances the model's reliance on support sets for task adaptation. Specifically, we introduce a task-specific memory module to store support set information and construct an imitation module to force query sets to imitate the behaviors of some representative support-set samples stored in the memory. A theoretical analysis is provided to prove the effectiveness of our method, and empirical results also demonstrate that our method outperforms competitive baselines on both text classification and generation tasks.

* ACL 2022

Trivial bundle embeddings for learning graph representations

Dec 05, 2021

Zheng Xie, Xiaojing Zuo, Yiping Song

Abstract: Embedding real-world networks presents challenges because it is not clear how to identify their latent geometries. Embedding some disassortative networks, such as scale-free networks, into Euclidean space has been shown to incur distortions. Embedding scale-free networks into hyperbolic spaces offers an exciting alternative but incurs distortions when embedding assortative networks whose latent geometries are not hyperbolic. We propose an inductive model that leverages both the expressiveness of GCNs and trivial bundles to learn inductive node representations for networks with or without node features. A trivial bundle is a simple case of fiber bundles, a space that is globally a product space of its base space and fiber. The coordinates of the base space and those of the fiber can be used to express the assortative and disassortative factors in generating edges. Therefore, the model has the ability to learn embeddings that can express those factors. In practice, it reduces errors for link prediction and node classification when compared to the Euclidean and hyperbolic GCNs.

* 17 pages, 4 figures

Learning from My Friends: Few-Shot Personalized Conversation Systems via Social Networks

May 21, 2021

Zhiliang Tian, Wei Bi, Zihan Zhang, Dongkyu Lee, Yiping Song, Nevin L. Zhang

Abstract: Personalized conversation models (PCMs) generate responses according to speaker preferences. Existing personalized conversation tasks typically require models to extract speaker preferences from user descriptions or their conversation histories, which are scarce for newcomers and inactive users. In this paper, we propose a few-shot personalized conversation task with an auxiliary social network. The task requires models to generate personalized responses for a speaker given a few conversations from the speaker and a social network. Existing methods are mainly designed to incorporate descriptions or conversation histories. Those methods can hardly model speakers with so few conversations or connections between speakers. To better cater for newcomers with few resources, we propose a personalized conversation model (PCM) that learns to adapt to new speakers and enables new speakers to learn from resource-rich speakers. In particular, based on a meta-learning-based PCM, we propose a task aggregator (TA) to collect other speakers' information from the social network. The TA provides prior knowledge of the new speaker in its meta-learning. Experimental results show that our methods outperform all baselines in appropriateness, diversity, and consistency with speakers.

* Published by AAAI 2021

When does MAML Work the Best? An Empirical Study on Model-Agnostic Meta-Learning in NLP Applications

May 24, 2020

Zequn Liu, Ruiyi Zhang, Yiping Song, Ming Zhang

Abstract: Model-Agnostic Meta-Learning (MAML) has been successfully employed in NLP applications including few-shot text classification and multi-domain low-resource language generation. Many factors, including data quantity, similarity among tasks, and the balance between the general language model and task-specific adaptation, can affect the performance of MAML in NLP, but few works have thoroughly studied them. In this paper, we conduct an empirical study to investigate these factors and conclude when MAML works the best based on the experimental results.
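
For readers unfamiliar with MAML, the idea of learning an initialization that adapts quickly to a new task can be sketched on a toy problem. The example below is a first-order approximation (it ignores second-order terms of full MAML) on one-parameter regression tasks of the form y = a * x. It is not the NLP setup studied in the paper: the task family, the function names `grad`, `adapt`, and `fomaml`, and all hyperparameters are assumptions chosen for illustration.

```python
import random


def grad(w, xs, ys):
    """Gradient of the mean squared error of the model y_hat = w * x."""
    return 2.0 * sum((w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


def adapt(w, xs, ys, inner_lr, steps=1):
    """Inner loop: a few gradient steps on a task's support set."""
    for _ in range(steps):
        w = w - inner_lr * grad(w, xs, ys)
    return w


def fomaml(task_slopes, meta_steps, inner_lr, outer_lr, rng, shots=5):
    """First-order MAML: update the initialization with the query-set
    gradient evaluated at the task-adapted parameter."""
    w = 0.0
    for _ in range(meta_steps):
        a = rng.choice(task_slopes)  # sample one task y = a * x
        support_x = [rng.uniform(-1.0, 1.0) for _ in range(shots)]
        query_x = [rng.uniform(-1.0, 1.0) for _ in range(shots)]
        support_y = [a * x for x in support_x]
        query_y = [a * x for x in query_x]
        w_task = adapt(w, support_x, support_y, inner_lr)
        # First-order approximation: apply the query gradient at w_task to w.
        w = w - outer_lr * grad(w_task, query_x, query_y)
    return w


if __name__ == "__main__":
    rng = random.Random(0)
    w0 = fomaml([1.0, 3.0], meta_steps=2000, inner_lr=0.1, outer_lr=0.05, rng=rng)
    print("meta-learned initialization:", w0)
```

In this toy setting the learned initialization settles between the task slopes, so a single inner step on a few support examples moves it toward whichever task is sampled. Factors such as the number of shots and the spread of the task slopes play the role that data quantity and task similarity play in the abstract.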

Response-Anticipated Memory for On-Demand Knowledge Integration in Response Generation

May 13, 2020

Zhiliang Tian, Wei Bi, Dongkyu Lee, Lanqing Xue, Yiping Song, Xiaojiang Liu, Nevin L. Zhang

Abstract: Neural conversation models are known to generate appropriate but non-informative responses in general. A scenario where informativeness can be significantly enhanced is Conversing by Reading (CbR), where conversations take place with respect to a given external document. In previous work, the external document is utilized by (1) creating a context-aware document memory that integrates information from the document and the conversational context, and then (2) generating responses referring to the memory. In this paper, we propose to create the document memory with some anticipated responses in mind. This is achieved using a teacher-student framework. The teacher is given the external document, the context, and the ground-truth response, and learns how to build a response-aware document memory from three sources of information. The student learns to construct a response-anticipated document memory from the first two sources, and the teacher's insight on memory creation. Empirical results show that our model outperforms the previous state-of-the-art for the CbR task.

* To appear at ACL 2020

Multi-task Learning via Adaptation to Similar Tasks for Mortality Prediction of Diverse Rare Diseases

May 11, 2020

Luchen Liu, Zequn Liu, Haoxian Wu, Zichang Wang, Jianhao Shen, Yiping Song, Ming Zhang

Abstract: Mortality prediction of diverse rare diseases using electronic health record (EHR) data is a crucial task for intelligent healthcare. However, data insufficiency and the clinical diversity of rare diseases make it hard to directly train deep learning models on individual disease data or on all the data from different diseases. Mortality prediction for these patients with different diseases can be viewed as a multi-task learning problem with insufficient data and a large number of tasks. However, tasks with little training data also make it hard to train task-specific modules in multi-task learning models. To address the challenges of data insufficiency and task diversity, we propose an initialization-sharing multi-task learning method (Ada-Sit) which learns the parameter initialization for fast adaptation to dynamically measured similar tasks. We use Ada-Sit to train long short-term memory (LSTM) network-based prediction models on longitudinal EHR data. Experimental results demonstrate that the proposed model is effective for mortality prediction of diverse rare diseases.

* 10 pages, 3 figures, submitted to AMIA Annual Symposium

Learning to Customize Language Model for Generation-based dialog systems

Oct 31, 2019

Yiping Song, Zequn Liu, Wei Bi, Rui Yan, Ming Zhang

Abstract: Personalized conversation systems have received increasing attention recently. Existing personalized conversation models tend to employ the meta-learning framework that first finds the initial parameters, then fine-tunes on a few personal utterances. However, fine-tuning can only make slight changes to the initial parameters, resulting in similar language models for different users. In this paper, we propose to customize a conversation model with unique network structures for each user. Concretely, we introduce a private network to the language model, whose structure will evolve during training to better capture the unique characteristics of the user. The private network is only trained on the corpora of the corresponding user, and similar users can share part of the private structure for data reuse. Experimental results show that our algorithm outperforms all the baselines in terms of personality, quality, and diversity measurements.
