Shuang Liu

Provable Robust Overfitting Mitigation in Wasserstein Distributionally Robust Optimization

Mar 06, 2025

Shuang Liu, Yihan Wang, Yifan Zhu, Yibo Miao, Xiao-Shan Gao

Abstract:Wasserstein distributionally robust optimization (WDRO) optimizes against worst-case distributional shifts within a specified uncertainty set, leading to enhanced generalization on unseen adversarial examples compared to standard adversarial training, which focuses on pointwise adversarial perturbations. However, WDRO still suffers fundamentally from the robust overfitting problem, as it does not consider statistical error. We address this gap by proposing a novel robust optimization framework, called the Statistically Robust WDRO, under a new uncertainty set that accounts for adversarial noise via the Wasserstein distance and for statistical error via the Kullback-Leibler divergence. We establish a robust generalization bound for the new optimization framework, implying that out-of-distribution adversarial performance is at least as good as the statistically robust training loss with high probability. Furthermore, we derive conditions under which Stackelberg and Nash equilibria exist between the learner and the adversary, giving an optimal robust model in a certain sense. Finally, through extensive experiments, we demonstrate that our method significantly mitigates robust overfitting and enhances robustness within the framework of WDRO.
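
For orientation, the generic WDRO objective that the abstract builds on is usually written as below. This is the textbook form, not the paper's Statistically Robust WDRO objective: the symbols ($\hat{P}_n$ for the empirical distribution, $\varepsilon$ for the Wasserstein radius, $\ell$ for the loss) are assumed notation, and the abstract does not say how the Wasserstein and Kullback-Leibler terms are combined into the new uncertainty set.

```latex
% Standard WDRO: minimize the worst-case expected loss over a Wasserstein ball
\min_{\theta} \; \sup_{Q \,:\, W_p(Q, \hat{P}_n) \le \varepsilon}
  \; \mathbb{E}_{z \sim Q}\bigl[\ell(\theta; z)\bigr]

% Kullback-Leibler divergence, the quantity the abstract uses for statistical error
D_{\mathrm{KL}}(P \,\|\, Q) = \int \log\!\left(\frac{dP}{dQ}\right) dP
```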

* ICLR 2025

Generalization Bound and New Algorithm for Clean-Label Backdoor Attack

Jun 02, 2024

Lijia Yu, Shuang Liu, Yibo Miao, Xiao-Shan Gao, Lijun Zhang

Abstract:The generalization bound is a crucial theoretical tool for assessing the generalizability of learning methods, and there is a vast literature on the generalizability of normal learning, adversarial learning, and data poisoning. Unlike other data poisoning attacks, the backdoor attack has the special property that the poisoned triggers are contained in both the training set and the test set, and the purpose of the attack is two-fold. To our knowledge, the generalization bound for the backdoor attack has not been established. In this paper, we fill this gap by deriving algorithm-independent generalization bounds in the clean-label backdoor attack scenario. Precisely, based on the goals of the backdoor attack, we give upper bounds for the clean sample population errors and the poison population errors in terms of the empirical error on the poisoned training dataset. Furthermore, based on the theoretical result, a new clean-label backdoor attack is proposed that computes the poisoning trigger by combining adversarial noise and indiscriminate poison. We show its effectiveness in a variety of settings.

Game-Theoretic Unlearnable Example Generator

Jan 31, 2024

Shuang Liu, Yihan Wang, Xiao-Shan Gao

Abstract:Unlearnable example attacks are data poisoning attacks aiming to degrade the clean test accuracy of deep learning by adding imperceptible perturbations to the training samples, which can be formulated as a bi-level optimization problem. However, directly solving this optimization problem is intractable for deep neural networks. In this paper, we investigate unlearnable example attacks from a game-theoretic perspective, by formulating the attack as a non-zero-sum Stackelberg game. First, the existence of game equilibria is proved under the normal setting and the adversarial training setting. It is shown that the game equilibrium gives the most powerful poison attack in that the victim has the lowest test accuracy among all networks within the same hypothesis space, when certain loss functions are used. Second, we propose a novel attack method, called the Game Unlearnable Example (GUE), which has three main ingredients. (1) The poisons are obtained by directly solving the equilibrium of the Stackelberg game with a first-order algorithm. (2) We employ an autoencoder-like generative network model as the poison attacker. (3) A novel payoff function is introduced to evaluate the performance of the poison. Comprehensive experiments demonstrate that GUE can effectively poison the model in various scenarios. Furthermore, GUE still works when only a relatively small percentage of the training data is used to train the generator, and the poison generator generalizes well to unseen data. Our implementation code can be found at https://github.com/hong-xian/gue.
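
As a rough illustration of the bi-level problem mentioned in the abstract, a generic data-poisoning formulation can be written as follows. The notation is assumed here ($\delta_i$ for the per-sample perturbation, $\epsilon$ for its budget, $f_\theta$ for the victim network, $L$ for the loss, $\mathcal{D}$ for the clean data distribution); the paper's own Stackelberg formulation and payoff function may differ.

```latex
% Outer problem: the attacker picks bounded perturbations that maximize clean test loss
\max_{\delta \,:\, \|\delta_i\| \le \epsilon} \;
  \mathbb{E}_{(x, y) \sim \mathcal{D}}
  \bigl[ L\bigl(f_{\theta^{*}(\delta)}(x),\, y\bigr) \bigr]

% Inner problem: the victim trains on the poisoned training set
\text{s.t.} \quad \theta^{*}(\delta) \in \arg\min_{\theta} \;
  \frac{1}{n} \sum_{i=1}^{n} L\bigl(f_{\theta}(x_i + \delta_i),\, y_i\bigr)
```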

Data-Dependent Stability Analysis of Adversarial Training

Jan 06, 2024

Yihan Wang, Shuang Liu, Xiao-Shan Gao

Abstract:Stability analysis is an essential aspect of studying the generalization ability of deep learning, as it involves deriving generalization bounds for stochastic gradient descent-based training algorithms. Adversarial training is the most widely used defense against adversarial example attacks. However, previous generalization bounds for adversarial training have not included information regarding the data distribution. In this paper, we fill this gap by providing generalization bounds for stochastic gradient descent-based adversarial training that incorporate data distribution information. We utilize the concepts of on-average stability and high-order approximate Lipschitz conditions to examine how changes in data distribution and adversarial budget can affect robust generalization gaps. Our derived generalization bounds for both convex and non-convex losses are at least as good as the uniform stability-based counterparts which do not include data distribution information. Furthermore, our findings demonstrate how distribution shifts from data poisoning attacks can impact robust generalization.
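
To make "stochastic gradient descent-based adversarial training" concrete, here is a minimal sketch on a linear logistic model, where the inner maximization over an L-infinity budget has a closed form. It is only an illustration of the training loop the abstract analyzes, not the paper's experimental setup; the function names, the toy data, and the hyperparameters (`eps`, `lr`, `epochs`) are all assumptions.

```python
import numpy as np

def adversarial_sgd(X, y, eps=0.1, lr=0.1, epochs=50, seed=0):
    """SGD adversarial training of a linear classifier; labels y are in {-1, +1}."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(epochs):
        for i in rng.permutation(n):
            # Inner max: for a linear model the worst L-inf perturbation of
            # size eps moves each coordinate against the label along sign(w).
            x_adv = X[i] - eps * y[i] * np.sign(w)
            margin = y[i] * (x_adv @ w + b)
            # Derivative of log(1 + exp(-margin)) w.r.t. the score, x_adv held fixed.
            g = -y[i] / (1.0 + np.exp(margin))
            w -= lr * g * x_adv
            b -= lr * g
    return w, b

def robust_accuracy(w, b, X, y, eps):
    """Fraction of points still classified correctly under any L-inf attack of size eps."""
    worst_margin = y * (X @ w + b) - eps * np.abs(w).sum()
    return float((worst_margin > 0).mean())

# Hypothetical toy data: two linearly separable clusters.
X_demo = np.array([[2.0, 2.0], [3.0, 1.0], [1.0, 3.0],
                   [-2.0, -2.0], [-3.0, -1.0], [-1.0, -3.0]])
y_demo = np.array([1.0, 1.0, 1.0, -1.0, -1.0, -1.0])

w_demo, b_demo = adversarial_sgd(X_demo, y_demo, eps=0.1)
print(robust_accuracy(w_demo, b_demo, X_demo, y_demo, eps=0.1))
```

The adversarial budget `eps` enters both the training perturbation and the robust evaluation, which is the quantity whose effect on the robust generalization gap the abstract refers to.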

A Pilot Study on Dialogue-Level Dependency Parsing for Chinese

May 21, 2023

Gongyao Jiang, Shuang Liu, Meishan Zhang, Min Zhang

Abstract:Dialogue-level dependency parsing has received insufficient attention, especially for Chinese. To this end, we draw on ideas from syntactic dependency and rhetorical structure theory (RST), developing a high-quality human-annotated corpus, which contains 850 dialogues and 199,803 dependencies. Considering that such tasks suffer from high annotation costs, we investigate zero-shot and few-shot scenarios. Based on an existing syntactic treebank, we adopt a signal-based method to transform seen syntactic dependencies into unseen ones between elementary discourse units (EDUs), where the signals are detected by masked language modeling. Besides, we apply single-view and multi-view data selection to access reliable pseudo-labeled instances. Experimental results show the effectiveness of these baselines. Moreover, we discuss several crucial points about our dataset and approach.

A Review on Machine Theory of Mind

Mar 21, 2023

Yuanyuan Mao, Shuang Liu, Pengshuai Zhao, Qin Ni, Xin Lin, Liang He

Abstract:Theory of Mind (ToM) is the ability to attribute mental states to others and is a basis of human cognition. There is growing interest in AI with cognitive abilities, for example in healthcare and the motoring industry. Beliefs, desires, and intentions are among the earliest abilities of infants and the foundation of human cognitive ability, as well as of machines with ToM. In this paper, we review recent progress in machine ToM on beliefs, desires, and intentions. We introduce the experiments, datasets, and methods of machine ToM on these three aspects, summarize the development of different tasks and datasets in recent years, and compare well-performing models in terms of advantages, limitations, and applicable conditions, hoping that this study can help researchers quickly keep up with the latest trends in this field. Unlike other domains with a specific task and resolution framework, machine ToM lacks a unified instruction and a series of standard evaluation tasks, which makes it difficult to formally compare the proposed models. We argue that one way to address this difficulty is to present standard assessment criteria and a standard dataset, preferably a large-scale dataset covering multiple aspects of ToM.

* arXiv admin note: text overlap with arXiv:2102.11938 by other authors

Wukong-Reader: Multi-modal Pre-training for Fine-grained Visual Document Understanding

Dec 19, 2022

Haoli Bai, Zhiguang Liu, Xiaojun Meng, Wentao Li, Shuang Liu, Nian Xie, Rongfu Zheng, Liangwei Wang, Lu Hou, Jiansheng Wei(+2 more)

Abstract:Unsupervised pre-training on millions of digital-born or scanned documents has shown promising advances in visual document understanding (VDU). While various vision-language pre-training objectives are studied in existing solutions, the document textline, as an intrinsic granularity in VDU, has seldom been explored so far. A document textline usually contains words that are spatially and semantically correlated, which can be easily obtained from OCR engines. In this paper, we propose Wukong-Reader, trained with new pre-training objectives to leverage the structural knowledge nested in document textlines. We introduce textline-region contrastive learning to achieve fine-grained alignment between the visual regions and texts of document textlines. Furthermore, masked region modeling and textline-grid matching are also designed to enhance the visual and layout representations of textlines. Experiments show that our Wukong-Reader has superior performance on various VDU tasks such as information extraction. The fine-grained alignment over textlines also empowers Wukong-Reader with promising localization ability.

Micro-Vibration Modes Reconstruction Based on Micro-Doppler Coincidence Imaging

Aug 30, 2022

Shuang Liu, Chenjin Deng, Chaoran Wang, Zunwang Bo, Shensheng Han, Zihuai Lin

Abstract:Micro-vibration, a ubiquitous natural phenomenon, can be seen as a characteristic feature of objects. These vibrations have tiny amplitudes, much smaller than the wavelengths of the sensing systems, so the motion information is reflected only in the phase term of the echo. Conventional radar systems can normally detect these micro-vibrations through time-frequency analysis, but the vibration characteristics are then reflected only in the time-frequency spectrum, and the spatial distribution of the micro-vibrations cannot be reconstructed precisely. Ghost imaging (GI), a novel imaging method also known as coincidence imaging that originated in the quantum and optical fields, can reconstruct unknown images using computational methods. To reconstruct the spatial distribution of micro-vibrations, this paper proposes a new method based on a coincidence imaging system. A detailed model of target micro-vibration is created first, taking into account two categories: discrete and continuous targets. Based on the complex target models and time-frequency analysis, we use the first-order field correlation feature to obtain the distributions of the different micro-vibrations of the target.

Achieve Optimal Adversarial Accuracy for Adversarial Deep Learning using Stackelberg Game

Jul 17, 2022

Xiao-Shan Gao, Shuang Liu, Lijia Yu

Abstract:Adversarial deep learning aims to train DNNs that are robust against adversarial attacks, and is one of the major research focuses of deep learning. Game theory has been used to answer some of the basic questions about adversarial deep learning, such as the existence of a classifier with optimal robustness and the existence of optimal adversarial samples for a given class of classifiers. In most previous work, adversarial deep learning was formulated as a simultaneous game, and the strategy spaces were assumed to be certain probability distributions in order for the Nash equilibrium to exist. However, this assumption does not apply to the practical situation. In this paper, we give answers to these basic questions for the practical case where the classifiers are DNNs with a given structure, by formulating adversarial deep learning as sequential games. The existence of Stackelberg equilibria for these games is proved. Furthermore, it is shown that the equilibrium DNN has the largest adversarial accuracy among all DNNs with the same structure, when Carlini-Wagner's margin loss is used. The trade-off between robustness and accuracy in adversarial deep learning is also studied from a game-theoretic perspective.
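
For readers unfamiliar with the margin loss named in the abstract, the sketch below computes the form commonly attributed to Carlini and Wagner: the gap between the largest wrong-class logit and the true-class logit, clipped below at `-kappa`. The function name, the `kappa` parameter, and the sign convention are assumptions; the paper's exact definition may differ.

```python
import numpy as np

def cw_margin_loss(logits, label, kappa=0.0):
    """Largest wrong-class logit minus the true-class logit, floored at -kappa.

    A positive value means the sample is misclassified; a negative value
    means it is classified correctly with that much margin.
    """
    logits = np.asarray(logits, dtype=float)
    true_logit = logits[label]
    best_other = np.delete(logits, label).max()
    return float(max(best_other - true_logit, -kappa))

# Example logits for a three-class problem (hypothetical values).
example_logits = [2.0, 1.0, 0.5]
print(cw_margin_loss(example_logits, label=0, kappa=5.0))
print(cw_margin_loss(example_logits, label=1))
```

Because the loss depends only on the sign and size of the logit gap, it tracks classification error more directly than cross-entropy does.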

Improving Policy Optimization with Generalist-Specialist Learning

Jun 26, 2022

Zhiwei Jia, Xuanlin Li, Zhan Ling, Shuang Liu, Yiran Wu, Hao Su

Abstract:Generalization in deep reinforcement learning over unseen environment variations usually requires policy learning over a large set of diverse training variations. We empirically observe that an agent trained on many variations (a generalist) tends to learn faster at the beginning, yet its performance plateaus at a less optimal level for a long time. In contrast, an agent trained only on a few variations (a specialist) can often achieve high returns under a limited computational budget. To have the best of both worlds, we propose a novel generalist-specialist training framework. Specifically, we first train a generalist on all environment variations; when it fails to improve, we launch a large population of specialists with weights cloned from the generalist, each trained to master a selected small subset of variations. We finally resume the training of the generalist with auxiliary rewards induced by demonstrations of all specialists. In particular, we investigate the timing to start specialist training and compare strategies to learn generalists with assistance from specialists. We show that this framework pushes the envelope of policy learning on several challenging and popular benchmarks including Procgen, Meta-World and ManiSkill.

* ICML 2022
