Xinwen Hou

From Easy to Hard: Building a Shortcut for Differentially Private Image Synthesis

Apr 02, 2025

Kecen Li, Chen Gong, Xiaochen Li, Yuzhong Zhao, Xinwen Hou, Tianhao Wang

Abstract: Differentially private (DP) image synthesis aims to generate synthetic images from a sensitive dataset, alleviating the privacy leakage concerns of organizations that share and use synthetic images. Although previous methods have made significant progress, especially in training diffusion models on sensitive images with DP Stochastic Gradient Descent (DP-SGD), their performance remains unsatisfactory. In this work, inspired by curriculum learning, we propose a two-stage DP image synthesis framework in which diffusion models learn to generate DP synthetic images from easy to hard. Unlike existing methods that directly use DP-SGD to train diffusion models, we propose an easy stage at the beginning, in which diffusion models learn simple features of the sensitive images. To facilitate this easy stage, we propose to use 'central images', which are simply aggregations of random samples of the sensitive dataset. Intuitively, although those central images do not show details, they capture useful characteristics of all images and incur only minimal privacy costs, thus helping early-phase model training. Experiments show that, averaged over the four investigated image datasets, the fidelity and utility metrics of our synthetic images are 33.1% and 2.1% better than those of the state-of-the-art method.

* Accepted at IEEE S&P (Oakland) 2025; code available at https://github.com/SunnierLee/DP-FETA
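
The 'central images' idea in the abstract above (aggregating random samples of the sensitive dataset) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `central_images`, the flat-list image representation, the Gaussian noise step, and the `noise_std` parameter are all assumptions made for illustration, and calibrating the noise to a concrete (ε, δ) budget is not shown.

```python
import random


def central_images(images, num_central, subset_size, noise_std, rng):
    """Average random subsets of images and perturb each mean with Gaussian noise.

    images: list of flat pixel lists with values in [0, 1] (assumed layout).
    noise_std: hypothetical noise scale; a real DP mechanism would derive it
    from the sensitivity of the mean and the privacy budget.
    """
    centers = []
    n_pixels = len(images[0])
    for _ in range(num_central):
        # Draw a random subset without replacement.
        subset = rng.sample(images, subset_size)
        # Pixel-wise mean of the subset: the "aggregation" step.
        mean = [sum(img[i] for img in subset) / subset_size for i in range(n_pixels)]
        # Add noise, then clamp back to the valid pixel range.
        noisy = [m + rng.gauss(0.0, noise_std) for m in mean]
        centers.append([min(1.0, max(0.0, v)) for v in noisy])
    return centers


if __name__ == "__main__":
    rng = random.Random(0)
    data = [[rng.random() for _ in range(4)] for _ in range(100)]
    for center in central_images(data, num_central=2, subset_size=50, noise_std=0.01, rng=rng):
        print([round(v, 3) for v in center])
```

Averaging over a larger subset shrinks each image's influence on the mean, which is one intuition for why such aggregates can be released at a small privacy cost.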

DAS3D: Dual-modality Anomaly Synthesis for 3D Anomaly Detection

Oct 13, 2024

Kecen Li, Bingquan Dai, Jingjing Fu, Xinwen Hou

Abstract: Synthesizing anomaly samples has proven to be an effective strategy for self-supervised 2D industrial anomaly detection. However, this approach has rarely been explored in multi-modality anomaly detection, particularly involving 3D and RGB images. In this paper, we propose a novel dual-modality augmentation method for 3D anomaly synthesis, which is simple and capable of mimicking the characteristics of 3D defects. Building on our anomaly synthesis method, we introduce a reconstruction-based discriminative anomaly detection network, in which a dual-modal discriminator fuses the original and reconstructed embeddings of the two modalities for anomaly detection. Additionally, we design an augmentation dropout mechanism to enhance the generalizability of the discriminator. Extensive experiments show that our method outperforms the state-of-the-art methods on detection precision and achieves competitive segmentation performance on both the MVTec 3D-AD and Eyescandies datasets.

Keep Various Trajectories: Promoting Exploration of Ensemble Policies in Continuous Control

Oct 17, 2023

Chao Li, Chen Gong, Qiang He, Xinwen Hou

Abstract: The combination of deep reinforcement learning (DRL) with ensemble methods has proven highly effective in addressing complex sequential decision-making problems. This success can be primarily attributed to the use of multiple models, which enhances both the robustness of the policy and the accuracy of value function estimation. However, there has been limited analysis of the empirical success of current ensemble RL methods thus far. Our new analysis reveals that the sample efficiency of previous ensemble DRL algorithms may be limited by sub-policies that are not as diverse as they could be. Motivated by these findings, our study introduces a new ensemble RL algorithm, termed Trajectories-awarE Ensemble exploratioN (TEEN). The primary goal of TEEN is to maximize the expected return while promoting more diverse trajectories. Through extensive experiments, we demonstrate that TEEN not only enhances the sample diversity of the ensemble policy compared to using sub-policies alone but also improves the performance over ensemble RL algorithms. On average, TEEN outperforms the baseline ensemble DRL algorithms by 41% in performance on the tested representative environments.

Recover Triggered States: Protect Model Against Backdoor Attack in Reinforcement Learning

Apr 10, 2023

Hao Chen, Chen Gong, Yizhe Wang, Xinwen Hou

Abstract: A backdoor attack allows a malicious user to manipulate the environment or corrupt the training data, thus inserting a backdoor into the trained agent. Such attacks compromise the RL system's reliability, leading to potentially catastrophic results in various key fields. Yet relatively little research has investigated effective defenses against backdoor attacks in RL. This paper proposes the Recovery Triggered States (RTS) method, a novel approach that effectively protects the victim agents from backdoor attacks. RTS involves building a surrogate network to approximate the dynamics model. Developers can then recover the environment from the triggered state to a clean state, thereby preventing attackers from activating backdoors hidden in the agent by presenting the trigger. When training the surrogate to predict states, we incorporate agent action information to reduce the discrepancy between the actions taken by the agent on predicted states and the actions taken on real states. RTS is the first approach to defend against backdoor attacks in a single-agent setting. Our results show that, with RTS, the cumulative reward decreases by only 1.41% under the backdoor attack.

Centralized Cooperative Exploration Policy for Continuous Control Tasks

Jan 06, 2023

Chao Li, Chen Gong, Qiang He, Xinwen Hou, Yu Liu

Abstract: Deep reinforcement learning (DRL) algorithms perform well on a variety of complex control tasks. This success can be partly attributed to DRL encouraging intelligent agents to sufficiently explore the environment and collect diverse experiences during training. Exploration therefore plays a significant role in reaching an optimal policy for DRL. Although recent works have made great progress in continuous control tasks, exploration in these tasks remains insufficiently investigated. To explicitly encourage exploration in continuous control tasks, we propose CCEP (Centralized Cooperative Exploration Policy), which uses underestimation and overestimation of value functions to maintain the capacity for exploration. CCEP first keeps two value functions initialized with different parameters, and generates diverse policies with multiple exploration styles from this pair of value functions. In addition, a centralized policy framework ensures that CCEP achieves message delivery between multiple policies, which further contributes to exploring the environment cooperatively. Extensive experimental results demonstrate that CCEP achieves higher exploration capacity. Empirical analysis shows diverse exploration styles in the policies learned by CCEP, which lets them cover more exploration regions. This exploration capacity ensures that CCEP outperforms the current state-of-the-art methods across the multiple continuous control tasks shown in the experiments.

* 13 pages. Accepted by AAMAS 2023 (extended abstract). This document is the full version of our paper.

Unsupervised Domain Adaptation GAN Inversion for Image Editing

Nov 22, 2022

Siyu Xing, Chen Gong, Hewei Guo, Xiao-Yu Zhang, Xinwen Hou, Yu Liu

Abstract: Existing GAN inversion methods work well for high-quality image reconstruction and editing but struggle to find the corresponding high-quality images for low-quality inputs. Therefore, recent works are directed toward leveraging the supervision of paired high-quality and low-quality images for inversion. However, these methods are infeasible in real-world scenarios and further hinder performance improvement. In this paper, we resolve this problem by introducing Unsupervised Domain Adaptation (UDA) into the inversion process, namely UDA-Inversion, for both high-quality and low-quality image inversion and editing. In particular, UDA-Inversion first regards the high-quality and low-quality images as the source domain and the unlabeled target domain, respectively. Then, a discrepancy function is presented to measure the difference between the two domains, after which we minimize the source error and the discrepancy between the distributions of the two domains in the latent space to obtain accurate latent codes for low-quality images. Without direct supervision, constructive representations of high-quality images can be spontaneously learned and transformed into low-quality images based on unsupervised domain adaptation. Experimental results indicate that UDA-Inversion is the first method to achieve a level of performance comparable to supervised methods on low-quality images across multiple domain datasets. We hope this work provides a unique inspiration for latent embedding distributions in image processing tasks.

Mind Your Data! Hiding Backdoors in Offline Reinforcement Learning Datasets

Oct 07, 2022

Chen Gong, Zhou Yang, Yunpeng Bai, Junda He, Jieke Shi, Arunesh Sinha, Bowen Xu, Xinwen Hou, Guoliang Fan, David Lo

Abstract: A growing body of research has focused on the Offline Reinforcement Learning (RL) paradigm. Data providers share large pre-collected datasets on which others can train high-quality agents without interacting with the environments. Such an offline RL paradigm has demonstrated effectiveness in many critical tasks, including robot control, autonomous driving, etc. A well-trained agent can be regarded as a software system. However, less attention is paid to investigating the security threats to the offline RL system. In this paper, we focus on a critical security threat: backdoor attacks. Given normal observations, an agent implanted with backdoors takes actions leading to high rewards. However, the same agent takes actions that lead to low rewards if the observations are injected with triggers that can activate the backdoor. We propose Baffle (Backdoor Attack for Offline Reinforcement Learning) and evaluate how different offline RL algorithms react to this attack. Our experiments conducted on four tasks and four offline RL algorithms expose a disquieting fact: none of the existing offline RL algorithms is immune to such a backdoor attack. More specifically, Baffle modifies 10% of the datasets for four tasks (3 robotic controls and 1 autonomous driving). Agents trained on the poisoned datasets perform well in normal settings. However, when triggers are presented, the agents' performance decreases drastically by 63.6%, 57.8%, 60.8% and 44.7% in the four tasks on average. The backdoor still persists after fine-tuning poisoned agents on clean datasets. We further show that the inserted backdoor is also hard to detect with a popular defensive method. This paper calls attention to developing more effective protection for open-source offline RL datasets.

* 13 pages, 6 figures

Representation Gap in Deep Reinforcement Learning

May 29, 2022

Qiang He, Huangyuan Su, Jieyu Zhang, Xinwen Hou

Abstract: Deep reinforcement learning promises that an agent can learn a good policy from high-dimensional information, while representation learning removes irrelevant and redundant information and retains pertinent information. We consider the representation capacity of the action value function and theoretically reveal its inherent property, the representation gap with its target action value function. This representation gap is favorable. However, through illustrative experiments, we show that the representation of the action value function grows similarly to that of its target value function, i.e. the undesirable inactivity of the representation gap (representation overlap). Representation overlap results in a loss of representation capacity, which further leads to sub-optimal learning performance. To activate the representation gap, we propose a simple but effective framework, Policy Optimization from Preventing Representation Overlaps (POPRO), which regularizes the policy evaluation phase by making the representation of the action value function differ from that of its target. We also provide a convergence rate guarantee for POPRO. We evaluate POPRO on gym continuous control suites. The empirical results show that POPRO using pixel inputs outperforms or parallels the sample-efficiency of methods that use state-based features.

* 24 pages, 6 figures

Value Function Factorisation with Hypergraph Convolution for Cooperative Multi-agent Reinforcement Learning

Dec 09, 2021

Yunpeng Bai, Chen Gong, Bin Zhang, Guoliang Fan, Xinwen Hou

Abstract: Cooperation between agents in a multi-agent system (MAS) has become a hot topic in recent years, and many algorithms based on centralized training with decentralized execution (CTDE), such as VDN and QMIX, have been proposed. However, these methods disregard the information hidden in the individual action values. In this paper, we propose HyperGraph CoNvolution MIX (HGCN-MIX), a method that combines hypergraph convolution with value decomposition. By treating action values as signals, HGCN-MIX aims to explore the relationship between these signals via a self-learning hypergraph. Experimental results show that HGCN-MIX matches or surpasses state-of-the-art techniques on the StarCraft II multi-agent challenge (SMAC) benchmark in various scenarios, notably those with a number of agents.

* 6 pages, 3 figures
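
To make "treating action values as signals" on a hypergraph more concrete, here is a minimal sketch of a single hypergraph propagation step. It is a simplified illustration, not HGCN-MIX itself: the function name `hypergraph_propagate` is hypothetical, the incidence matrix is fixed and binary (the abstract describes a self-learning hypergraph), and no learnable weights or mixing network are included.

```python
def hypergraph_propagate(values, incidence):
    """One unweighted propagation step over a hypergraph.

    values: one scalar signal per node (here, per-agent action values).
    incidence[i][e]: 1 if node i belongs to hyperedge e, else 0
    (a fixed binary incidence matrix is an assumption of this sketch).
    Step 1 averages node signals inside each hyperedge; step 2 averages
    the hyperedge signals back onto each node.
    """
    n = len(values)
    m = len(incidence[0])
    edge_degree = [sum(incidence[i][e] for i in range(n)) for e in range(m)]
    node_degree = [sum(incidence[i]) for i in range(n)]

    # Node -> hyperedge: mean signal of the members of each hyperedge.
    edge_signal = [
        sum(incidence[i][e] * values[i] for i in range(n)) / edge_degree[e]
        if edge_degree[e] else 0.0
        for e in range(m)
    ]
    # Hyperedge -> node: mean signal of the hyperedges a node belongs to.
    # Nodes in no hyperedge keep their original value.
    return [
        sum(incidence[i][e] * edge_signal[e] for e in range(m)) / node_degree[i]
        if node_degree[i] else values[i]
        for i in range(n)
    ]


if __name__ == "__main__":
    # Three agents; hyperedge 0 groups agents 0 and 1, hyperedge 1 holds agent 2.
    q_values = [1.0, 3.0, 5.0]
    H = [[1, 0], [1, 0], [0, 1]]
    print(hypergraph_propagate(q_values, H))  # [2.0, 2.0, 5.0]
```

A hyperedge can connect any number of agents at once, which is what distinguishes this from ordinary graph convolution, where each edge links exactly two nodes.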

Combing Policy Evaluation and Policy Improvement in a Unified f-Divergence Framework

Sep 24, 2021

Chen Gong, Qiang He, Yunpeng Bai, Xiaoyu Chen, Xinwen Hou, Yu Liu, Guoliang Fan

Abstract: The framework of deep reinforcement learning (DRL) provides a powerful and widely applicable mathematical formalization for sequential decision-making. In this paper, we start by studying the f-divergence between the learning policy and the sampling policy and derive a novel DRL framework, termed f-Divergence Reinforcement Learning (FRL). We highlight that the policy evaluation and policy improvement phases are induced by minimizing the f-divergence between the learning policy and the sampling policy, which is distinct from the conventional DRL objective of maximizing the expected cumulative rewards. In addition, we convert this framework to a saddle-point optimization problem with a specific f function through the Fenchel conjugate, which consists of policy evaluation and policy improvement. We then derive new policy evaluation and policy improvement methods in FRL. Our framework may give new insights for analyzing DRL algorithms. The FRL framework achieves two advantages: (1) the policy evaluation and policy improvement processes are derived simultaneously from the f-divergence; (2) the overestimation issue of the value function is alleviated. To evaluate the effectiveness of the FRL framework, we conduct experiments on Atari 2600 video games, which show that our framework matches or surpasses the DRL algorithms we tested.

* 24 pages, 5 figures
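
For readers unfamiliar with the f-divergence that the abstract above builds on, the following sketch computes the textbook discrete form D_f(P || Q) = Σ_x q(x) f(p(x)/q(x)). It only illustrates the definition and is not the FRL algorithm: the function names are hypothetical, the two distributions are toy stand-ins for a learning policy and a sampling policy at a single state, and which policy plays the role of P or Q in the paper is not stated in the abstract.

```python
import math


def f_divergence(p, q, f):
    """Discrete f-divergence D_f(P || Q) = sum_x q(x) * f(p(x) / q(x)).

    Assumes q(x) > 0 for every outcome and that f is convex with f(1) = 0.
    """
    return sum(qx * f(px / qx) for px, qx in zip(p, q))


def kl_generator(t):
    """f(t) = t * log(t), which yields the KL divergence KL(P || Q)."""
    return t * math.log(t) if t > 0 else 0.0


def tv_generator(t):
    """f(t) = |t - 1| / 2, which yields the total variation distance."""
    return 0.5 * abs(t - 1.0)


if __name__ == "__main__":
    learning_policy = [0.5, 0.5]    # toy action probabilities (assumed)
    sampling_policy = [0.25, 0.75]  # toy action probabilities (assumed)
    print(f_divergence(learning_policy, sampling_policy, kl_generator))
    print(f_divergence(learning_policy, sampling_policy, tv_generator))
```

Different choices of f recover different familiar divergences, which is why a framework phrased in terms of a generic f can cover a family of objectives at once.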
