Xiaoyu Wen

Contrastive Representation for Data Filtering in Cross-Domain Offline Reinforcement Learning

May 10, 2024

Xiaoyu Wen, Chenjia Bai, Kang Xu, Xudong Yu, Yang Zhang, Xuelong Li, Zhen Wang

Abstract: Cross-domain offline reinforcement learning leverages source domain data with diverse transition dynamics to alleviate the data requirement for the target domain. However, simply merging the data of two domains leads to performance degradation due to the dynamics mismatch. Existing methods address this problem by measuring the dynamics gap via domain classifiers, while relying on assumptions about the transferability of the paired domains. In this paper, we propose a novel representation-based approach to measure the domain gap, where the representation is learned through a contrastive objective by sampling transitions from different domains. We show that such an objective recovers the mutual-information gap of transition functions in two domains without the unboundedness that the dynamics gap suffers from when the domains differ significantly. Based on the representations, we introduce a data filtering algorithm that selectively shares transitions from the source domain according to the contrastive score functions. Empirical results on various tasks demonstrate that our method achieves superior performance: using only 10% of the target data, it reaches 89.2% of the performance that state-of-the-art methods obtain with 100% of the target dataset.

* This paper has been accepted by ICML 2024
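
The abstract describes two steps: learning a score with a contrastive objective, then sharing only the source transitions that score well. The sketch below illustrates that general idea only and is not the paper's algorithm. It uses a generic InfoNCE-style loss as a stand-in for the contrastive objective, and the names `info_nce_loss`, `filter_source_transitions`, and `keep_ratio` are hypothetical. Scores are assumed to be precomputed by some learned score function, with higher meaning closer to the target dynamics.

```python
import math
from typing import List, Sequence, Tuple

# A transition is (state, action, next_state); the layout is an assumption.
Transition = Tuple[Sequence[float], Sequence[float], Sequence[float]]


def info_nce_loss(pos_score: float, neg_scores: Sequence[float]) -> float:
    """Generic InfoNCE loss: -log(exp(pos) / (exp(pos) + sum(exp(neg)))).

    Stand-in for a contrastive objective; not taken from the paper.
    """
    logits = [pos_score, *neg_scores]
    m = max(logits)  # subtract the max for a numerically stable log-sum-exp
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - pos_score


def filter_source_transitions(
    transitions: Sequence[Transition],
    scores: Sequence[float],
    keep_ratio: float = 0.25,  # hypothetical hyperparameter
) -> List[Transition]:
    """Keep the top `keep_ratio` fraction of source transitions by score."""
    if len(transitions) != len(scores):
        raise ValueError("transitions and scores must have the same length")
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")
    if not transitions:
        return []
    n_keep = max(1, int(len(transitions) * keep_ratio))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    kept = sorted(ranked[:n_keep])  # restore the original dataset order
    return [transitions[i] for i in kept]


if __name__ == "__main__":
    data = [([float(i)], [0.0], [float(i) + 1.0]) for i in range(4)]
    shared = filter_source_transitions(data, [0.1, 0.9, 0.5, 0.3], keep_ratio=0.5)
    print(len(shared), "of", len(data), "source transitions shared")
```

In this sketch the filtered source transitions would be merged with the target dataset before running an offline RL algorithm. How the score is trained and how the threshold is chosen are left open here.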

Towards Robust Offline-to-Online Reinforcement Learning via Uncertainty and Smoothness

Sep 29, 2023

Xiaoyu Wen, Xudong Yu, Rui Yang, Chenjia Bai, Zhen Wang

Abstract: To obtain a near-optimal policy with fewer interactions in Reinforcement Learning (RL), a promising approach is to combine offline RL, which enhances sample efficiency by leveraging offline datasets, with online RL, which explores informative transitions by interacting with the environment. Offline-to-Online (O2O) RL provides a paradigm for improving an offline-trained agent with limited online interactions. However, due to the significant distribution shift between online experiences and offline data, most offline RL algorithms suffer from performance drops and fail to achieve stable policy improvement in O2O adaptation. To address this problem, we propose the Robust Offline-to-Online (RO2O) algorithm, designed to enhance offline policies through uncertainty and smoothness, and to mitigate the performance drop in online adaptation. Specifically, RO2O incorporates a Q-ensemble for the uncertainty penalty and adversarial samples for policy and value smoothness, which enable RO2O to maintain a consistent learning procedure in online adaptation without requiring special changes to the learning objective. Theoretical analyses in linear MDPs demonstrate that uncertainty and smoothness lead to a tighter optimality bound in O2O under distribution shift. Experimental results illustrate the superiority of RO2O in facilitating stable offline-to-online learning and achieving significant improvement with limited online interactions.
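
The abstract mentions a Q-ensemble used as an uncertainty penalty but does not give its form. One common way to build such a penalty is to subtract a multiple of the ensemble's standard deviation from its mean, and the sketch below shows that form as an assumption rather than as RO2O's actual objective. The function name `penalized_q` and the weight `beta` are hypothetical.

```python
from statistics import fmean, pstdev
from typing import Sequence


def penalized_q(q_values: Sequence[float], beta: float = 1.0) -> float:
    """Pessimistic value estimate: ensemble mean minus beta * ensemble std.

    `q_values` holds the ensemble members' estimates for one (state, action)
    pair. Larger disagreement between members lowers the estimate.
    """
    if not q_values:
        raise ValueError("q_values must contain at least one estimate")
    return fmean(q_values) - beta * pstdev(q_values)


if __name__ == "__main__":
    # Members agree: no penalty is applied.
    print(penalized_q([2.0, 2.0, 2.0]))
    # Members disagree: the estimate drops below the mean of 2.0.
    print(penalized_q([1.0, 3.0], beta=1.0))
```

Under this assumed form, state-action pairs far from the data, where ensemble members tend to disagree, receive lower value estimates, which is the usual motivation for ensemble-based pessimism.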

AttendSeg: A Tiny Attention Condenser Neural Network for Semantic Segmentation on the Edge

Apr 29, 2021

Xiaoyu Wen, Mahmoud Famouri, Andrew Hryniowski, Alexander Wong

Abstract: In this study, we introduce AttendSeg, a low-precision, highly compact deep neural network tailored for on-device semantic segmentation. AttendSeg possesses a self-attention network architecture comprising light-weight attention condensers for improved spatial-channel selective attention at very low complexity. The unique macro-architecture and micro-architecture design properties of AttendSeg strike a strong balance between representational power and efficiency, achieved via a machine-driven design exploration strategy tailored specifically for the task at hand. Experimental results demonstrate that the proposed AttendSeg can achieve segmentation accuracy comparable to much larger deep neural networks with greater complexity while possessing significantly lower architectural and computational complexity (requiring as much as >27x fewer MACs, >72x fewer parameters, and >288x lower weight memory requirements), making it well-suited for TinyML applications on the edge.

* 5 pages
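
The abstract reports >72x fewer parameters but >288x lower weight memory. The gap between the two figures is a factor of 4, which would be consistent with storing weights at a quarter of the baseline bit width. The sketch below works through that arithmetic under the assumption of 32-bit baseline weights and 8-bit AttendSeg weights. The abstract only says "low-precision" and does not state these bit widths, and the function name `weight_memory_ratio` is hypothetical.

```python
def weight_memory_ratio(param_ratio: float, baseline_bits: int, compact_bits: int) -> float:
    """Memory reduction = parameter reduction * (baseline bits / compact bits)."""
    if baseline_bits <= 0 or compact_bits <= 0:
        raise ValueError("bit widths must be positive")
    return param_ratio * baseline_bits / compact_bits


if __name__ == "__main__":
    # Assumed bit widths: 32-bit baseline, 8-bit compact network.
    print(weight_memory_ratio(72, baseline_bits=32, compact_bits=8))
```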
