Xinyuan Fang

Adapting Image-based RL Policies via Predicted Rewards

Jul 23, 2024

Weiyao Wang, Xinyuan Fang, Gregory D. Hager

Abstract: Image-based reinforcement learning (RL) faces significant challenges in generalization when the visual environment undergoes substantial changes between training and deployment. Under such circumstances, learned policies may not perform well, leading to degraded results. Previous approaches to this problem have largely focused on broadening the training observation distribution, employing techniques like data augmentation and domain randomization. However, given the sequential nature of the RL decision-making problem, it is often the case that residual errors are propagated by the learned policy model and accumulate throughout the trajectory, resulting in highly degraded performance. In this paper, we leverage the observation that predicted rewards under domain shift, even though imperfect, can still be a useful signal to guide fine-tuning. We exploit this property to fine-tune a policy using reward prediction in the target domain. We have found that, even under significant domain shift, the predicted reward can still provide a meaningful signal, and fine-tuning substantially improves the original policy. Our approach, termed Predicted Reward Fine-tuning (PRFT), improves performance across diverse tasks in both simulated benchmarks and real-world experiments. More information is available at the project web page: https://sites.google.com/view/prft.

* L4DC 2024
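
The core idea in the abstract above, that an imperfect reward prediction can still steer fine-tuning in a useful direction, can be illustrated on a toy problem. The sketch below is not the paper's PRFT implementation: it assumes a three-action bandit instead of an image-based sequential task, uses exact policy-gradient ascent instead of a full RL algorithm, and all numbers (`TRUE_REWARDS`, `PREDICTED_REWARDS`, `INITIAL_LOGITS`) are hypothetical values chosen for illustration.

```python
import math


def softmax(logits):
    """Convert logits into action probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def expected_reward(logits, rewards):
    """Expected reward of the softmax policy under a given reward vector."""
    return sum(p * r for p, r in zip(softmax(logits), rewards))


def finetune_on_predicted_reward(logits, predicted_rewards, lr=0.5, steps=200):
    """Gradient ascent on the expected *predicted* reward (true reward is never used)."""
    logits = list(logits)
    for _ in range(steps):
        probs = softmax(logits)
        baseline = sum(p * r for p, r in zip(probs, predicted_rewards))
        # For a softmax policy: dE[r_hat]/dlogit_a = pi(a) * (r_hat(a) - E[r_hat])
        logits = [
            l + lr * p * (r - baseline)
            for l, p, r in zip(logits, probs, predicted_rewards)
        ]
    return logits


# Hypothetical toy values (assumptions, not taken from the paper).
TRUE_REWARDS = [0.1, 0.4, 1.0]        # unknown to the learner in the target domain
PREDICTED_REWARDS = [0.3, 0.2, 0.7]   # imperfect, but still favors the best action
INITIAL_LOGITS = [2.0, 0.0, 0.0]      # shifted policy that prefers the worst action

before = expected_reward(INITIAL_LOGITS, TRUE_REWARDS)
tuned_logits = finetune_on_predicted_reward(INITIAL_LOGITS, PREDICTED_REWARDS)
after = expected_reward(tuned_logits, TRUE_REWARDS)
print(f"true expected reward before: {before:.3f}, after: {after:.3f}")
```

In this toy setting the predicted rewards are wrong in absolute terms and even misorder the two weaker actions, yet fine-tuning against them still raises the true expected reward, because the prediction remains informative about which action is best.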

Skydiver: A Spiking Neural Network Accelerator Exploiting Spatio-Temporal Workload Balance

Mar 14, 2022

Qinyu Chen, Chang Gao, Xinyuan Fang, Haitao Luan

Abstract: Spiking Neural Networks (SNNs) are developed as a promising alternative to Artificial Neural Networks (ANNs) due to their more realistic brain-inspired computing models. SNNs have sparse neuron firing over time, i.e., spatio-temporal sparsity; thus, they are useful to enable energy-efficient hardware inference. However, exploiting spatio-temporal sparsity of SNNs in hardware leads to unpredictable and unbalanced workloads, degrading the energy efficiency. In this work, we propose an FPGA-based convolutional SNN accelerator called Skydiver that exploits spatio-temporal workload balance. We propose the Approximate Proportional Relation Construction (APRC) method that can predict the relative workload channel-wise and a Channel-Balanced Workload Schedule (CBWS) method to increase the hardware workload balance ratio to over 90%. Skydiver was implemented on a Xilinx XC7Z045 FPGA and verified on image segmentation and MNIST classification tasks. Results show improved throughput by 1.4X and 1.2X for the two tasks. Skydiver achieved 22.6 KFPS throughput and 42.4 µJ/image prediction energy on the classification task with 98.5% accuracy.

* Accepted to be published in the IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2022
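
To make the notion of channel-wise workload balancing in the Skydiver abstract concrete, the following sketch compares a workload-agnostic round-robin assignment with a greedy longest-processing-time assignment. It is an illustration under stated assumptions, not the paper's APRC or CBWS method: the balance ratio is assumed here to be mean load divided by maximum load, the greedy heuristic is a generic scheduling technique swapped in for illustration, and `PREDICTED_WORKLOADS` holds hypothetical per-channel values.

```python
import heapq


def balance_ratio(loads):
    """Mean load over max load; 1.0 means perfectly balanced (assumed definition)."""
    peak = max(loads)
    return sum(loads) / (len(loads) * peak) if peak > 0 else 1.0


def round_robin_schedule(workloads, num_groups):
    """Assign channels to compute groups in index order, ignoring workload."""
    loads = [0.0] * num_groups
    for channel, work in enumerate(workloads):
        loads[channel % num_groups] += work
    return loads


def greedy_balanced_schedule(workloads, num_groups):
    """Give the heaviest remaining channel to the currently least-loaded group."""
    heap = [(0.0, group) for group in range(num_groups)]
    heapq.heapify(heap)
    assignment = [[] for _ in range(num_groups)]
    order = sorted(range(len(workloads)), key=lambda c: workloads[c], reverse=True)
    for channel in order:
        load, group = heapq.heappop(heap)
        assignment[group].append(channel)
        heapq.heappush(heap, (load + workloads[channel], group))
    loads = [sum(workloads[c] for c in channels) for channels in assignment]
    return assignment, loads


# Hypothetical predicted relative workload per channel (assumption).
PREDICTED_WORKLOADS = [9, 1, 8, 2, 7, 1, 6, 2]

naive_loads = round_robin_schedule(PREDICTED_WORKLOADS, num_groups=2)
assignment, balanced_loads = greedy_balanced_schedule(PREDICTED_WORKLOADS, num_groups=2)
print("round-robin balance ratio:", balance_ratio(naive_loads))
print("greedy balance ratio:", balance_ratio(balanced_loads))
```

The point of the comparison is that a scheduler can only balance work it can estimate in advance, which is why a per-channel workload prediction is a prerequisite for this kind of assignment.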