Abstract: Recent advances in learning reusable motion priors have demonstrated their effectiveness in generating naturalistic behaviors. In this paper, we propose a new learning framework in this paradigm for controlling physics-based characters with significantly improved motion quality and diversity over existing state-of-the-art methods. The proposed method uses reinforcement learning (RL) to initially track and imitate life-like movements from unstructured motion clips using a discrete information bottleneck, as adopted in the Vector Quantized Variational AutoEncoder (VQ-VAE). This structure compresses the most relevant information from the motion clips into a compact yet informative latent space, i.e., a discrete space over vector-quantized codes. By sampling codes from this space according to a trained categorical prior distribution, high-quality life-like behaviors can be generated, similar to the usage of VQ-VAE in computer vision. Although this prior distribution can be trained with the supervision of the encoder's output, it follows the distribution of the original motion clips in the dataset and can lead to imbalanced behaviors in our setting. To address this issue, we further propose a technique named prior shifting, which adjusts the prior distribution using curiosity-driven RL. The resulting distribution is demonstrated to offer sufficient behavioral diversity and significantly facilitates upper-level policy learning for downstream tasks. We conduct comprehensive experiments with humanoid characters on two challenging downstream tasks: sword-shield striking and a two-player boxing game. Our results demonstrate that the proposed framework is capable of controlling the character to perform high-quality movements in terms of behavioral strategies, diversity, and realism. Videos, code, and data are available at https://tencent-roboticsx.github.io/NCP/.
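As a rough illustration of the discrete bottleneck and categorical prior described above, the sketch below shows a generic VQ-VAE-style quantizer and code sampling in PyTorch. It is not the authors' implementation: the codebook size, latent dimension, commitment weight, and the `prior_logits` parameter are illustrative assumptions, and prior shifting (the curiosity-driven adjustment of the prior) is only indicated in a comment.

```python
# Minimal sketch (not the paper's code) of a VQ-VAE-style discrete bottleneck
# and sampling from a categorical prior over the learned codes.
import torch
import torch.nn as nn
import torch.nn.functional as F

class VectorQuantizer(nn.Module):
    def __init__(self, num_codes=256, code_dim=64, beta=0.25):
        super().__init__()
        self.codebook = nn.Embedding(num_codes, code_dim)
        self.codebook.weight.data.uniform_(-1.0 / num_codes, 1.0 / num_codes)
        self.beta = beta  # commitment-loss weight (assumed value)

    def forward(self, z_e):
        # z_e: (batch, code_dim) continuous encoder output
        dists = torch.cdist(z_e, self.codebook.weight)   # (batch, num_codes)
        idx = dists.argmin(dim=-1)                        # nearest-code indices
        z_q = self.codebook(idx)                          # quantized latent
        # Codebook + commitment losses; straight-through gradient to the encoder.
        loss = F.mse_loss(z_q, z_e.detach()) + self.beta * F.mse_loss(z_e, z_q.detach())
        z_q = z_e + (z_q - z_e).detach()
        return z_q, idx, loss

# Generating behaviors: draw code indices from a categorical prior over the codebook.
# In the framework described above, a curiosity-driven RL objective would then
# re-shape these logits ("prior shifting") to rebalance rarely sampled behaviors.
num_codes = 256
prior_logits = nn.Parameter(torch.zeros(num_codes))  # trained to match encoder outputs
vq = VectorQuantizer(num_codes=num_codes)
idx = torch.distributions.Categorical(logits=prior_logits).sample((8,))
z_q = vq.codebook(idx)  # latent codes fed to the low-level motion policy
```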
Abstract: Deemed the third generation of neural networks, event-driven Spiking Neural Networks (SNNs), combined with bio-plausible local learning rules, are promising for building low-power neuromorphic hardware. However, because of the non-linearity and discrete nature of spiking neural networks, the training of SNNs remains difficult and is still under discussion. Originating from gradient descent, backpropagation has achieved stunning success in multi-layer SNNs. Nevertheless, it is generally considered to lack biological plausibility and consumes relatively high computational resources. In this paper, we propose a novel learning algorithm inspired by predictive coding theory and show that it can perform supervised learning fully autonomously, as successfully as backpropagation, using only local Hebbian plasticity. This method achieves favorable performance compared with state-of-the-art multi-layer SNNs: test accuracies of 99.25% on the Caltech Face/Motorbike dataset, 84.25% on the ETH-80 dataset, 98.1% on the MNIST dataset, and 98.5% on the neuromorphic N-MNIST dataset. Furthermore, our work provides a new perspective on how supervised learning algorithms can be directly implemented in spiking neural circuitry, which may offer new insights into neuromorphic computation in neuroscience.
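To make the learning principle concrete, the sketch below shows a generic predictive-coding update in which every weight change uses only locally available quantities (presynaptic activity times a local prediction error), i.e., Hebbian-style plasticity driven toward a supervised target. It is not the paper's algorithm: it uses a rate-coded (non-spiking) two-layer network, and the layer sizes, learning rate, and relaxation schedule are assumptions for illustration only.

```python
# Generic predictive-coding sketch with purely local, Hebbian-style updates
# (a non-spiking simplification; all hyperparameters are assumed).
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hid, n_out = 784, 100, 10
W1 = rng.normal(0, 0.05, (n_hid, n_in))   # input -> hidden weights
W2 = rng.normal(0, 0.05, (n_out, n_hid))  # hidden -> output weights

def train_step(x, y, lr=0.01, infer_steps=20, dt=0.1):
    """x: (n_in,) input activity, y: (n_out,) one-hot target (output layer clamped to y)."""
    global W1, W2
    h = W1 @ x                        # hidden value nodes, initialized by the forward prediction
    for _ in range(infer_steps):      # inference: relax hidden nodes to reduce local errors
        e_hid = h - W1 @ x            # prediction error at the hidden layer
        e_out = y - W2 @ np.tanh(h)   # prediction error at the clamped output layer
        h += dt * (-e_hid + (1 - np.tanh(h) ** 2) * (W2.T @ e_out))
    # Local plasticity: each synapse pairs its presynaptic activity with the
    # prediction error available at its postsynaptic site.
    W2 += lr * np.outer(e_out, np.tanh(h))
    W1 += lr * np.outer(e_hid, x)
    return e_out

# Usage: one supervised update on a random input with target class 3.
x = rng.random(n_in)
y = np.zeros(n_out); y[3] = 1.0
residual_error = train_step(x, y)
```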