Abstract:This paper proposes a novel method to improve locomotion learning for a single quadruped robot using multi-agent deep reinforcement learning (MARL). Many existing methods use single-agent reinforcement learning for an individual robot or MARL for the cooperative task in multi-robot systems. Unlike existing methods, this paper proposes using MARL for the locomotion learning of a single quadruped robot. We develop a learning structure called Multi-Agent Reinforcement Learning for Single Quadruped Robot Locomotion (MASQ), considering each leg as an agent to explore the action space of the quadruped robot, sharing a global critic, and learning collaboratively. Experimental results indicate that MASQ not only speeds up learning convergence but also enhances robustness in real-world settings, suggesting that applying MASQ to single robots such as quadrupeds could surpass traditional single-robot reinforcement learning approaches. Our study provides insightful guidance on integrating MARL with single-robot locomotion learning.
Abstract:Semantic consistency recognition aims to detect and judge whether the semantics of two text sentences are consistent with each other. However, the existing methods usually encounter the challenges of synonyms, polysemy and difficulty to understand long text. To solve the above problems, this paper proposes a co-driven semantic consistency recognition method based on the fusion of Transformer and HowNet sememes knowledge. Multi-level encoding of internal sentence structures via data-driven is carried out firstly by Transformer, sememes knowledge base HowNet is introduced for knowledge-driven to model the semantic knowledge association among sentence pairs. Then, interactive attention calculation is carried out utilizing soft-attention and fusion the knowledge with sememes matrix. Finally, bidirectional long short-term memory network (BiLSTM) is exploited to encode the conceptual semantic information and infer the semantic consistency. Experiments are conducted on two financial text matching datasets (BQ, AFQMC) and a cross-lingual adversarial dataset (PAWSX) for paraphrase identification. Compared with lightweight models including DSSM, MwAN, DRCN, and pre-training models such as ERNIE etc., the proposed model can not only improve the accuracy of semantic consistency recognition effectively (by 2.19%, 5.57% and 6.51% compared with the DSSM, MWAN and DRCN models on the BQ dataset), but also reduce the number of model parameters (to about 16M). In addition, driven by the HowNet sememes knowledge, the proposed method is promising to adapt to scenarios with long text.