Abstract:Most interesting problems in robotics (e.g., locomotion and manipulation) are realized through intermittent contact with the environment. Due to the perception and modeling errors, assuming an exact time for establishing contact with the environment is unrealistic. On the other hand, handling uncertainties in contact timing is notoriously difficult as it gives rise to either handling uncertain complementarity systems or solving combinatorial optimization problems at run-time. This work presents a novel optimal control formulation to find robust control policies under contact timing uncertainties. Our main novelty lies in casting the stochastic problem to a deterministic optimization over the uncertainty set that ensures robustness criterion satisfaction of candidate pre-contact states and optimizes for contact-relevant objectives. This way, we only need to solve a manageable standard nonlinear programming problem without complementarity constraints or combinatorial explosion. Our simulation results on multiple simplified locomotion and manipulation tasks demonstrate the robustness of our uncertainty-aware formulation compared to the nominal optimal control formulation.
Abstract:Incorporating a robotic manipulator into a wheel-legged robot enhances its agility and expands its potential for practical applications. However, the presence of potential instability and uncertainties presents additional challenges for control objectives. In this paper, we introduce an arm-constrained curriculum learning architecture to tackle the issues introduced by adding the manipulator. Firstly, we develop an arm-constrained reinforcement learning algorithm to ensure safety and stability in control performance. Additionally, to address discrepancies in reward settings between the arm and the base, we propose a reward-aware curriculum learning method. The policy is first trained in Isaac gym and transferred to the physical robot to do dynamic grasping tasks, including the door-opening task, fan-twitching task and the relay-baton-picking and following task. The results demonstrate that our proposed approach effectively controls the arm-equipped wheel-legged robot to master dynamic grasping skills, allowing it to chase and catch a moving object while in motion. Please refer to our website (https://acodedog.github.io/wheel-legged-loco-manipulation) for the code and supplemental videos.
Abstract:Large language models (LLMs) have demonstrated remarkable performance on a variety of natural language tasks based on just a few examples of natural language instructions, reducing the need for extensive feature engineering. However, most powerful LLMs are closed-source or limited in their capability for languages other than English. In this technical report, we present Baichuan 2, a series of large-scale multilingual language models containing 7 billion and 13 billion parameters, trained from scratch, on 2.6 trillion tokens. Baichuan 2 matches or outperforms other open-source models of similar size on public benchmarks like MMLU, CMMLU, GSM8K, and HumanEval. Furthermore, Baichuan 2 excels in vertical domains such as medicine and law. We will release all pre-training model checkpoints to benefit the research community in better understanding the training dynamics of Baichuan 2.
Abstract:The legged robots with variable stiffness actuators (VSAs) can achieve energy-efficient and versatile locomotion. However, equipping legged robots with VSAs in real-world application is usually restricted by (i) the redundant mechanical structure design, (ii) limited stiffness variation range and speed, and (iii) high energy consumption in stiffness modulation. In this paper, we present a novel Variable-Length Leaf-Spring Actuator (VLLSA) in legged robots that aims to address the aforementioned limitations. The design is based on leaf-spring mechanism and we improve the structural design to make the proposed VSA (i) compact and lightweight in mechanical structure, (ii) precise in theoretical modeling, and (iii) capable of modulating stiffness with wide range, fast speed, and low energy consumption. Hardware experiments validate that the legged robot equipped with the proposed VLLSA has compact structure, high dynamic performance and low energy consumption.
Abstract:Wheeled bipedal robots have the capability to execute agile and versatile locomotion tasks in unknown terrains, with balancing being a key criteria in evaluating their dynamic performance. This paper focuses on enhancing the balancing performance of wheeled bipedal robots through innovations in both hardware and software aspects. A bio-inspired mechanical design, inspired by the human barbell squat, is proposed and implemented to achieve an efficient distribution of load onto the limb joints. This design improves knee torque joint efficiency and facilitates control over the distribution of the center of mass (CoM). Meanwhile, a customized balance model, namely the wheeled linear inverted pendulum (wLIP), is developed. The wLIP surpasses other alternatives by providing a more accurate estimation of wheeled robot dynamics while ensuring balancing stability. Experimental results demonstrate that the robot is capable of maintaining balance while manipulating pelvis states and CoM velocity; furthermore, it exhibits robustness against external disturbances and unknown terrains.
Abstract:Endowing a chatbot with personality or an identity is quite challenging but critical to deliver more realistic and natural conversations. In this paper, we address the issue of generating responses that are coherent to a pre-specified agent profile. We design a model consisting of three modules: a profile detector to decide whether a post should be responded using the profile and which key should be addressed, a bidirectional decoder to generate responses forward and backward starting from a selected profile value, and a position detector that predicts a word position from which decoding should start given a selected profile value. We show that general conversation data from social media can be used to generate profile-coherent responses. Manual and automatic evaluation shows that our model can deliver more coherent, natural, and diversified responses.