Abstract:Large Vision-Language Models (LVLMs), due to the remarkable visual reasoning ability to understand images and videos, have received widespread attention in the autonomous driving domain, which significantly advances the development of interpretable end-to-end autonomous driving. However, current evaluations of LVLMs primarily focus on the multi-faceted capabilities in common scenarios, lacking quantifiable and automated assessment in autonomous driving contexts, let alone severe road corner cases that even the state-of-the-art autonomous driving perception systems struggle to handle. In this paper, we propose CODA-LM, a novel vision-language benchmark for self-driving, which provides the first automatic and quantitative evaluation of LVLMs for interpretable autonomous driving including general perception, regional perception, and driving suggestions. CODA-LM utilizes the texts to describe the road images, exploiting powerful text-only large language models (LLMs) without image inputs to assess the capabilities of LVLMs in autonomous driving scenarios, which reveals stronger alignment with human preferences than LVLM judges. Experiments demonstrate that even the closed-sourced commercial LVLMs like GPT-4V cannot deal with road corner cases well, suggesting that we are still far from a strong LVLM-powered intelligent driving agent, and we hope our CODA-LM can become the catalyst to promote future development.
Abstract:Powered ankle prostheses effectively assist people with lower limb amputation to perform daily activities. High performance prostheses with adjustable compliance and capability to predict and implement amputee's intent are crucial for them to be comparable to or better than a real limb. However, current designs fail to provide simple yet effective compliance of the joint with full potential of modification, and lack accurate gait prediction method in real time. This paper proposes an innovative design of powered ankle prosthesis with serial elastic actuator (SEA), and puts forward a MLP based gait recognition method that can accurately and continuously predict more gait parameters for motion sensing and control. The prosthesis mimics biological joint with similar weight, torque, and power which can assist walking of up to 4 m/s. A new design of planar torsional spring is proposed for the SEA, which has better stiffness, endurance, and potential of modification than current designs. The gait recognition system simultaneously generates locomotive speed, gait phase, ankle angle and angular velocity only utilizing signals of single IMU, holding advantage in continuity, adaptability for speed range, accuracy, and capability of multi-functions.