Abstract:LLMs have demonstrated impressive proficiency in generating coherent and high-quality text, making them valuable across a range of text-generation tasks. However, rigorous evaluation of this generated content is crucial, as ensuring its quality remains a significant challenge due to persistent issues such as factual inaccuracies and hallucinations. This paper introduces two fine-tuned general-purpose LLM autoevaluators, REC-12B and REC-70B, specifically designed to evaluate generated text across several dimensions: faithfulness, instruction following, coherence, and completeness. These models not only provide ratings for these metrics but also offer detailed explanations and verifiable citations, thereby enhancing trust in the content. Moreover, the models support various citation modes, accommodating different requirements for latency and granularity. Extensive evaluations on diverse benchmarks demonstrate that our general-purpose LLM auto-evaluator, REC-70B, outperforms state-of-the-art LLMs, excelling in content evaluation by delivering better quality explanations and citations with minimal bias. It achieves Rank \#1 as a generative model on the RewardBench leaderboard\footnote{\url{https://huggingface.co/spaces/allenai/reward-bench}} under the model name \texttt{TextEval-Llama3.1-70B}. Our REC dataset and models are released at \url{https://github.com/adelaidehsu/REC}.
Abstract:In this paper we present Hybrid iterative Linear Quadratic Estimation (HiLQE), an optimization based offline state estimation algorithm for hybrid dynamical systems. We utilize the saltation matrix, a first order approximation of the variational update through an event driven hybrid transition, to calculate gradient information through hybrid events in the backward pass of an iterative linear quadratic optimization over state estimates. This enables accurate computation of the value function approximation at each timestep. Additionally, the forward pass in the iterative algorithm is augmented with hybrid dynamics in the rollout. A reference extension method is used to account for varying impact times when comparing states for the feedback gain in noise calculation. The proposed method is demonstrated on an ASLIP hopper system with position measurements. In comparison to the Salted Kalman Filter (SKF), the algorithm presented here achieves a maximum of 63.55% reduction in estimation error magnitude over all state dimensions near impact events.
Abstract:Prior research has investigated the benefits and costs of double-anonymous review (DAR, also known as double-blind review) in comparison to single-anonymous review (SAR) and open review (OR). Several review papers have attempted to compile experimental results in peer review research both broadly and in engineering and computer science. This document summarizes prior research in peer review that may inform decisions about the format of peer review in the field of robotics and makes some recommendations for potential next steps for robotics publication.
Abstract:Perception, Planning, and Control form the essential components of autonomy in advanced air mobility. This work advances the holistic integration of these components to enhance the performance and robustness of the complete cyber-physical system. We adapt Perception Simplex, a system for verifiable collision avoidance amidst obstacle detection faults, to the vertical landing maneuver for autonomous air mobility vehicles. We improve upon this system by replacing static assumptions of control capabilities with dynamic confirmation, i.e., real-time confirmation of control limitations of the system, ensuring reliable fulfillment of safety maneuvers and overrides, without dependence on overly pessimistic assumptions. Parameters defining control system capabilities and limitations, e.g., maximum deceleration, are continuously tracked within the system and used to make safety-critical decisions. We apply these techniques to propose a verifiable collision avoidance solution for autonomous aerial mobility vehicles operating in cluttered and potentially unsafe environments.
Abstract:Hybrid dynamical systems, i.e. systems that have both continuous and discrete states, are ubiquitous in engineering, but are difficult to work with due to their discontinuous transitions. For example, a robot leg is able to exert very little control effort while it is in the air compared to when it is on the ground. When the leg hits the ground, the penetrating velocity instantaneously collapses to zero. These instantaneous changes in dynamics and discontinuities (or jumps) in state make standard smooth tools for planning, estimation, control, and learning difficult for hybrid systems. One of the key tools for accounting for these jumps is called the saltation matrix. The saltation matrix is the sensitivity update when a hybrid jump occurs and has been used in a variety of fields including robotics, power circuits, and computational neuroscience. This paper presents an intuitive derivation of the saltation matrix and discusses what it captures, where it has been used in the past, how it is used for linear and quadratic forms, how it is computed for rigid body systems with unilateral constraints, and some of the structural properties of the saltation matrix in these cases.
Abstract:Robots operating in close proximity to humans rely heavily on human trust in them to successfully complete their tasks. But what are the real outcomes when this trust is violated? Self-defense law provides a framework for analyzing tangible failure scenarios that can inform the design of robots and their algorithms. Studying self-defense is particularly important for ground robots since they operate within public human environments, where they can pose a legitimate threat to human safety. Moreover, even if ground robots can guarantee human safety, the perception of a threat is enough to warrant human self-defense against robots. In this paper, we synthesize works in law, engineering, and the social sciences to present four actionable recommendations for how the robotics community can craft robots to mitigate the likelihood of self-defense situations arising. We establish how current U.S. self-defense law can justify a human protecting themselves against a robot, discuss the current literature on human attitudes toward robots, and analyze methods that have been produced to allow robots to operate close to humans. Finally, we present hypothetical scenarios that underscore how current robot navigation methods can fail to sufficiently consider self-defense concerns and the need for the recommendations to guide improvements in the field.
Abstract:In order to perform highly dynamic and agile maneuvers, legged robots typically spend time in underactuated domains (e.g. with feet off the ground) where the system has limited command of its acceleration and a constrained amount of time before transitioning to a new domain (e.g. foot touchdown). Meanwhile, these transitions can have instantaneous, unbounded effects on perturbations. These properties make it difficult for local feedback controllers to effectively recover from disturbances as the system evolves through underactuated domains and hybrid impact events. To address this, we utilize the fundamental solution matrix that characterizes the evolution of perturbations through a hybrid trajectory and its 2-norm, which represents the worst-case growth of perturbations. In this paper, the worst-case perturbation analysis is used to explicitly reason about the tracking performance of a hybrid trajectory and is incorporated in an iLQR framework to optimize a trajectory while taking into account the closed-loop convergence of the trajectory under an LQR tracking controller. The generated convergent trajectories are able to recover more effectively from perturbations, are more robust to large disturbances, and use less feedback control effort than trajectories generated with traditional optimization methods.
Abstract:Many controllers for legged robotic systems leverage open- or closed-loop control at discrete hybrid events to enhance stability. These controllers appear in several well studied phenomena such as the Raibert stepping controller, paddle juggling and swing leg retraction. This work introduces hybrid event shaping (HES): a generalized method for analyzing and producing stable hybrid event controllers. HES utilizes the saltation matrix, which gives a closed-form equation for the effect that hybrid events have on stability. We also introduce shape parameters, which are higher order terms that can be tuned completely independently from the system dynamics to promote stability. Optimization methods are used to produce values of these parameters that optimize a stability measure. Hybrid event shaping captures previously developed control methods while also producing new optimally stable trajectories without the need for continuous-domain feedback.