Abstract: This is the system card published alongside the OpenAI GPT-5 launch, August 2025. GPT-5 is a unified system with a smart and fast model that answers most questions, a deeper reasoning model for harder problems, and a real-time router that quickly decides which model to use based on conversation type, complexity, tool needs, and explicit intent (for example, if you say 'think hard about this' in the prompt). The router is continuously trained on real signals, including when users switch models, preference rates for responses, and measured correctness, improving over time. Once usage limits are reached, a mini version of each model handles remaining queries. This system card focuses primarily on gpt-5-thinking and gpt-5-main, while evaluations for other models are available in the appendix. The GPT-5 system not only outperforms previous models on benchmarks and answers questions more quickly but, more importantly, is more useful for real-world queries. We've made significant advances in reducing hallucinations, improving instruction following, and minimizing sycophancy, and have leveled up GPT-5's performance in three of ChatGPT's most common uses: writing, coding, and health. All of the GPT-5 models additionally feature safe-completions, our latest approach to safety training to prevent disallowed content. As with ChatGPT agent, we have decided to treat gpt-5-thinking as High capability in the Biological and Chemical domain under our Preparedness Framework, activating the associated safeguards. While we do not have definitive evidence that this model could meaningfully help a novice to create severe biological harm (our defined threshold for High capability), we have chosen to take a precautionary approach.
Abstract: To better match drivers to riders in our ridesharing application, we revised Lyft's core matching algorithm. We use a novel online reinforcement learning approach that estimates the future earnings of drivers in real time and uses this information to find more efficient matches. This change was the first documented implementation of a ridesharing matching algorithm that can learn and improve in real time. We evaluated the new approach during weeks of switchback experimentation in most Lyft markets, and estimated how it benefited drivers, riders, and the platform. In particular, it enabled our drivers to serve millions of additional riders each year, leading to more than $30 million per year in incremental revenue. Lyft rolled out the algorithm globally in 2021.
Abstract: This paper presents a novel robotic arm system, named PAPRAS (Plug-And-Play Robotic Arm System). PAPRAS consists of one or more portable robotic arms, docking mounts, and a software architecture including a control system. The dimensions and configuration of PAPRAS are determined by analyzing the target task spaces at home. PAPRAS's arm is light (less than 6 kg) with an optimized 3D-printed structure, and it has a high payload (3 kg) for a human-arm-sized manipulator. A locking mechanism is embedded in the structure for better portability, and the 3D-printed docking mount can be installed easily. PAPRAS's software architecture is developed on an open-source framework and optimized for low-latency multiagent-based distributed manipulator control. A process to create new demonstrations is presented to show PAPRAS's ease of use and efficiency. In the paper, simulations and hardware experiments are presented in various demonstrations, including sink-to-dishwasher manipulation, coffee making, mobile manipulation on a quadruped, and a suit-up demo, to validate the hardware and software design.