Abstract: The design-make-test-analyze cycle in early-stage drug discovery remains constrained primarily by the "make" step: small-molecule synthesis is slow, costly, and difficult to scale or automate across diverse chemotypes. Enumerated chemical spaces aim to reduce this bottleneck by predefining synthesizable regions of chemical space from available building blocks and reliable reactions, yet existing commercial spaces are still limited by long turnaround times, narrow reaction scope, and substantial manual decision-making in route selection and execution. Here we present the first version of onepot CORE, an enumerated chemical space containing 3.4B molecules, and a corresponding on-demand synthesis product enabled by an automated synthesis platform and an AI chemist, Phil, that designs, executes, and analyzes experiments. onepot CORE is constructed by (i) selecting a reaction set commonly used in medicinal chemistry, (ii) sourcing and curating building blocks from supplier catalogs, (iii) enumerating candidate products, and (iv) applying ML-based feasibility assessment to prioritize compounds for robust execution. In the current release, the space is supported by seven reactions. We describe the end-to-end workflow, from route selection and automated liquid handling through workup and purification. We further report validation across operational metrics (success rate, timelines, purity, and identity), including NMR confirmation for a representative set of synthesized compounds and assay suitability demonstrated using a series of DPP4 inhibitors. Collectively, onepot CORE illustrates a path toward faster, more reliable access to diverse small molecules, supporting accelerated discovery in pharmaceuticals and beyond.
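
To make construction steps (i)-(iv) concrete, the sketch below shows how a single reaction from such a space could be enumerated and filtered with RDKit. The reaction SMARTS, the tiny building-block lists, and the feasibility_score stub are illustrative assumptions for this sketch only; they are not the onepot CORE reaction set or its ML feasibility model.

```python
from rdkit import Chem
from rdkit.Chem import AllChem

# Step (i): one representative reaction (amide coupling: acid + primary amine).
amide_coupling = AllChem.ReactionFromSmarts(
    "[C:1](=[O:2])[OX2H1].[NX3;H2:3]>>[C:1](=[O:2])[N:3]"
)

# Step (ii): a tiny curated building-block list (SMILES as they might appear
# in a supplier catalog).
acids = [Chem.MolFromSmiles(s) for s in ["OC(=O)c1ccccc1", "OC(=O)CCl"]]
amines = [Chem.MolFromSmiles(s) for s in ["NCc1ccncc1", "NC1CCCC1"]]

def feasibility_score(smiles: str) -> float:
    """Placeholder for an ML model predicting robust-execution probability."""
    return 0.9  # a real model would be trained on historical synthesis outcomes

# Steps (iii)-(iv): enumerate candidate products, then keep only those the
# feasibility model prioritizes for execution.
library = []
for acid in acids:
    for amine in amines:
        for product_set in amide_coupling.RunReactants((acid, amine)):
            product = product_set[0]
            try:
                Chem.SanitizeMol(product)
            except Exception:
                continue  # skip chemically invalid enumeration results
            smiles = Chem.MolToSmiles(product)
            if feasibility_score(smiles) > 0.5:
                library.append(smiles)

print(f"{len(library)} prioritized products")
```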
Abstract: This is the system card published alongside the OpenAI GPT-5 launch, August 2025. GPT-5 is a unified system with a smart and fast model that answers most questions, a deeper reasoning model for harder problems, and a real-time router that quickly decides which model to use based on conversation type, complexity, tool needs, and explicit intent (for example, if you say 'think hard about this' in the prompt). The router is continuously trained on real signals, including when users switch models, preference rates for responses, and measured correctness, improving over time. Once usage limits are reached, a mini version of each model handles remaining queries. This system card focuses primarily on gpt-5-thinking and gpt-5-main, while evaluations for other models are available in the appendix. The GPT-5 system not only outperforms previous models on benchmarks and answers questions more quickly, but -- more importantly -- is more useful for real-world queries. We've made significant advances in reducing hallucinations, improving instruction following, and minimizing sycophancy, and have leveled up GPT-5's performance in three of ChatGPT's most common uses: writing, coding, and health. All of the GPT-5 models additionally feature safe-completions, our latest approach to safety training to prevent disallowed content. Similarly to ChatGPT agent, we have decided to treat gpt-5-thinking as High capability in the Biological and Chemical domain under our Preparedness Framework, activating the associated safeguards. While we do not have definitive evidence that this model could meaningfully help a novice to create severe biological harm -- our defined threshold for High capability -- we have chosen to take a precautionary approach.
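
As a purely hypothetical illustration of the routing behavior described above (not OpenAI's implementation), the sketch below picks gpt-5-thinking or gpt-5-main from simple signals such as explicit intent, tool needs, and an estimated-complexity score, and falls back to a mini variant once usage limits are reached. All function names, features, thresholds, and the "-mini" naming are invented for this sketch.

```python
from dataclasses import dataclass

@dataclass
class Request:
    text: str
    needs_tools: bool
    user_over_limit: bool

def estimated_complexity(text: str) -> float:
    # Stand-in for a learned scorer; the abstract describes a router trained on
    # real signals (model switches, preference rates, measured correctness).
    return min(1.0, len(text.split()) / 200)

def route(req: Request) -> str:
    explicit_reasoning = "think hard" in req.text.lower()
    hard = explicit_reasoning or req.needs_tools or estimated_complexity(req.text) > 0.6
    model = "gpt-5-thinking" if hard else "gpt-5-main"
    if req.user_over_limit:
        model += "-mini"  # a mini version handles queries past the usage limit
    return model

print(route(Request("think hard about this proof", needs_tools=False, user_over_limit=False)))
# -> gpt-5-thinking
```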
Abstract: Despite incredible progress in language models (LMs) in recent years, largely resulting from moving away from specialized models designed for specific tasks to general models based on powerful architectures (e.g. the Transformer) that learn everything from raw data, pre-processing steps such as tokenization remain a barrier to true end-to-end foundation models. We introduce a collection of new techniques that enable a dynamic chunking mechanism which automatically learns content- and context-dependent segmentation strategies jointly with the rest of the model. Incorporating this into an explicit hierarchical network (H-Net) allows replacing the (implicitly hierarchical) tokenization-LM-detokenization pipeline with a single model learned fully end-to-end. When compute- and data-matched, an H-Net with one stage of hierarchy operating at the byte level outperforms a strong Transformer language model operating over BPE tokens. Iterating the hierarchy to multiple stages further increases its performance by modeling multiple levels of abstraction, demonstrating significantly better scaling with data and matching a token-based Transformer of twice its size. H-Nets pretrained on English show significantly increased character-level robustness, and qualitatively learn meaningful data-dependent chunking strategies without any heuristics or explicit supervision. Finally, the H-Net's improvement over tokenized pipelines is further increased in languages and modalities with weaker tokenization heuristics, such as Chinese and code, or DNA sequences (nearly 4x improvement in data efficiency over baselines), showing the potential of true end-to-end models that learn and scale better from unprocessed data.
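
The toy sketch below illustrates the general idea of learned, content-dependent chunking at the byte level: a scorer marks positions that open a new chunk, and each chunk is pooled into one vector for a higher level of the hierarchy. It is an assumption-laden stand-in (hard thresholding, mean pooling, made-up module names), not the H-Net's actual dynamic chunking mechanism, which is differentiable and trained end-to-end with the rest of the model.

```python
import torch
import torch.nn as nn

class DynamicChunker(nn.Module):
    """Scores each byte position; positions above a threshold start a new chunk."""
    def __init__(self, d_model):
        super().__init__()
        self.scorer = nn.Linear(d_model, 1)

    def forward(self, x, threshold=0.5):
        # x: (batch, seq_len, d_model) byte-level representations
        probs = torch.sigmoid(self.scorer(x)).squeeze(-1)  # (batch, seq_len)
        boundaries = probs > threshold                      # hard split (illustrative only)
        return probs, boundaries

# Toy usage: embed raw bytes, predict chunk boundaries, mean-pool each chunk.
d_model = 64
embed = nn.Embedding(256, d_model)              # one embedding per byte value
chunker = DynamicChunker(d_model)

byte_ids = torch.randint(0, 256, (1, 32))       # fake byte sequence
h = embed(byte_ids)
probs, boundaries = chunker(h)

starts = boundaries.clone()
starts[:, 0] = True                              # the first byte always opens a chunk
chunk_ids = starts.long().cumsum(dim=1) - 1      # chunk index per byte position
num_chunks = int(chunk_ids.max().item()) + 1

pooled = torch.zeros(num_chunks, d_model).index_add_(0, chunk_ids[0], h[0])
counts = torch.bincount(chunk_ids[0], minlength=num_chunks).unsqueeze(-1)
pooled = pooled / counts                         # one vector per chunk for the next stage
print(pooled.shape)
```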




Abstract: Track reconstruction is a crucial task in particle experiments and is traditionally very computationally expensive due to its combinatorial nature. Recently, graph neural networks (GNNs) have emerged as a promising approach that can improve scalability. Most of these GNN-based methods, including the edge classification (EC) and the object condensation (OC) approach, require an input graph that needs to be constructed beforehand. In this work, we consider a one-shot OC approach that reconstructs particle tracks directly from a set of hits (point cloud) by recursively applying graph attention networks with an evolving graph structure. This approach iteratively updates the graphs and can better facilitate message passing across each graph. Preliminary studies on the TrackML dataset show better tracking performance compared to methods that require a fixed input graph.
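
A minimal sketch of the evolving-graph idea, assuming plain PyTorch rather than the authors' implementation: each round rebuilds a kNN graph in the current latent space and aggregates neighbor messages with learned attention weights, so the connectivity changes as embeddings improve. Module and parameter names are illustrative, and the condensation head is omitted.

```python
import torch
import torch.nn as nn

class EvolvingGraphAttention(nn.Module):
    """One round: rebuild a kNN graph in latent space, then attend over neighbors."""
    def __init__(self, dim, k=8):
        super().__init__()
        self.k = k
        self.query = nn.Linear(dim, dim)
        self.key = nn.Linear(dim, dim)
        self.value = nn.Linear(dim, dim)
        self.update = nn.Linear(2 * dim, dim)

    def forward(self, h):
        # h: (num_hits, dim) latent embedding of every hit
        dists = torch.cdist(h, h) + torch.eye(h.size(0), device=h.device) * 1e9
        nbr_idx = dists.topk(self.k, largest=False).indices       # (N, k) nearest hits
        q = self.query(h).unsqueeze(1)                            # (N, 1, dim)
        kvecs = self.key(h)[nbr_idx]                              # (N, k, dim)
        v = self.value(h)[nbr_idx]                                # (N, k, dim)
        attn = torch.softmax((q * kvecs).sum(-1) / h.size(-1) ** 0.5, dim=-1)
        msg = (attn.unsqueeze(-1) * v).sum(dim=1)                 # aggregated messages
        return torch.relu(self.update(torch.cat([h, msg], dim=-1)))

# Toy usage: embed 3D hit positions, then iterate attention rounds. Each round
# re-derives the graph from the updated embeddings, so hits from the same track
# can drift together and become neighbors in later iterations.
hits = torch.rand(200, 3)                        # fake (x, y, z) hit positions
encoder = nn.Linear(3, 32)
layer = EvolvingGraphAttention(dim=32, k=8)
h = torch.relu(encoder(hits))
for _ in range(3):
    h = layer(h)
# In an OC setup, h would feed a condensation head that clusters hits into tracks.
```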