Abstract:Large language models (LLMs) are trained from vast repositories of text authored by millions of distinct authors, reflecting an enormous diversity of human traits. While these models bear the potential to be used as approximations of human subjects in behavioral studies, prior efforts have been limited in steering model responses to match individual human users. In this work, we introduce "Anthology", a method for conditioning LLMs to particular virtual personas by harnessing open-ended life narratives, which we refer to as "backstories." We show that our methodology enhances the consistency and reliability of experimental outcomes while ensuring better representation of diverse sub-populations. Across three nationally representative human surveys conducted as part of Pew Research Center's American Trends Panel (ATP), we demonstrate that Anthology achieves up to 18% improvement in matching the response distributions of human respondents and 27% improvement in consistency metrics. Our code and generated backstories are available at https://github.com/CannyLab/anthology.
Abstract:Diffusion transformers (DiTs) have exhibited remarkable performance in visual generation tasks, such as generating realistic images or videos based on textual instructions. However, larger model sizes and multi-frame processing for video generation lead to increased computational and memory costs, posing challenges for practical deployment on edge devices. Post-Training Quantization (PTQ) is an effective method for reducing memory costs and computational complexity. When quantizing diffusion transformers, we find that applying existing diffusion quantization methods designed for U-Net faces challenges in preserving quality. After analyzing the major challenges for quantizing diffusion transformers, we design an improved quantization scheme: "ViDiT-Q": Video and Image Diffusion Transformer Quantization) to address these issues. Furthermore, we identify highly sensitive layers and timesteps hinder quantization for lower bit-widths. To tackle this, we improve ViDiT-Q with a novel metric-decoupled mixed-precision quantization method (ViDiT-Q-MP). We validate the effectiveness of ViDiT-Q across a variety of text-to-image and video models. While baseline quantization methods fail at W8A8 and produce unreadable content at W4A8, ViDiT-Q achieves lossless W8A8 quantization. ViDiTQ-MP achieves W4A8 with negligible visual quality degradation, resulting in a 2.5x memory optimization and a 1.5x latency speedup.
Abstract:In recent decades, GNSS Radio Occultation soundings have proven an invaluable input to global weather forecasting. The success of government-sponsored programs such as COSMIC is now complemented by commercial low-cost cubesat implementations. The result is access to more than 10,000 soundings per day and improved weather forecasting accuracy. This movement towards commercialization has been supported by several agencies, including the National Aeronautics and Space Administration (NASA), National Oceanic and Atmospheric Administration (NOAA) and the U.S. Air Force (USAF) with programs such as the Commercial Weather Data Pilot (CWDP). This has resulted in further interest in commercially deploying GNSS-RO on complementary platforms. Here, we examine a so far underutilized platform: the high-altitude weather balloon. Such meteorological radiosondes are deployed twice daily at over 900 locations globally and form an essential in-situ data source as a long-standing input to weather forecasting models. Adding GNSS-RO capability to existing radiosonde platforms would greatly expand capability, allowing for persistent and local area monitoring, a feature particularly useful for hurricane and other severe weather monitoring. A prohibitive barrier to entry to this inclusion is cost and complexity as GNSS-RO traditionally requires highly specialized and sensitive equipment. This paper describes a multi-year effort to develop a low-cost and scalable approach to balloon GNSS-RO based on Commercial-Off-The-Shelf (COTS) GNSS receivers. We present hardware prototypes and data processing techniques which demonstrate the technical feasibility of the approach through results from several flight testing campaigns.