Abstract:The capability to generate diverse text is a key challenge facing large language models (LLMs). Thus far, diversity has been studied via metrics such as $n$-gram diversity or diversity of BERT embeddings. However, for these kinds of diversity, the user has little control over the dimensions along which diversity is considered. For example, in the poetry domain, one might desire diversity in terms of rhyme and meter, whereas in the code domain, one might desire diversity in terms of the kinds of expressions used to solve a problem. We propose a diversity metric called structural diversity, where the user provides a mapping from generated text to features capturing the kinds of diversity that they care about. In addition, we propose a novel strategy called chain-of-specification (CoS) prompting for improving diversity by first having the LLM generate a specification encoding one instance of structural features, and then prompting the LLM to generate text that satisfies these features; notably, our strategy works with blackbox LLMs. In our experiments, we show that for structural diversity in the poetry and code domains, CoS significantly improves diversity compared to several baselines.
Abstract:This paper delves into the intersection of computational theory and music, examining the concept of undecidability and its significant, yet overlooked, implications within the realm of modern music composition and production. It posits that undecidability, a principle traditionally associated with theoretical computer science, extends its relevance to the music industry. The study adopts a multidimensional approach, focusing on five key areas: (1) the Turing completeness of Ableton, a widely used digital audio workstation, (2) the undecidability of satisfiability in sound creation utilizing an array of effects, (3) the undecidability of constraints on polymeters in musical compositions, (4) the undecidability of satisfiability in just intonation harmony constraints, and (5) the undecidability of "new ordering systems". In addition to providing theoretical proof for these assertions, the paper elucidates the practical relevance of these concepts for practitioners outside the field of theoretical computer science. The ultimate aim is to foster a new understanding of undecidability in music, highlighting its broader applicability and potential to influence contemporary computer-assisted (and traditional) music making.
Abstract:Significant strides have been made toward designing better generative models in recent years. Despite this progress, however, state-of-the-art approaches are still largely unable to capture complex global structure in data. For example, images of buildings typically contain spatial patterns such as windows repeating at regular intervals; state-of-the-art generative methods can't easily reproduce these structures. We propose to address this problem by incorporating programs representing global structure into the generative model---e.g., a 2D for-loop may represent a configuration of windows. Furthermore, we propose a framework for learning these models by leveraging program synthesis to generate training data. On both synthetic and real-world data, we demonstrate that our approach is substantially better than the state-of-the-art at both generating and completing images that contain global structure.