Max Peeperkorn

Mind the Gap: Conformative Decoding to Improve Output Diversity of Instruction-Tuned Large Language Models

Jul 28, 2025

Max Peeperkorn, Tom Kouwenhoven, Dan Brown, Anna Jordanous

Abstract: Instruction-tuning large language models (LLMs) reduces the diversity of their outputs, which has implications for many tasks, particularly for creative tasks. This paper investigates the "diversity gap" for a writing prompt narrative generation task. This gap emerges as measured by current diversity metrics for various open-weight and open-source LLMs. The results show significant decreases in diversity due to instruction-tuning. We explore the diversity loss at each fine-tuning stage for the OLMo and OLMo 2 models to further understand how output diversity is affected. The results indicate that DPO has the most substantial impact on diversity. Motivated by these findings, we present a new decoding strategy, conformative decoding, which guides an instruct model using its more diverse base model to reintroduce output diversity. We show that conformative decoding typically increases diversity and even maintains or improves quality.

* 9 pages, 3 figures

Shaping Shared Languages: Human and Large Language Models' Inductive Biases in Emergent Communication

Mar 06, 2025

Tom Kouwenhoven, Max Peeperkorn, Roy de Kleijn, Tessa Verhoef

Abstract: Languages are shaped by the inductive biases of their users. Using a classical referential game, we investigate how artificial languages evolve when optimised for inductive biases in humans and large language models (LLMs) via Human-Human, LLM-LLM and Human-LLM experiments. We show that referentially grounded vocabularies emerge that enable reliable communication in all conditions, even when humans and LLMs collaborate. Comparisons between conditions reveal that languages optimised for LLMs subtly differ from those optimised for humans. Interestingly, interactions between humans and LLMs alleviate these differences and result in vocabularies which are more human-like than LLM-like. These findings advance our understanding of how inductive biases in LLMs play a role in the dynamic nature of human language and contribute to maintaining alignment in human and machine communication. In particular, our work underscores the need to think of new methods that include human interaction in the training processes of LLMs, and shows that using communicative success as a reward signal can be a fruitful, novel direction.

Searching for Structure: Investigating Emergent Communication with Large Language Models

Dec 10, 2024

Tom Kouwenhoven, Max Peeperkorn, Tessa Verhoef

Abstract: Human languages have evolved to be structured through repeated language learning and use. These processes introduce biases that operate during language acquisition and shape linguistic systems toward communicative efficiency. In this paper, we investigate whether the same happens if artificial languages are optimised for implicit biases of Large Language Models (LLMs). To this end, we simulate a classical referential game in which LLMs learn and use artificial languages. Our results show that initially unstructured holistic languages are indeed shaped to have some structural properties that allow two LLM agents to communicate successfully. Similar to observations in human experiments, generational transmission increases the learnability of languages, but can at the same time result in non-humanlike degenerate vocabularies. Taken together, this work extends experimental findings, shows that LLMs can be used as tools in simulations of language evolution, and opens possibilities for future human-machine experiments in this field.

The Curious Case of Representational Alignment: Unravelling Visio-Linguistic Tasks in Emergent Communication

Jul 25, 2024

Tom Kouwenhoven, Max Peeperkorn, Bram van Dijk, Tessa Verhoef

Abstract: Natural language has the universal properties of being compositional and grounded in reality. The emergence of linguistic properties is often investigated through simulations of emergent communication in referential games. However, these experiments have yielded mixed results compared to similar experiments addressing linguistic properties of human language. Here we address representational alignment as a potential contributing factor to these results. Specifically, we assess the representational alignment between agent image representations and between agent representations and input images. Doing so, we confirm that the emergent language does not appear to encode human-like conceptual visual features, since agent image representations drift away from inputs whilst inter-agent alignment increases. We moreover identify a strong relationship between inter-agent alignment and topographic similarity, a common metric for compositionality, and address its consequences. To address these issues, we introduce an alignment penalty that prevents representational drift but interestingly does not improve performance on a compositional discrimination task. Together, our findings emphasise the key role representational alignment plays in simulations of language emergence.

* Appeared at the 13th edition of the Workshop on Cognitive Modeling and Computational Linguistics (CMCL 2024)

Is Temperature the Creativity Parameter of Large Language Models?

May 01, 2024

Max Peeperkorn, Tom Kouwenhoven, Dan Brown, Anna Jordanous

Abstract: Large language models (LLMs) are applied to all sorts of creative tasks, and their outputs vary from beautiful, to peculiar, to pastiche, into plain plagiarism. The temperature parameter of an LLM regulates the amount of randomness, leading to more diverse outputs; therefore, it is often claimed to be the creativity parameter. Here, we investigate this claim using a narrative generation task with a predetermined fixed context, model and prompt. Specifically, we present an empirical analysis of the LLM output for different temperature values using four necessary conditions for creativity in narrative generation: novelty, typicality, cohesion, and coherence. We find that temperature is weakly correlated with novelty, and unsurprisingly, moderately correlated with incoherence, but there is no relationship with either cohesion or typicality. However, the influence of temperature on creativity is far more nuanced and weak than suggested by the "creativity parameter" claim; overall results suggest that the LLM generates slightly more novel outputs as temperatures get higher. Finally, we discuss ideas to allow more controlled LLM creativity, rather than relying on chance via changing the temperature parameter.

* To be published in the Proceedings of the 15th International Conference on Computational Creativity (ICCC'24), 8 pages, 2 figures, 2 tables

Bits of Grass: Does GPT already know how to write like Whitman?

May 10, 2023

Piotr Sawicki, Marek Grzes, Fabricio Goes, Dan Brown, Max Peeperkorn, Aisha Khatun

Abstract: This study examines the ability of GPT-3.5, GPT-3.5-turbo (ChatGPT) and GPT-4 models to generate poems in the style of specific authors using zero-shot and many-shot prompts (which use the maximum context length of 8192 tokens). We assess the performance of models that are not fine-tuned for generating poetry in the style of specific authors, via automated evaluation. Our findings indicate that without fine-tuning, even when provided with the maximum number of 17 poem examples (8192 tokens) in the prompt, these models do not generate poetry in the desired style.

* Short paper, 5 pages
