Marc Dymetman

Xerox Research Centre Europe, Grenoble

FaST: Feature-aware Sampling and Tuning for Personalized Preference Alignment with Limited Data

Aug 06, 2025

Thibaut Thonet, Germán Kruszewski, Jos Rozen, Pierre Erbacher, Marc Dymetman

Abstract: LLM-powered conversational assistants are often deployed in a one-size-fits-all manner, which fails to accommodate individual user preferences. Recently, LLM personalization -- tailoring models to align with specific user preferences -- has gained increasing attention as a way to bridge this gap. In this work, we specifically focus on a practical yet challenging setting where only a small set of preference annotations can be collected per user -- a problem we define as Personalized Preference Alignment with Limited Data (PPALLI). To support research in this area, we introduce two datasets -- DnD and ELIP -- and benchmark a variety of alignment techniques on them. We further propose FaST, a highly parameter-efficient approach that leverages high-level features automatically discovered from the data, achieving the best overall performance.

Guaranteed Generation from Large Language Models

Oct 09, 2024

Minbeom Kim, Thibaut Thonet, Jos Rozen, Hwaran Lee, Kyomin Jung, Marc Dymetman

Abstract: As large language models (LLMs) are increasingly used across various applications, there is a growing need to control text generation to satisfy specific constraints or requirements. This raises a crucial question: Is it possible to guarantee strict constraint satisfaction in generated outputs while preserving the distribution of the original model as much as possible? We first define the ideal distribution - the one closest to the original model, which also always satisfies the expressed constraint - as the ultimate goal of guaranteed generation. We then state a fundamental limitation, namely that it is impossible to reach that goal through autoregressive training alone. This motivates the necessity of combining training-time and inference-time methods to enforce such guarantees. Based on this insight, we propose GUARD, a simple yet effective approach that combines an autoregressive proposal distribution with rejection sampling. Through GUARD's theoretical properties, we show how controlling the KL divergence between a specific proposal and the target ideal distribution simultaneously optimizes inference speed and distributional closeness. To validate these theoretical concepts, we conduct extensive experiments on two text generation settings with hard-to-satisfy constraints: a lexical constraint scenario and a sentiment reversal scenario. These experiments show that GUARD achieves perfect constraint satisfaction while almost preserving the ideal distribution with highly improved inference efficiency. GUARD provides a principled approach to enforcing strict guarantees for LLMs without compromising their generative capabilities.

* 22 pages, 11 figures
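
The idea of pairing a proposal distribution with rejection sampling can be illustrated on a toy problem. The sketch below is not the GUARD implementation: the four-string "model" `BASE`, the `constraint` function, and all helper names are hypothetical, and the base model itself is used as the proposal (GUARD instead trains a dedicated proposal to raise the acceptance rate). Under these assumptions, accepted samples follow the base distribution renormalized over the strings that satisfy the constraint.

```python
import random
from collections import Counter

# Hypothetical toy "model": a categorical distribution over four strings.
BASE = {
    "good movie": 0.50,
    "bad movie": 0.30,
    "amazing film": 0.15,
    "amazing movie": 0.05,
}

def constraint(text):
    """Binary constraint b(x): the text must contain the word 'amazing'."""
    return "amazing" in text.split()

def sample_base(rng):
    """Draw one string from the toy base distribution."""
    texts = list(BASE)
    weights = [BASE[t] for t in texts]
    return rng.choices(texts, weights=weights, k=1)[0]

def sample_constrained(rng, max_tries=10_000):
    """Draw from the proposal until the constraint holds.

    Returns the accepted sample and the number of draws it took.
    """
    for tries in range(1, max_tries + 1):
        x = sample_base(rng)
        if constraint(x):
            return x, tries
    raise RuntimeError("no sample satisfied the constraint")

def ideal_distribution():
    """Base distribution restricted to constraint-satisfying strings, renormalized."""
    z = sum(p for t, p in BASE.items() if constraint(t))
    return {t: p / z for t, p in BASE.items() if constraint(t)}

rng = random.Random(0)
draws = [sample_constrained(rng) for _ in range(5_000)]
counts = Counter(x for x, _ in draws)
empirical = {t: c / len(draws) for t, c in counts.items()}
acceptance_rate = len(draws) / sum(n for _, n in draws)

print("ideal:    ", ideal_distribution())
print("empirical:", empirical)
print("acceptance rate:", acceptance_rate)
```

Every accepted sample satisfies the constraint by construction, but with the base model as proposal only about one draw in five is accepted in this toy setting, which is the inference cost that a better proposal is meant to reduce.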

Compositional preference models for aligning LMs

Oct 17, 2023

Dongyoung Go, Tomasz Korbak, Germán Kruszewski, Jos Rozen, Marc Dymetman

Abstract: As language models (LMs) become more capable, it is increasingly important to align them with human preferences. However, the dominant paradigm for training Preference Models (PMs) for that purpose suffers from fundamental limitations, such as lack of transparency and scalability, along with susceptibility to overfitting the preference dataset. We propose Compositional Preference Models (CPMs), a novel PM framework that decomposes one global preference assessment into several interpretable features, obtains scalar scores for these features from a prompted LM, and aggregates these scores using a logistic regression classifier. CPMs make it possible to control which properties of the preference data are used to train the preference model and to build it based on features that are believed to underlie the human preference judgment. Our experiments show that CPMs not only improve generalization and are more robust to overoptimization than standard PMs, but also that best-of-n samples obtained using CPMs tend to be preferred over samples obtained using conventional PMs. Overall, our approach demonstrates the benefits of endowing PMs with priors about which features determine human preferences while relying on LM capabilities to extract those features in a scalable and robust way.

Should you marginalize over possible tokenizations?

Jun 30, 2023

Nadezhda Chirkova, Germán Kruszewski, Jos Rozen, Marc Dymetman

Abstract: Autoregressive language models (LMs) map token sequences to probabilities. The usual practice for computing the probability of any character string (e.g. English sentences) is to first transform it into a sequence of tokens that is scored by the model. However, there are exponentially many token sequences that represent any given string. To truly compute the probability of a string one should marginalize over all tokenizations, which is typically intractable. Here, we analyze whether the practice of ignoring the marginalization is justified. To this end, we devise an importance-sampling-based algorithm that allows us to compute estimates of the marginal probabilities and compare them to the default procedure in a range of state-of-the-art models and datasets. Our results show that the gap in log-likelihood is no larger than 0.5% in most cases, but that it becomes more pronounced for data with long complex words.

* Accepted to ACL 2023
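
The gap between the default single-tokenization score and the true marginal can be made concrete on a toy example. The sketch below is an illustration only, not the paper's importance-sampling algorithm: it assumes a hypothetical unigram token model `TOKEN_PROBS`, uses greedy longest-match as a stand-in for the default tokenizer, and enumerates all tokenizations exhaustively, which is feasible only because the string is tiny.

```python
import math

# Hypothetical toy unigram token model; "<eos>" ends every sequence.
TOKEN_PROBS = {"a": 0.2, "b": 0.2, "ab": 0.3, "abb": 0.1, "<eos>": 0.2}

def sequence_prob(tokens):
    """Probability of one token sequence followed by end-of-sequence."""
    p = TOKEN_PROBS["<eos>"]
    for t in tokens:
        p *= TOKEN_PROBS[t]
    return p

def tokenizations(text):
    """Enumerate every token sequence that spells out `text` exactly."""
    if not text:
        return [[]]
    result = []
    for tok in TOKEN_PROBS:
        if tok != "<eos>" and text.startswith(tok):
            result += [[tok] + rest for rest in tokenizations(text[len(tok):])]
    return result

def greedy_tokenize(text):
    """Stand-in for a default tokenizer: always take the longest matching token."""
    tokens = []
    while text:
        tok = max(
            (t for t in TOKEN_PROBS if t != "<eos>" and text.startswith(t)),
            key=len,
        )
        tokens.append(tok)
        text = text[len(tok):]
    return tokens

text = "abb"
default_prob = sequence_prob(greedy_tokenize(text))
marginal_prob = sum(sequence_prob(seq) for seq in tokenizations(text))

print("tokenizations:", tokenizations(text))
print("default log-prob: ", math.log(default_prob))
print("marginal log-prob:", math.log(marginal_prob))
```

The default score is always a lower bound on the marginal, since it is one term of the sum. The number of tokenizations grows exponentially with string length, which is why the paper resorts to sampling-based estimates rather than enumeration.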

disco: a toolkit for Distributional Control of Generative Models

Mar 08, 2023

Germán Kruszewski, Jos Rozen, Marc Dymetman

Abstract: Pre-trained language models and other generative models have revolutionized NLP and beyond. However, these models tend to reproduce undesirable biases present in their training data. Also, they may overlook patterns that are important but challenging to capture. To address these limitations, researchers have introduced distributional control techniques. These techniques, not limited to language, allow controlling the prevalence (i.e., expectations) of any features of interest in the model's outputs. Despite their potential, the widespread adoption of these techniques has been hindered by the difficulty in adapting complex, disconnected code. Here, we present disco, an open-source Python library that brings these techniques to the broader public.

Aligning Language Models with Preferences through f-divergence Minimization

Feb 16, 2023

Dongyoung Go, Tomasz Korbak, Germán Kruszewski, Jos Rozen, Nahyeon Ryu, Marc Dymetman

Abstract: Aligning language models with preferences can be posed as approximating a target distribution representing some desired behavior. Existing approaches differ both in the functional form of the target distribution and the algorithm used to approximate it. For instance, Reinforcement Learning from Human Feedback (RLHF) corresponds to minimizing a reverse KL from an implicit target distribution arising from a KL penalty in the objective. On the other hand, Generative Distributional Control (GDC) has an explicit target distribution and minimizes a forward KL from it using the Distributional Policy Gradient (DPG) algorithm. In this paper, we propose a new approach, f-DPG, which allows the use of any f-divergence to approximate any target distribution. f-DPG unifies both frameworks (RLHF, GDC) and the approximation methods (DPG, RL with KL penalties). We show the practical benefits of various choices of divergence objectives and demonstrate that there is no universally optimal objective but that different divergences are good for approximating different targets. For instance, we discover that for GDC, the Jensen-Shannon divergence frequently outperforms forward KL divergence by a wide margin, leading to significant improvements over prior work.

On Reinforcement Learning and Distribution Matching for Fine-Tuning Language Models with no Catastrophic Forgetting

Jun 01, 2022

Tomasz Korbak, Hady Elsahar, Germán Kruszewski, Marc Dymetman

Abstract: The availability of large pre-trained models is changing the landscape of Machine Learning research and practice, moving from a training-from-scratch to a fine-tuning paradigm. While in some applications the goal is to "nudge" the pre-trained distribution towards preferred outputs, in others it is to steer it towards a different distribution over the sample space. Two main paradigms have emerged to tackle this challenge: Reward Maximization (RM) and, more recently, Distribution Matching (DM). RM applies standard Reinforcement Learning (RL) techniques, such as Policy Gradients, to gradually increase the reward signal. DM prescribes first making explicit the target distribution that the model is fine-tuned to approximate. Here we explore the theoretical connections between the two paradigms, and show that methods such as KL-control developed for RM can also be construed as belonging to DM. We further observe that while DM differs from RM, it can suffer from similar training difficulties, such as high gradient variance. We leverage connections between the two paradigms to import the concept of baseline into DM methods. We empirically validate the benefits of adding a baseline on an array of controllable language generation tasks such as constraining topic, sentiment, and gender distributions in texts sampled from a language model. We observe superior performance in terms of constraint satisfaction, stability and sample efficiency.

Sampling from Discrete Energy-Based Models with Quality/Efficiency Trade-offs

Dec 10, 2021

Bryan Eikema, Germán Kruszewski, Hady Elsahar, Marc Dymetman

Abstract: Energy-Based Models (EBMs) allow for extremely flexible specifications of probability distributions. However, they do not provide a mechanism for obtaining exact samples from these distributions. Monte Carlo techniques can aid us in obtaining samples if some proposal distribution that we can easily sample from is available. For instance, rejection sampling can provide exact samples but is often difficult or impossible to apply due to the need to find a proposal distribution that upper-bounds the target distribution everywhere. Approximate Markov chain Monte Carlo sampling techniques like Metropolis-Hastings are usually easier to design, exploiting a local proposal distribution that performs local edits on an evolving sample. However, these techniques can be inefficient due to the local nature of the proposal distribution and do not provide an estimate of the quality of their samples. In this work, we propose a new approximate sampling technique, Quasi Rejection Sampling (QRS), that allows for a trade-off between sampling efficiency and sampling quality, while providing explicit convergence bounds and diagnostics. QRS capitalizes on the availability of high-quality global proposal distributions obtained from deep learning models. We demonstrate the effectiveness of QRS sampling for discrete EBMs over text for the tasks of controlled text generation with distributional constraints and paraphrase generation. We show that we can sample from such EBMs with arbitrary precision at the cost of sampling efficiency.
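
The quality/efficiency trade-off can be seen on a small discrete example. The sketch below is a hedged illustration rather than the paper's code: it assumes the acceptance rule "draw x from the proposal q and accept with probability min(1, P(x) / (beta * q(x)))", where P is an unnormalized target and beta is a tunable parameter, and it uses a hypothetical four-outcome target and uniform proposal. Because the space is tiny, the output distribution and acceptance rate are computed exactly instead of by sampling.

```python
# Hypothetical toy problem: unnormalized target P and normalized proposal q.
P = {"w": 8.0, "x": 4.0, "y": 2.0, "z": 2.0}
q = {"w": 0.25, "x": 0.25, "y": 0.25, "z": 0.25}

def qrs_distribution(beta):
    """Exact output distribution and acceptance rate of the accept/reject rule.

    Each outcome x is proposed with probability q[x] and then accepted with
    probability min(1, P[x] / (beta * q[x])).
    """
    accepted_mass = {x: q[x] * min(1.0, P[x] / (beta * q[x])) for x in P}
    acceptance_rate = sum(accepted_mass.values())
    dist = {x: m / acceptance_rate for x, m in accepted_mass.items()}
    return dist, acceptance_rate

def total_variation(p1, p2):
    """Total variation distance between two distributions on the same support."""
    return 0.5 * sum(abs(p1[x] - p2[x]) for x in p1)

Z = sum(P.values())
target = {x: v / Z for x, v in P.items()}

for beta in (1.0, 8.0, 16.0, 32.0):
    dist, rate = qrs_distribution(beta)
    print(f"beta={beta:5.1f}  TVD={total_variation(dist, target):.4f}  acceptance={rate:.4f}")
```

In this toy setting, a small beta accepts every proposal and simply returns the proposal distribution, while a beta at or above the largest ratio P(x)/q(x) recovers the target exactly, as in standard rejection sampling, at the price of a lower acceptance rate.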

Controlling Conditional Language Models with Distributional Policy Gradients

Dec 01, 2021

Tomasz Korbak, Hady Elsahar, Germán Kruszewski, Marc Dymetman

Abstract: Machine learning is shifting towards general-purpose pretrained generative models, trained in a self-supervised manner on large amounts of data, which can then be applied to solve a large number of tasks. However, due to their generic training methodology, these models often fail to meet some of the downstream requirements (e.g. hallucination in abstractive summarization or wrong format in automatic code generation). This raises an important question on how to adapt pre-trained generative models to a new task without destroying their capabilities. Recent work has suggested solving this problem by representing task-specific requirements through energy-based models (EBMs) and approximating these EBMs using distributional policy gradients (DPG). Unfortunately, this approach is limited to unconditional distributions, represented by unconditional EBMs. In this paper, we extend this approach to conditional tasks by proposing Conditional DPG (CDPG). We evaluate CDPG on three different control objectives across two tasks: summarization with T5 and code generation with GPT-Neo. Our results show that fine-tuning using CDPG robustly moves these pretrained models closer towards meeting control objectives and -- in contrast with baseline approaches -- does not result in catastrophic forgetting.

* CtrlGen: Controllable Generative Modeling in Language and Vision Workshop at NeurIPS 2021

Energy-Based Models for Code Generation under Compilability Constraints

Jun 09, 2021

Tomasz Korbak, Hady Elsahar, Marc Dymetman, Germán Kruszewski

Abstract: Neural language models can be successfully trained on source code, leading to applications such as code completion. However, their versatile autoregressive self-supervision objective overlooks important global sequence-level features that are present in the data such as syntactic correctness or compilability. In this work, we pose the problem of learning to generate compilable code as constraint satisfaction. We define an Energy-Based Model (EBM) representing a pre-trained generative model with an imposed constraint of generating only compilable sequences. We then use the KL-Adaptive Distributional Policy Gradient algorithm (Khalifa et al., 2021) to train a generative model approximating the EBM. We conduct experiments showing that our proposed approach is able to improve compilability rates without sacrificing diversity and complexity of the generated samples.

* Accepted for the First Workshop on Natural Language Processing for Programming, ACL 2021
