Abstract: As large language models (LLMs) become integral to intelligent user interfaces (IUIs), their role as decision-making agents raises critical concerns about alignment. Although extensive research has addressed issues such as factuality, bias, and toxicity, comparatively little attention has been paid to measuring alignment with preferences, i.e., the relative desirability of different alternatives, a concept used in decision making, economics, and social choice theory. Yet a reliable decision-making agent makes choices that align well with user preferences. In this paper, we generalize existing methods that use LLMs to rank alternative outcomes by addressing alignment with the broader and more flexible concept of user preferences, which includes both strict preferences and indifference among alternatives. To this end, we put forward design principles for using LLMs to implement rational choice functions and provide the necessary tools to measure preference satisfaction. We demonstrate the applicability of our approach through an empirical study of a practical IUI application in the automotive domain.
Abstract: This study is devoted to the automatic generation of German drama texts. We propose an approach consisting of two key steps: fine-tuning a GPT-2 model (the outline model) to generate scene outlines from keywords, and fine-tuning a second model (the generation model) to generate scenes from those outlines. The input for the neural model comprises two datasets: the German Drama Corpus (GerDraCor) and the German Text Archive (Deutsches Textarchiv, or DTA). To estimate the effectiveness of the proposed method, we compare our models with baseline GPT-2 models. Our models perform well according to automatic quantitative evaluation; manual qualitative analysis, however, reveals that the generated texts are of poor quality. This may be due to the quality of the dataset or training inputs.