Giorgio Franceschelli

DiffSampling: Enhancing Diversity and Accuracy in Neural Text Generation

Feb 19, 2025

Giorgio Franceschelli, Mirco Musolesi

Abstract: Despite their increasing performance, large language models still tend to reproduce training data, generate several repetitions, and focus on the most common grammatical structures and words. A possible cause is the decoding strategy adopted: the most common ones either consider only the most probable tokens, reducing output diversity, or increase the likelihood of unlikely tokens at the cost of output accuracy and correctness. In this paper, we propose a family of three new decoding methods by leveraging a mathematical analysis of the token probability distribution. In particular, the difference between consecutive, sorted probabilities can be used to avoid incorrect tokens and increase the chance of low-probability but accurate words. Experiments concerning math problem solving, extreme summarization, and the divergent association task show that our approach consistently performs at least as well as current alternatives in terms of quality and diversity.
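
The idea of using differences between consecutive, sorted probabilities can be illustrated with a minimal sketch. It assumes one plausible reading of the abstract: truncate the distribution at the single steepest drop and renormalize what remains. The function names `truncate_at_largest_drop` and `sample` are hypothetical, and this is not the authors' implementation; the abstract does not specify how the three proposed methods differ.

```python
import random


def truncate_at_largest_drop(probs):
    """Keep the tokens ranked before the steepest drop in sorted probability.

    Assumption (not taken from the paper): a single cut at the most negative
    difference between consecutive sorted probabilities.
    """
    # Token indices ordered from most to least probable.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    if len(order) < 2:
        return {order[0]: 1.0} if order else {}
    sorted_p = [probs[i] for i in order]
    # Differences between consecutive sorted probabilities (each <= 0).
    diffs = [sorted_p[i + 1] - sorted_p[i] for i in range(len(sorted_p) - 1)]
    # Position of the steepest drop; tokens ranked after it are discarded.
    cut = min(range(len(diffs)), key=lambda i: diffs[i])
    kept = order[: cut + 1]
    # Renormalize the surviving probabilities so they sum to 1.
    total = sum(probs[i] for i in kept)
    return {i: probs[i] / total for i in kept}


def sample(probs, rng=random):
    """Draw one token index from the truncated, renormalized distribution."""
    dist = truncate_at_largest_drop(probs)
    tokens = list(dist)
    weights = [dist[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]
```

For example, with `probs = [0.05, 0.4, 0.35, 0.15, 0.05]` the steepest drop falls between 0.35 and 0.15, so only tokens 1 and 2 remain candidates and both keep a substantial chance of being sampled.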

Thinking Outside the (Gray) Box: A Context-Based Score for Assessing Value and Originality in Neural Text Generation

Feb 18, 2025

Giorgio Franceschelli, Mirco Musolesi

Abstract: Despite the increasing use of large language models for creative tasks, their outputs often lack diversity. Common solutions, such as sampling at higher temperatures, can compromise the quality of the results. Drawing on information theory, we propose a context-based score to quantitatively evaluate value and originality. This score incentivizes accuracy and adherence to the request while fostering divergence from the learned distribution. We propose using our score as a reward in a reinforcement learning framework to fine-tune large language models for maximum performance. We validate our strategy through experiments in poetry generation and math problem solving, demonstrating that it enhances the value and originality of the generated solutions.

Training Foundation Models as Data Compression: On Information, Model Weights and Copyright Law

Jul 18, 2024

Giorgio Franceschelli, Claudia Cevenini, Mirco Musolesi

Abstract: The training process of foundation models, as for other classes of deep learning systems, is based on minimizing the reconstruction error over a training set. For this reason, they are susceptible to the memorization and subsequent reproduction of training samples. In this paper, we introduce a training-as-compressing perspective, wherein the model's weights embody a compressed representation of the training data. From a copyright standpoint, this point of view implies that the weights could be considered a reproduction or a derivative work of a potentially protected set of works. We investigate the technical and legal challenges that emerge from this framing of the copyright of outputs generated by foundation models, including their implications for practitioners and researchers. We demonstrate that adopting an information-centric approach to the problem presents a promising pathway for tackling these emerging complex legal issues.

* Accepted for spotlight presentation at GenLaw'24, see https://www.genlaw.org/2024-icml-papers#training-foundation-models-as-data-compression-on-information-model-weights-and-copyright-law

Creative Beam Search: LLM-as-a-Judge For Improving Response Generation

May 09, 2024

Giorgio Franceschelli, Mirco Musolesi

Abstract: Large language models are revolutionizing several areas, including artificial creativity. However, the process of generation in machines profoundly diverges from that observed in humans. In particular, machine generation is characterized by a lack of intentionality and an underlying creative process. We propose a method called Creative Beam Search that uses Diverse Beam Search and LLM-as-a-Judge to perform response generation and response validation. The results of a qualitative experiment show how our approach can provide better output than standard sampling techniques. We also show that the response validation step is a necessary complement to the response generation step.

Creative Beam Search

Apr 30, 2024

Giorgio Franceschelli, Mirco Musolesi

Do Agents Dream of Electric Sheep?: Improving Generalization in Reinforcement Learning through Generative Learning

Mar 12, 2024

Giorgio Franceschelli, Mirco Musolesi

Abstract: The Overfitted Brain hypothesis suggests dreams happen to allow generalization in the human brain. Here, we ask if the same is true for reinforcement learning agents as well. Given limited experience in a real environment, we use imagination-based reinforcement learning to train a policy on dream-like episodes, where non-imaginative, predicted trajectories are modified through generative augmentations. Experiments on four ProcGen environments show that, compared to classic imagination and offline training on collected experience, our method can reach a higher level of generalization when dealing with sparsely rewarded environments.

Reinforcement Learning for Generative AI: State of the Art, Opportunities and Open Research Challenges

Jul 31, 2023

Giorgio Franceschelli, Mirco Musolesi

Abstract: Generative Artificial Intelligence (AI) is one of the most exciting developments in Computer Science of the last decade. At the same time, Reinforcement Learning (RL) has emerged as a very successful paradigm for a variety of machine learning tasks. In this survey, we discuss the state of the art, opportunities and open research questions in applying RL to generative AI. In particular, we will discuss three types of applications, namely, RL as an alternative way for generation without specified objectives; as a way for generating outputs while concurrently maximizing an objective function; and, finally, as a way of embedding desired characteristics, which cannot be easily captured by means of an objective function, into the generative process. We conclude the survey with an in-depth discussion of the opportunities and challenges in this fascinating emerging area.

On the Creativity of Large Language Models

Apr 07, 2023

Giorgio Franceschelli, Mirco Musolesi

Abstract: Large Language Models (LLMs) are revolutionizing several areas of Artificial Intelligence. One of the most remarkable applications is creative writing, e.g., poetry or storytelling: the generated outputs are often of astonishing quality. However, a natural question arises: can LLMs really be considered creative? In this article, we first analyze the development of LLMs under the lens of creativity theories, investigating the key open questions and challenges. Then, we discuss a set of "easy" and "hard" problems in machine creativity, presenting them in relation to LLMs. Finally, we examine the societal impact of these technologies with a particular focus on the creative industries.

DeepCreativity: Measuring Creativity with Deep Learning Techniques

Jan 16, 2022

Giorgio Franceschelli, Mirco Musolesi

Abstract: Measuring machine creativity is one of the most fascinating challenges in Artificial Intelligence. This paper explores the possibility of using generative learning techniques for automatic assessment of creativity. The proposed solution does not involve human judgement; it is modular and of general applicability. We introduce a new measure, namely DeepCreativity, based on Margaret Boden's definition of creativity as composed of value, novelty and surprise. We evaluate our methodology (and related measure) considering a case study, i.e., the generation of 19th century American poetry, showing its effectiveness and expressiveness.

* 12 pages, 2 figures

Copyright in Generative Deep Learning

May 19, 2021

Giorgio Franceschelli, Mirco Musolesi

Abstract: Machine-generated artworks are now part of the contemporary art scene: they are attracting significant investments and they are presented in exhibitions together with those created by human artists. These artworks are mainly based on generative deep learning techniques. Given their success, several legal problems arise when working with these techniques. In this article we consider a set of key questions in the area of generative deep learning for the arts. Is it possible to use copyrighted works as a training set for generative models? How do we legally store their copies in order to perform the training process? And then, who (if anyone) will own the copyright on the generated data? We try to answer these questions considering the law in force in both the US and the EU and the future alternatives, trying to define a set of guidelines for artists and developers working on deep learning generated art.

* 11 pages
