Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Ahmed Khalifa

Evolutionary Level Repair

Jun 24, 2025

Debosmita Bhaumik, Julian Togelius, Georgios N. Yannakakis, Ahmed Khalifa

Abstract:We address the problem of game level repair, which consists of taking a designed but non-functional game level and making it functional. This might consist of ensuring the completeness of the level, reachability of objects, or other performance characteristics. The repair problem may also be constrained in that it can only make a small number of changes to the level. We investigate search-based solutions to the level repair problem, particularly using evolutionary and quality-diversity algorithms, with good results. This level repair method is applied to levels generated using a machine learning-based procedural content generation (PCGML) method that generates stylistically appropriate but frequently broken levels. This combination of PCGML for generation and search-based methods for repair shows great promise as a hybrid procedural content generation (PCG) method.

Via

Access Paper or Ask Questions

ScriptDoctor: Automatic Generation of PuzzleScript Games via Large Language Models and Tree Search

Jun 06, 2025

Sam Earle, Ahmed Khalifa, Muhammad Umair Nasir, Zehua Jiang, Graham Todd, Andrzej Banburski-Fahey, Julian Togelius

Abstract:There is much interest in using large pre-trained models in Automatic Game Design (AGD), whether via the generation of code, assets, or more abstract conceptualization of design ideas. But so far this interest largely stems from the ad hoc use of such generative models under persistent human supervision. Much work remains to show how these tools can be integrated into longer-time-horizon AGD pipelines, in which systems interface with game engines to test generated content autonomously. To this end, we introduce ScriptDoctor, a Large Language Model (LLM)-driven system for automatically generating and testing games in PuzzleScript, an expressive but highly constrained description language for turn-based puzzle games over 2D gridworlds. ScriptDoctor generates and tests game design ideas in an iterative loop, where human-authored examples are used to ground the system's output, compilation errors from the PuzzleScript engine are used to elicit functional code, and search-based agents play-test generated games. ScriptDoctor serves as a concrete example of the potential of automated, open-ended LLM-based workflows in generating novel game content.

* 5 pages, 3 figures, 3 tables, submitted to IEEE Conference on Games as a Short Paper

Via

Access Paper or Ask Questions

The Procedural Content Generation Benchmark: An Open-source Testbed for Generative Challenges in Games

Mar 27, 2025

Ahmed Khalifa, Roberto Gallotta, Matthew Barthet, Antonios Liapis, Julian Togelius, Georgios N. Yannakakis

Abstract:This paper introduces the Procedural Content Generation Benchmark for evaluating generative algorithms on different game content creation tasks. The benchmark comes with 12 game-related problems with multiple variants on each problem. Problems vary from creating levels of different kinds to creating rule sets for simple arcade games. Each problem has its own content representation, control parameters, and evaluation metrics for quality, diversity, and controllability. This benchmark is intended as a first step towards a standardized way of comparing generative algorithms. We use the benchmark to score three baseline algorithms: a random generator, an evolution strategy, and a genetic algorithm. Results show that some problems are easier to solve than others, as well as the impact the chosen objective has on quality, diversity, and controllability of the generated artifacts.

* 12 pages, 4 figures, 2 tables, published at FDG2025

Via

Access Paper or Ask Questions

Affectively Framework: Towards Human-like Affect-Based Agents

Jul 25, 2024

Matthew Barthet, Roberto Gallotta, Ahmed Khalifa, Antonios Liapis, Georgios N. Yannakakis

Figure 1 for Affectively Framework: Towards Human-like Affect-Based Agents

Figure 2 for Affectively Framework: Towards Human-like Affect-Based Agents

Figure 3 for Affectively Framework: Towards Human-like Affect-Based Agents

Figure 4 for Affectively Framework: Towards Human-like Affect-Based Agents

Abstract:Game environments offer a unique opportunity for training virtual agents due to their interactive nature, which provides diverse play traces and affect labels. Despite their potential, no reinforcement learning framework incorporates human affect models as part of their observation space or reward mechanism. To address this, we present the \emph{Affectively Framework}, a set of Open-AI Gym environments that integrate affect as part of the observation space. This paper introduces the framework and its three game environments and provides baseline experiments to validate its effectiveness and potential.

* 5 pages, 2 figures, 2 tables

Via

Access Paper or Ask Questions

Evolutionary Machine Learning and Games

Nov 20, 2023

Julian Togelius, Ahmed Khalifa, Sam Earle, Michael Cerny Green, Lisa Soros

Figure 1 for Evolutionary Machine Learning and Games

Figure 2 for Evolutionary Machine Learning and Games

Figure 3 for Evolutionary Machine Learning and Games

Figure 4 for Evolutionary Machine Learning and Games

Abstract:Evolutionary machine learning (EML) has been applied to games in multiple ways, and for multiple different purposes. Importantly, AI research in games is not only about playing games; it is also about generating game content, modeling players, and many other applications. Many of these applications pose interesting problems for EML. We will structure this chapter on EML for games based on whether evolution is used to augment machine learning (ML) or ML is used to augment evolution. For completeness, we also briefly discuss the usage of ML and evolution separately in games.

* 27 pages, 5 figures, part of Evolutionary Machine Learning Book (https://link.springer.com/book/10.1007/978-981-99-3814-8)

Via

Access Paper or Ask Questions

A Preliminary Study on a Conceptual Game Feature Generation and Recommendation System

Aug 16, 2023

M Charity, Yash Bhartia, Daniel Zhang, Ahmed Khalifa, Julian Togelius

Figure 1 for A Preliminary Study on a Conceptual Game Feature Generation and Recommendation System

Figure 2 for A Preliminary Study on a Conceptual Game Feature Generation and Recommendation System

Figure 3 for A Preliminary Study on a Conceptual Game Feature Generation and Recommendation System

Abstract:This paper introduces a system used to generate game feature suggestions based on a text prompt. Trained on the game descriptions of almost 60k games, it uses the word embeddings of a small GLoVe model to extract features and entities found in thematically similar games which are then passed through a generator model to generate new features for a user's prompt. We perform a short user study comparing the features generated from a fine-tuned GPT-2 model, a model using the ConceptNet, and human-authored game features. Although human suggestions won the overall majority of votes, the GPT-2 model outperformed the human suggestions in certain games. This system is part of a larger game design assistant tool that is able to collaborate with users at a conceptual level.

Via

Access Paper or Ask Questions

Lode Enhancer: Level Co-creation Through Scaling

Aug 03, 2023

Debosmita Bhaumik, Julian Togelius, Georgios N. Yannakakis, Ahmed Khalifa

Abstract:We explore AI-powered upscaling as a design assistance tool in the context of creating 2D game levels. Deep neural networks are used to upscale artificially downscaled patches of levels from the puzzle platformer game Lode Runner. The trained networks are incorporated into a web-based editor, where the user can create and edit levels at three different levels of resolution: 4x4, 8x8, and 16x16. An edit at any resolution instantly transfers to the other resolutions. As upscaling requires inventing features that might not be present at lower resolutions, we train neural networks to reproduce these features. We introduce a neural network architecture that is capable of not only learning upscaling but also giving higher priority to less frequent tiles. To investigate the potential of this tool and guide further development, we conduct a qualitative study with 3 designers to understand how they use it. Designers enjoyed co-designing with the tool, liked its underlying concept, and provided feedback for further improvement.

Via

Access Paper or Ask Questions

Lode Encoder: AI-constrained co-creativity

Aug 02, 2023

Debosmita Bhaumik, Ahmed Khalifa, Julian Togelius

Abstract:We present Lode Encoder, a gamified mixed-initiative level creation system for the classic platform-puzzle game Lode Runner. The system is built around several autoencoders which are trained on sets of Lode Runner levels. When fed with the user's design, each autoencoder produces a version of that design which is closer in style to the levels that it was trained on. The Lode Encoder interface allows the user to build and edit levels through 'painting' from the suggestions provided by the autoencoders. Crucially, in order to encourage designers to explore new possibilities, the system does not include more traditional editing tools. We report on the system design and training procedure, as well as on the evolution of the system itself and user tests.

* 2021 IEEE Conference on Games (CoG), Copenhagen, Denmark, 2021, pp. 01-08

Via

Access Paper or Ask Questions

Controllable Path of Destruction

May 31, 2023

Matthew Siper, Sam Earle, Zehua Jiang, Ahmed Khalifa, Julian Togelius

Abstract:Path of Destruction (PoD) is a self-supervised method for learning iterative generators. The core idea is to produce a training set by destroying a set of artifacts, and for each destructive step create a training instance based on the corresponding repair action. A generator trained on this dataset can then generate new artifacts by repairing from arbitrary states. The PoD method is very data-efficient in terms of original training examples and well-suited to functional artifacts composed of categorical data, such as game levels and discrete 3D structures. In this paper, we extend the Path of Destruction method to allow designer control over aspects of the generated artifacts. Controllability is introduced by adding conditional inputs to the state-action pairs that make up the repair trajectories. We test the controllable PoD method in a 2D dungeon setting, as well as in the domain of small 3D Lego cars.

* 8 pages, 6 figures, and 2 tables. Published at CoG Conference 2023

Via

Access Paper or Ask Questions

Play with Emotion: Affect-Driven Reinforcement Learning

Aug 26, 2022

Matthew Barthet, Ahmed Khalifa, Antonios Liapis, Georgios N. Yannakakis

Figure 1 for Play with Emotion: Affect-Driven Reinforcement Learning

Figure 2 for Play with Emotion: Affect-Driven Reinforcement Learning

Figure 3 for Play with Emotion: Affect-Driven Reinforcement Learning

Figure 4 for Play with Emotion: Affect-Driven Reinforcement Learning

Abstract:This paper introduces a paradigm shift by viewing the task of affect modeling as a reinforcement learning (RL) process. According to the proposed paradigm, RL agents learn a policy (i.e. affective interaction) by attempting to maximize a set of rewards (i.e. behavioral and affective patterns) via their experience with their environment (i.e. context). Our hypothesis is that RL is an effective paradigm for interweaving affect elicitation and manifestation with behavioral and affective demonstrations. Importantly, our second hypothesis-building on Damasio's somatic marker hypothesis-is that emotion can be the facilitator of decision-making. We test our hypotheses in a racing game by training Go-Blend agents to model human demonstrations of arousal and behavior; Go-Blend is a modified version of the Go-Explore algorithm which has recently showcased supreme performance in hard exploration tasks. We first vary the arousal-based reward function and observe agents that can effectively display a palette of affect and behavioral patterns according to the specified reward. Then we use arousal-based state selection mechanisms in order to bias the strategies that Go-Blend explores. Our findings suggest that Go-Blend not only is an efficient affect modeling paradigm but, more importantly, affect-driven RL improves exploration and yields higher performing agents, validating Damasio's hypothesis in the domain of games.

* 8 pages, 5 figures

Via

Access Paper or Ask Questions