Jonas Oppenlaender

An Initial Exploration of Default Images in Text-to-Image Generation

May 14, 2025

Hannu Simonen, Atte Kiviniemi, Jonas Oppenlaender

Abstract: In the creative practice of text-to-image generation (TTI), images are generated from text prompts. However, TTI models are trained to always yield an output, even if the prompt contains unknown terms. In this case, the model may generate what we call "default images": images that closely resemble each other across many unrelated prompts. We argue studying default images is valuable for designing better solutions for TTI and prompt engineering. In this paper, we provide the first investigation into default images on Midjourney, a popular image generator. We describe our systematic approach to create input prompts triggering default images, and present the results of our initial experiments and several small-scale ablation studies. We also report on a survey study investigating how default images affect user satisfaction. Our work lays the foundation for understanding default images in TTI and highlights challenges and future research directions.

* 16 pages, 6 figures

Artworks Reimagined: Exploring Human-AI Co-Creation through Body Prompting

Aug 10, 2024

Jonas Oppenlaender, Hannah Johnston, Johanna Silvennoinen, Helena Barranha

Abstract: Image generation using generative artificial intelligence is a popular activity. However, it is almost exclusively performed in the privacy of an individual's home via typing on a keyboard. In this article, we explore body prompting as input for image generation. Body prompting extends interaction with generative AI beyond textual inputs to reconnect the creative act of image generation with the physical act of creating artworks. We implement this concept in an interactive art installation, Artworks Reimagined, designed to transform artworks via body prompting. We deployed the installation at an event with hundreds of visitors in a public and private setting. Our results from a sample of visitors (N=79) show that body prompting was well-received and provides an engaging and fun experience. We identify three distinct patterns of embodied interaction with the generative AI and present insights into participants' experience of body prompting and AI co-creation. We provide valuable recommendations for practitioners seeking to design interactive generative AI experiences in museums, galleries, and other public cultural spaces.

* 16 pages, 5 figures, 2 tables

The Cultivated Practices of Text-to-Image Generation

Jun 20, 2023

Jonas Oppenlaender

Abstract: Humankind is entering a novel creative era in which anybody can synthesize digital information using generative artificial intelligence (AI). Text-to-image generation, in particular, has become vastly popular and millions of practitioners produce AI-generated images and AI art online. This chapter first gives an overview of the key developments that enabled a healthy co-creative online ecosystem around text-to-image generation to rapidly emerge, followed by a high-level description of key elements in this ecosystem. A particular focus is placed on prompt engineering, a creative practice that has been embraced by the AI art community. It is then argued that the emerging co-creative ecosystem constitutes an intelligent system on its own - a system that both supports human creativity and potentially entraps future generations and limits future development efforts in AI. The chapter discusses the potential risks and dangers of cultivating this co-creative ecosystem, such as the bias inherent in today's training data, potential quality degradation in future image generation systems due to synthetic data becoming commonplace, and the potential long-term effects of text-to-image generation on people's imagination, ambitions, and development.

* In "Humane autonomous technology - Re-thinking experience with and in intelligent systems", Palgrave Macmillan, 2024

Perceptions and Realities of Text-to-Image Generation

Jun 14, 2023

Jonas Oppenlaender, Johanna Silvennoinen, Ville Paananen, Aku Visuri

Abstract: Generative artificial intelligence (AI) is a widely popular technology that will have a profound impact on society and individuals. Less than a decade ago, it was thought that creative work would be among the last to be automated - yet today, we see AI encroaching on many creative domains. In this paper, we present the findings of a survey study on people's perceptions of text-to-image generation. We touch on participants' technical understanding of the emerging technology, their fears and concerns, and thoughts about risks and dangers of text-to-image generation to the individual and society. We find that while participants were aware of the risks and dangers associated with the technology, only a few participants considered the technology to be a personal risk. The risks for others were easier for participants to recognize. Artists in particular were seen as being at risk. Participants who had tried the technology rated its future importance lower than those who had not tried it. This result shows that many people are still oblivious to the potential personal risks of generative artificial intelligence and the impending societal changes associated with this technology.

* 16 pages, 5 figures

Mapping the Challenges of HCI: An Application and Evaluation of ChatGPT and GPT-4 for Cost-Efficient Question Answering

Jun 08, 2023

Jonas Oppenlaender, Joonas Hämäläinen

Abstract: Large language models (LLMs), such as ChatGPT and GPT-4, are gaining widespread real-world use. Yet, the two LLMs are closed source, and little is known about the LLMs' performance in real-world use cases. In academia, LLM performance is often measured on benchmarks which may have leaked into ChatGPT's and GPT-4's training data. In this paper, we apply and evaluate ChatGPT and GPT-4 for the real-world task of cost-efficient extractive question answering over a text corpus that was published after the two LLMs completed training. More specifically, we extract research challenges for researchers in the field of HCI from the proceedings of the 2023 Conference on Human Factors in Computing Systems (CHI). We critically evaluate the LLMs on this practical task and conclude that the combination of ChatGPT and GPT-4 makes an excellent cost-efficient means for analyzing a text corpus at scale. Cost-efficiency is key for prototyping research ideas and analyzing text corpora from different perspectives, with implications for applying LLMs in academia and practice. For researchers in HCI, we contribute an interactive visualization of 4392 research challenges in over 90 research topics. We share this visualization and the dataset in the spirit of open science.

* 12 pages, 3 figures, 4 tables

Using Text-to-Image Generation for Architectural Design Ideation

Apr 20, 2023

Ville Paananen, Jonas Oppenlaender, Aku Visuri

Abstract: The recent progress of text-to-image generation has been recognized in architectural design. Our study is the first to investigate the potential of text-to-image generators in supporting creativity during the early stages of the architectural design process. We conducted a laboratory study with 17 architecture students, who developed a concept for a culture center using three popular text-to-image generators: Midjourney, Stable Diffusion, and DALL-E. Through standardized questionnaires and group interviews, we found that image generation could be a meaningful part of the design process when design constraints are carefully considered. Generative tools support serendipitous discovery of ideas and an imaginative mindset, enriching the design process. We identified several challenges of image generators and provided considerations for software development and educators to support creativity and emphasize designers' imaginative mindset. By understanding the limitations and potential of text-to-image generators, architects and designers can leverage this technology in their design process and education, facilitating innovation and effective communication of concepts.

Prompt Engineering for Text-Based Generative Art

Apr 20, 2022

Jonas Oppenlaender

Abstract: Text-based generative art has seen an explosion of interest in 2021. Online communities around text-based generative art as a novel digital medium have quickly emerged. This short paper identifies five types of prompt modifiers used by practitioners in the community of text-based generative art based on a 3-month ethnographic study on Twitter. The novel taxonomy of prompt modifiers provides researchers a conceptual starting point for investigating the practices of text-based generative art, but also may help practitioners of text-based generative art improve their images. The paper concludes with a discussion of research opportunities in the space of text-based generative art and the broader implications of prompt engineering from the perspective of human-AI interaction in future applications beyond the use case of text-based generative art.
