Title: Enhance Image-to-Image Generation with LLaVA Prompt and Negative Prompt

Jun 04, 2024

Zhicheng Ding, Panfeng Li, Qikai Yang, Siyang Li

Abstract: This paper presents a novel approach to enhancing image-to-image generation by leveraging the multimodal capabilities of the Large Language and Vision Assistant (LLaVA). We propose a framework in which LLaVA analyzes input images and generates textual descriptions, hereinafter referred to as LLaVA-generated prompts. These prompts, along with the original image, are fed into the image-to-image generation pipeline. This enriched representation guides the generation process towards outputs that exhibit a stronger resemblance to the input image. Extensive experiments demonstrate the effectiveness of LLaVA-generated prompts in promoting image similarity. We observe a significant improvement in the visual coherence between the generated and input images compared to traditional methods. Future work will explore fine-tuning LLaVA prompts for increased control over the creative process. By providing more specific details within the prompts, we aim to achieve a delicate balance between faithfulness to the original image and artistic expression in the generated outputs.
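The data flow described in the abstract (caption the input image, then pass the caption together with the original image to an image-to-image generator) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: `Img2ImgRequest`, `build_request`, `run_pipeline`, and the default negative prompt are hypothetical names and values chosen for this example, and the captioner and generator are passed in as plain callables so that no particular LLaVA or diffusion API is assumed.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Img2ImgRequest:
    """Everything handed to the image-to-image generator (hypothetical container)."""
    image: Any            # the original input image, passed through unchanged
    prompt: str           # the LLaVA-generated description of that image
    negative_prompt: str  # what the generator should steer away from


def build_request(
    image: Any,
    describe: Callable[[Any], str],
    negative_prompt: str = "blurry, low quality",  # assumed default, not from the paper
) -> Img2ImgRequest:
    """Caption the image and bundle caption, image, and negative prompt together."""
    prompt = describe(image).strip()
    if not prompt:
        raise ValueError("captioner returned an empty description")
    return Img2ImgRequest(image=image, prompt=prompt, negative_prompt=negative_prompt)


def run_pipeline(
    image: Any,
    describe: Callable[[Any], str],
    generate: Callable[[Img2ImgRequest], Any],
    negative_prompt: str = "blurry, low quality",  # assumed default, not from the paper
) -> Any:
    """Caption first, then generate from the image plus its caption."""
    request = build_request(image, describe, negative_prompt)
    return generate(request)
```

In practice, `describe` would wrap a LLaVA inference call and `generate` would wrap an image-to-image diffusion pipeline; both wrappers are left to the reader because the abstract does not specify them.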

* Accepted by 2024 5th International Conference on Information Science, Parallel and Distributed Systems
