Title: One Category One Prompt: Dataset Distillation using Diffusion Models

Mar 11, 2024

Ali Abbasi, Ashkan Shahbazi, Hamed Pirsiavash, Soheil Kolouri

Abstract: The extensive amounts of data required for training deep neural networks pose significant challenges for storage and transmission. Dataset distillation has emerged as a promising technique for condensing the information in massive datasets into a much smaller yet representative set of synthetic samples. However, traditional dataset distillation approaches often struggle to scale to high-resolution images and more complex architectures because of the limitations of bi-level optimization. Recently, several works have proposed exploiting knowledge distillation with decoupled optimization schemes to scale up dataset distillation. Although these methods effectively address the scalability issue, they rely on extensive image augmentation, which requires storing soft labels for the augmented images. In this paper, we introduce Dataset Distillation using Diffusion Models (D3M) as a novel paradigm for dataset distillation, leveraging recent advancements in generative text-to-image foundation models. Our approach uses textual inversion, a technique for fine-tuning text-to-image generative models, to create concise and informative representations of large datasets. With these learned text prompts, we can store the distilled data efficiently and infer new samples that introduce data variability, all within a fixed memory budget. We demonstrate the effectiveness of our method through extensive experiments on various computer vision benchmark datasets under different memory budgets.
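
The memory-budget idea in the abstract can be made concrete with some back-of-the-envelope arithmetic. The sketch below is only illustrative: the image resolution, the number of tokens per learned prompt, the embedding width, and the float32 storage format are all assumptions chosen for the example, not values reported by the paper, and the function names are hypothetical.

```python
# Illustrative storage arithmetic only. Every number used below is an
# assumption made for this example, not a figure taken from the paper.

def image_bytes(height: int, width: int, channels: int = 3,
                bytes_per_value: int = 1) -> int:
    """Bytes needed to store one uncompressed image (uint8 by default)."""
    return height * width * channels * bytes_per_value


def prompt_bytes(num_tokens: int, embed_dim: int,
                 bytes_per_value: int = 4) -> int:
    """Bytes needed to store one learned prompt as float32 token embeddings."""
    return num_tokens * embed_dim * bytes_per_value


def prompts_per_image_budget(height: int, width: int,
                             num_tokens: int, embed_dim: int) -> int:
    """How many learned prompts fit in the space of one stored image."""
    return image_bytes(height, width) // prompt_bytes(num_tokens, embed_dim)


if __name__ == "__main__":
    # Assumed setup: 224x224 RGB images, one learned token per category,
    # 768-dimensional token embeddings stored as float32.
    print(image_bytes(224, 224))                       # 150528
    print(prompt_bytes(1, 768))                        # 3072
    print(prompts_per_image_budget(224, 224, 1, 768))  # 49
```

Under these assumed sizes, a single stored image occupies as much space as 49 single-token prompts. The sketch does not count the weights of the text-to-image model itself, which must be available whenever new samples are generated from the stored prompts.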
