Title: Learning Disentangled Identifiers for Action-Customized Text-to-Image Generation

Nov 30, 2023

Siteng Huang, Biao Gong, Yutong Feng, Xi Chen, Yuqian Fu, Yu Liu, Donglin Wang

Figures 1-4 for Learning Disentangled Identifiers for Action-Customized Text-to-Image Generation

Abstract: This study focuses on a novel task in text-to-image (T2I) generation, namely action customization. The objective of this task is to learn the co-existing action from limited data and generalize it to unseen humans or even animals. Experimental results show that existing subject-driven customization methods fail to learn the representative characteristics of actions and struggle to decouple actions from context features, including appearance. To overcome the preference for low-level features and the entanglement of high-level features, we propose an inversion-based method, Action-Disentangled Identifier (ADI), which learns action-specific identifiers from the exemplar images. ADI first expands the semantic conditioning space by introducing layer-wise identifier tokens, thereby increasing the representational richness while distributing the inversion across different features. Then, to block the inversion of action-agnostic features, ADI extracts the gradient invariance from constructed sample triples and masks the updates of irrelevant channels. To comprehensively evaluate the task, we present ActionBench, which includes a variety of actions, each accompanied by meticulously selected samples. Both quantitative and qualitative results show that ADI outperforms existing baselines in action-customized T2I generation. Our project page is at https://adi-t2i.github.io/ADI.
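The channel-masking idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract gives no formulas, so the sketch assumes that "gradient invariance" means per-channel agreement between the gradients of two samples that share an action but differ in context, and that a fixed fraction of the most invariant channels is kept. The names `invariance_mask`, `masked_update`, `keep_ratio`, and `lr` are hypothetical.

```python
import numpy as np

def invariance_mask(grad_a, grad_b, keep_ratio=0.5):
    """Build a 0/1 mask over channels (assumed criterion, not from the paper).

    grad_a, grad_b: 1-D gradients of an identifier token computed on two
    samples assumed to share the same action but differ in context.
    Channels whose gradients differ least are treated as action-related.
    """
    diff = np.abs(grad_a - grad_b)                  # per-channel disagreement
    k = max(1, int(round(keep_ratio * diff.size)))  # number of channels to keep
    keep = np.argsort(diff, kind="stable")[:k]      # most invariant channels
    mask = np.zeros_like(diff)
    mask[keep] = 1.0
    return mask

def masked_update(token, grad, mask, lr=0.1):
    """One gradient-descent step that only touches the unmasked channels."""
    return token - lr * mask * grad

# Toy usage: channels 0 and 2 agree across the pair, channels 1 and 3 do not.
grad_a = np.array([1.0, 2.0, 3.0, 4.0])
grad_b = np.array([1.0, 0.0, 3.1, -4.0])
mask = invariance_mask(grad_a, grad_b, keep_ratio=0.5)
token = masked_update(np.zeros(4), grad_a, mask, lr=0.1)
```

In this toy run only the two channels with matching gradients are updated; the other two stay at their initial values, which is the behaviour the abstract describes as masking the updates of irrelevant channels.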
