Yuhua Tang

MagicNaming: Consistent Identity Generation by Finding a "Name Space" in T2I Diffusion Models

Dec 19, 2024

Jing Zhao, Heliang Zheng, Chaoyue Wang, Long Lan, Wanrong Huang, Yuhua Tang

Abstract: Large-scale text-to-image diffusion models (e.g., DALL-E, SDXL) can generate famous people simply by referring to their names. Is it possible to make such models generate generic identities as easily as famous ones, i.e., just by using a name? In this paper, we explore the existence of a "Name Space", where any point in the space corresponds to a specific identity. Fortunately, we find some clues in the feature space spanned by the text embeddings of celebrities' names. Specifically, we first extract the embeddings of celebrities' names in the LAION-5B dataset with the text encoder of diffusion models. These embeddings are used as supervision to learn an encoder that can predict the name (actually an embedding) of a given face image. We find experimentally that such name embeddings give the generated images good identity consistency. Note that, like the names of celebrities, our predicted name embeddings are disentangled from the semantics of text inputs, so the original generation capability of text-to-image models is well preserved. Moreover, by simply plugging in such name embeddings, all variants (e.g., from Civitai) derived from the same base model (i.e., SDXL) readily become identity-aware text-to-image models. Project homepage: https://magicfusion.github.io/MagicNaming/

* AAAI 2025
* Accepted by AAAI 2025
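
The supervision scheme described in the MagicNaming abstract (an encoder trained to map a face to a name embedding) can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the abstract does not state the encoder architecture or the loss, so a linear map trained with a squared-error loss is swapped in, and the small vectors stand in for face features and for text-encoder outputs of celebrity names. The names `train_name_encoder`, `predict`, `FACE_FEATS`, and `NAME_EMBS` are hypothetical.

```python
import random

def predict(W, x):
    """Apply the linear encoder: one dot product per output dimension."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def train_name_encoder(face_feats, name_embs, lr=0.1, epochs=500, seed=0):
    """Fit W so that predict(W, face) approximates the name embedding.

    Assumption: squared-error regression with plain SGD; the paper's
    actual encoder and objective are not given in the abstract.
    """
    rng = random.Random(seed)
    d_in, d_out = len(face_feats[0]), len(name_embs[0])
    W = [[rng.uniform(-0.1, 0.1) for _ in range(d_in)] for _ in range(d_out)]
    for _ in range(epochs):
        for x, y in zip(face_feats, name_embs):
            pred = predict(W, x)
            for j in range(d_out):
                err = pred[j] - y[j]  # gradient of 0.5 * err**2 w.r.t. pred[j]
                for i in range(d_in):
                    W[j][i] -= lr * err * x[i]
    return W

# Toy stand-ins: 3-d "face features" and 2-d "name embeddings".
FACE_FEATS = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 1.0, 0.0]]
NAME_EMBS = [[0.5, -1.0], [2.0, 0.0], [-1.0, 1.5], [2.5, -1.0]]

W = train_name_encoder(FACE_FEATS, NAME_EMBS)
predicted_name = predict(W, FACE_FEATS[0])  # embedding to plug into the prompt
```

In the setting the abstract describes, the predicted embedding would take the place of a celebrity name's text embedding in the prompt; that plugging step is not shown here.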

Context-Driven Index Trimming: A Data Quality Perspective to Enhancing Precision of RALMs

Aug 10, 2024

Kexin Ma, Ruochun Jin, Xi Wang, Huan Chen, Jing Ren, Yuhua Tang

Abstract: Retrieval-Augmented Large Language Models (RALMs) have made significant strides in enhancing the accuracy of generated responses. However, existing research often overlooks data quality issues within retrieval results, which are frequently caused by inaccurate vector-distance-based retrieval methods. We propose to boost the precision of RALMs' answers from a data quality perspective through the Context-Driven Index Trimming (CDIT) framework, where Context Matching Dependencies (CMDs) are employed as logical data quality rules to capture and regulate the consistency between retrieved contexts. Based on the semantic comprehension capabilities of Large Language Models (LLMs), CDIT can effectively identify and discard retrieval results that are inconsistent with the query context and further modify indexes in the database, thereby improving answer quality. Experiments on challenging question-answering tasks demonstrate this improvement. The flexibility of CDIT is also verified through its compatibility with various language models and indexing methods, which offers a promising approach to jointly bolstering RALMs' data quality and retrieval precision.
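
The trimming step in the CDIT abstract (discard retrieved results that are inconsistent with the query context, then update the index) might look roughly like the sketch below. This is an illustration under stated assumptions, not the paper's implementation: the index is modeled as a plain dictionary, "modifying the index" is assumed to mean deleting the entry, and `keyword_rule` is a toy stand-in for the LLM-based consistency judgment that a CMD would express. All function names are hypothetical.

```python
from typing import Callable, Dict, List

def trim_index(
    index: Dict[str, str],
    retrieved_ids: List[str],
    query_context: str,
    is_consistent: Callable[[str, str], bool],
) -> List[str]:
    """Return the retrieved ids judged consistent with the query context.

    Inconsistent entries are deleted from the index (an assumption about
    what "modify indexes" means) so they are not retrieved again.
    """
    kept = []
    for doc_id in retrieved_ids:
        text = index.get(doc_id)
        if text is None:
            continue  # already trimmed or never indexed
        if is_consistent(query_context, text):
            kept.append(doc_id)
        else:
            del index[doc_id]
    return kept

def keyword_rule(query_context: str, text: str) -> bool:
    """Toy rule: every context keyword must occur in the retrieved text.

    Stand-in for an LLM judgment; a real system would not use substring checks.
    """
    lowered = text.lower()
    return all(token in lowered for token in query_context.lower().split())

index = {
    "a": "Paris is the capital of France",
    "b": "Paris Hilton is an American media personality",
    "c": "France borders Spain",
}
kept = trim_index(index, ["a", "b"], "capital france", keyword_rule)
```

Passing the consistency check in as a callable keeps the trimming loop independent of whichever language model or rule set performs the judgment.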

Towards Radar Emitter Recognition in Changing Environments with Domain Generalization

Feb 18, 2023

Honglin Wu, Xueqiong Li, Long Lan, Liyang Xu, Yuhua Tang

Abstract: Analyzing radar signals from a complex Electronic Warfare (EW) environment is a non-trivial task. In the real world, the changing EW environment further results in inconsistent signal distributions, such as pulse repetition interval (PRI) mismatch between different detected scenes. In this paper, we propose a novel domain generalization framework to improve the adaptability of signal recognition in changing environments. Specifically, we first design several noise generators to simulate varied scenes. Unlike conventional augmentation methods, our generators enhance the diversity of the detected signals while maintaining their semantic features. Moreover, we propose a signal scene domain classifier that works in an adversarial learning manner. The proposed classifier ensures that the signal predictor generalizes to different scenes. Extensive comparative experiments demonstrate the superiority of the proposed method.
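
The idea of noise generators that vary a signal's scene while keeping its identity can be illustrated with two simple PRI perturbations. The abstract does not say which generators the authors designed, so the two below (timing jitter and missed pulses) are assumptions chosen because they are common ways a detected PRI sequence degrades; `jitter_pri` and `drop_pulses` are hypothetical names.

```python
import random

def jitter_pri(pri_seq, rel_std=0.02, rng=None):
    """Perturb each interval by zero-mean Gaussian noise relative to its size."""
    rng = rng or random.Random()
    return [p * (1.0 + rng.gauss(0.0, rel_std)) for p in pri_seq]

def drop_pulses(pri_seq, drop_prob=0.1, rng=None):
    """Simulate missed pulses in a sequence of pulse-to-pulse intervals.

    When the pulse that ends an interval is missed, that interval merges
    into the next one, so the total duration of the sequence is unchanged.
    The final pulse is treated as always detected.
    """
    rng = rng or random.Random()
    out, carry = [], 0.0
    for p in pri_seq:
        carry += p
        if rng.random() >= drop_prob:  # pulse detected: close the interval
            out.append(carry)
            carry = 0.0
    if carry:
        out.append(carry)
    return out

# A constant 100-microsecond PRI emitter seen under two simulated scenes.
clean = [100.0] * 20
scene_a = jitter_pri(clean, rel_std=0.02, rng=random.Random(1))
scene_b = drop_pulses(clean, drop_prob=0.2, rng=random.Random(2))
```

Each perturbed copy would carry the same emitter label as the clean sequence but a different scene label, which is the kind of pairing an adversarially trained scene classifier needs.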
