Abstract: Advanced text-to-image models such as DALL-E 2 and Midjourney possess the capacity to generate highly realistic images, raising significant concerns regarding the potential proliferation of unsafe content, including adult, violent, or deceptive imagery of political figures. Despite claims of rigorous safety mechanisms implemented in these models to restrict the generation of not-safe-for-work (NSFW) content, we successfully devise and demonstrate the first prompt attacks on Midjourney, resulting in the production of abundant photorealistic NSFW images. We reveal the fundamental principles of such prompt attacks and suggest strategically substituting high-risk sections within a suspect prompt to evade closed-source safety measures. Our novel framework, SurrogatePrompt, systematically generates attack prompts at scale, utilizing large language models, image-to-text, and image-to-image modules to automate their creation. Evaluation results disclose an 88% success rate in bypassing Midjourney's proprietary safety filter with our attack prompts, leading to the generation of counterfeit images depicting political figures in violent scenarios. Both subjective and objective assessments validate that the images generated from our attack prompts present considerable safety hazards.
Abstract: Diffusion models have emerged as the de facto technique for image generation, yet they entail significant computational overhead, hindering their broader application in the research community. We propose a prior-based denoising training framework, the first to incorporate the pre-train and fine-tune paradigm into diffusion model training, which substantially improves training efficiency and shows potential in facilitating various downstream tasks. Our approach centers on masking a high proportion (e.g., up to 90%) of the input image and employing masked score matching to denoise the visible areas, thereby guiding the diffusion model to learn more salient features from the training data as prior knowledge. By applying this masked learning process in a pre-training stage, we efficiently train a ViT-based diffusion model on CelebA-HQ 256x256 in pixel space, achieving a 4x acceleration and improved image quality compared to DDPM. Moreover, our masked pre-training technique is universally applicable to diffusion models that generate images directly in pixel space and facilitates learning pre-trained models with excellent generalizability: a diffusion model pre-trained on VGGFace2 attains a 46% quality improvement when fine-tuned with merely 10% local data. Our code is available at https://github.com/jiachenlei/maskdm.
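As a rough illustration of the masked score-matching idea described in this abstract, the sketch below computes a DDPM-style noise-prediction loss only over the small fraction of patches left visible. This is a minimal sketch, not the authors' released code: the `model` interface, patch size, mask layout, and noise schedule are all illustrative assumptions.

```python
# Minimal sketch of masked score matching for diffusion pre-training.
# Assumptions (not from the paper's code): `model` is a noise-prediction
# network accepting the masked noisy image, the timestep, and a pixel mask;
# patch size, mask ratio, and the DDPM schedule handling are illustrative.
import torch
import torch.nn.functional as F

def masked_score_matching_loss(model, x0, alphas_cumprod,
                               patch_size=8, mask_ratio=0.9):
    B, C, H, W = x0.shape
    nh, nw = H // patch_size, W // patch_size

    # Standard DDPM forward process: sample a timestep and add Gaussian noise.
    t = torch.randint(0, alphas_cumprod.numel(), (B,), device=x0.device)
    a_bar = alphas_cumprod[t].view(B, 1, 1, 1)
    noise = torch.randn_like(x0)
    x_t = a_bar.sqrt() * x0 + (1 - a_bar).sqrt() * noise

    # Keep only a small fraction of patches visible (10% when mask_ratio=0.9).
    keep = (torch.rand(B, nh, nw, device=x0.device) > mask_ratio).float()
    pixel_mask = keep.repeat_interleave(patch_size, dim=1) \
                     .repeat_interleave(patch_size, dim=2) \
                     .unsqueeze(1)  # (B, 1, H, W)

    # Predict the noise and score only the visible pixels.
    pred = model(x_t * pixel_mask, t, pixel_mask)
    per_pixel = F.mse_loss(pred, noise, reduction="none") * pixel_mask
    return per_pixel.sum() / (pixel_mask.sum() * C + 1e-8)
```

Restricting the loss to visible patches is what lets the pre-training stage process far fewer tokens per image, which is where the reported training speed-up comes from.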
Abstract: In this report, we present our approach and empirical results of applying masked autoencoders to two egocentric video understanding tasks, namely Object State Change Classification and PNR Temporal Localization, of the Ego4D Challenge 2022. As team TheSSVL, we ranked 2nd in both tasks. Our code will be made available.