Fengyin Lin

DiffSketching: Sketch Control Image Synthesis with Diffusion Models

May 30, 2023

Qiang Wang, Di Kong, Fengyin Lin, Yonggang Qi

Abstract: Creative sketching is a universal form of visual expression, but translating an abstract sketch into an image is very challenging. Traditionally, a deep learning model for sketch-to-image synthesis must cope with distorted input sketches that lack visual detail, and requires collecting large-scale sketch-image datasets. We present a first study of this task using diffusion models. Our model matches sketches through cross-domain constraints and uses a classifier to guide image synthesis more accurately. Extensive experiments confirm that our method is not only faithful to the user's input sketches but also maintains the diversity and imagination of the synthesized images. Our model outperforms GAN-based methods in generation quality and human evaluation, and does not rely on massive sketch-image datasets. Additionally, we present applications of our method to image editing and interpolation.
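
The abstract mentions using a classifier to guide synthesis but gives no details. As a rough illustration only, one common formulation of classifier guidance shifts the mean of each reverse-diffusion step by the scaled gradient of the classifier's log-probability for the target class. The sketch below assumes a toy linear-softmax classifier with an analytic gradient; the function names, the weights, and the guidance scale are all hypothetical and are not taken from the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def class_logits(x, weights):
    """Logits of a toy linear classifier (hypothetical stand-in for a real one)."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def class_log_prob(x, weights, target):
    """log p(target | x) under the toy classifier."""
    return math.log(softmax(class_logits(x, weights))[target])


def grad_class_log_prob(x, weights, target):
    """Analytic gradient of log p(target | x) w.r.t. x: w_target - sum_k p_k * w_k."""
    probs = softmax(class_logits(x, weights))
    expected = [sum(p * row[j] for p, row in zip(probs, weights))
                for j in range(len(x))]
    return [weights[target][j] - expected[j] for j in range(len(x))]


def guided_mean(mean, variance, weights, target, scale):
    """Shift a reverse-step mean toward the target class.

    Assumed formulation: mean + scale * variance * grad log p(target | mean).
    """
    grad = grad_class_log_prob(mean, weights, target)
    return [m + scale * variance * g for m, g in zip(mean, grad)]


# Toy setup: three classes over a 2-D "image" vector (illustrative values only).
toy_weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
unguided = [0.0, 0.0]
shifted = guided_mean(unguided, variance=0.5, weights=toy_weights, target=0, scale=2.0)
```

In this toy setup the shifted mean has a higher log-probability for class 0 than the unguided mean, which is the effect guidance is meant to have. A real system would replace the toy classifier with a trained network and obtain the gradient by automatic differentiation.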

Zero-Shot Everything Sketch-Based Image Retrieval, and in Explainable Style

Mar 25, 2023

Fengyin Lin, Mingkang Li, Da Li, Timothy Hospedales, Yi-Zhe Song, Yonggang Qi

Abstract: This paper studies the problem of zero-shot sketch-based image retrieval (ZS-SBIR), with two significant differences from prior art: (i) we tackle all variants (inter-category, intra-category, and cross-dataset) of ZS-SBIR with just one network ("everything"), and (ii) we aim to understand how this sketch-photo matching operates ("explainable"). Our key innovation lies in the realization that such a cross-modal matching problem can be reduced to comparisons of groups of key local patches, akin to the seasoned "bag-of-words" paradigm. With just this change, we are able to achieve both of the aforementioned goals, with the added benefit of no longer requiring external semantic knowledge. Technically, ours is a transformer-based cross-modal network with three novel components: (i) a self-attention module with a learnable tokenizer to produce visual tokens that correspond to the most informative local regions, (ii) a cross-attention module to compute local correspondences between the visual tokens across two modalities, and finally (iii) a kernel-based relation network to assemble local putative matches and produce an overall similarity metric for a sketch-photo pair. Experiments show that our method delivers superior performance across all ZS-SBIR settings. The all-important explainability goal is achieved by visualizing cross-modal token correspondences and, for the first time, via sketch-to-photo synthesis by universal replacement of all matched photo patches. Code and model are available at https://github.com/buptLinfy/ZSE-SBIR.

* CVPR 2023 (Highlight)
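
The cross-attention step described in the abstract above can be illustrated with a minimal sketch of scaled dot-product attention between sketch tokens and photo tokens. This is not the authors' implementation (see their repository for that): the learned query/key/value projections are omitted, and the token values and function names are hypothetical, chosen so that each sketch token has one obvious photo counterpart.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention from one token set to another.

    Returns (attended, weights), where weights[i][j] is how strongly
    query token i attends to key token j. Learned projections are
    omitted for brevity (an assumption of this sketch).
    """
    dim = len(keys[0])
    weights = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in keys]
        weights.append(softmax(scores))
    attended = [
        [sum(w * v[j] for w, v in zip(row, values)) for j in range(len(values[0]))]
        for row in weights
    ]
    return attended, weights


# Hypothetical tokens: each sketch token equals one photo token, in shuffled order.
photo_tokens = [[3.0, 0.0, 0.0], [0.0, 3.0, 0.0], [0.0, 0.0, 3.0]]
sketch_tokens = [photo_tokens[2], photo_tokens[0], photo_tokens[1]]

attended, weights = cross_attention(sketch_tokens, photo_tokens, photo_tokens)

# Local correspondence: the photo token each sketch token attends to most.
matches = [max(range(len(row)), key=row.__getitem__) for row in weights]
```

Here `matches` recovers the shuffle `[2, 0, 1]`. Reading off the strongest attention weight per token is one simple way to visualize token correspondences of the kind the abstract describes.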