Abstract: Recent years have seen substantial progress in diffusion-based controllable video generation. However, achieving precise control in complex scenarios, including fine-grained object parts, sophisticated motion trajectories, and coherent background movement, remains a challenge. In this paper, we introduce TrackGo, a novel approach that leverages free-form masks and arrows for conditional video generation. This method offers users a flexible and precise mechanism for manipulating video content. To implement this control, we also propose the TrackAdapter, an efficient and lightweight adapter designed to be seamlessly integrated into the temporal self-attention layers of a pretrained video generation model. This design leverages our observation that the attention maps of these layers can accurately activate the regions corresponding to motion in videos. Our experimental results demonstrate that our new approach, enhanced by the TrackAdapter, achieves state-of-the-art performance on key metrics such as FVD, FID, and ObjMC. The project page of TrackGo can be found at: https://zhtjtcz.github.io/TrackGo-Page/
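The abstract does not spell out how such an adapter attaches to the attention layers. As a rough, hypothetical illustration only (none of the class names, shapes, or design choices below come from the TrackGo paper), a lightweight bottleneck adapter on a temporal self-attention block, gated by a motion mask derived from the user's masks and arrows, might look like this in PyTorch:

import torch
import torch.nn as nn

# Hypothetical sketch of a temporal self-attention block with a small
# adapter branch; not the TrackAdapter implementation.
class TemporalSelfAttentionWithAdapter(nn.Module):
    def __init__(self, dim, adapter_dim=64, num_heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        # Bottleneck adapter: down-project, non-linearity, up-project.
        self.adapter = nn.Sequential(
            nn.Linear(dim, adapter_dim),
            nn.GELU(),
            nn.Linear(adapter_dim, dim),
        )
        # Zero-init the up-projection so the adapter starts as a no-op.
        nn.init.zeros_(self.adapter[-1].weight)
        nn.init.zeros_(self.adapter[-1].bias)

    def forward(self, x, motion_mask=None):
        # x: (batch, frames, dim); motion_mask: (batch, frames) in [0, 1],
        # e.g. derived from the user-provided masks and arrows.
        attn_out, attn_weights = self.attn(x, x, x, need_weights=True)
        adapter_out = self.adapter(attn_out)
        if motion_mask is not None:
            # Apply the adapter only where motion is expected.
            adapter_out = adapter_out * motion_mask.unsqueeze(-1)
        return x + attn_out + adapter_out, attn_weights

Zero-initializing the up-projection keeps the pretrained model's behavior unchanged at the start of adapter training, a common choice for this kind of add-on module.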
Abstract: Recently, text-guided scalable vector graphics (SVG) synthesis has shown promise in domains such as iconography and sketching. However, existing text-to-SVG generation methods lack editability and struggle with visual quality and result diversity. To address these limitations, we propose a novel text-guided vector graphics synthesis method called SVGDreamer. SVGDreamer incorporates a semantic-driven image vectorization (SIVE) process that decomposes synthesis into foreground objects and background, thereby enhancing editability. Specifically, the SIVE process introduces attention-based primitive control and an attention-mask loss function for the effective control and manipulation of individual elements. Additionally, we propose a Vectorized Particle-based Score Distillation (VPSD) approach to tackle the challenges of color over-saturation, vector primitive over-smoothing, and limited result diversity in existing text-to-SVG generation methods. Furthermore, on the basis of VPSD, we introduce Reward Feedback Learning (ReFL) to accelerate VPSD convergence and improve aesthetic appeal. Extensive experiments validate the effectiveness of SVGDreamer, demonstrating its superiority over baseline methods in terms of editability, visual quality, and diversity. The code and demo of SVGDreamer can be found at \href{https://ximinng.github.io/SVGDreamer-project/}{https://ximinng.github.io/SVGDreamer-project/}.
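For intuition only, the sketch below shows one generic way an attention-mask style loss could keep an object's vector primitives inside the image region activated by that object's cross-attention map; the function name, tensor shapes, and reliance on a differentiable rasterizer such as diffvg are assumptions, not SVGDreamer's actual implementation:

import torch

def attention_mask_loss(rendered_alpha, attn_mask):
    # rendered_alpha: (H, W) coverage of one object's vector primitives,
    #   obtained from a differentiable rasterizer such as diffvg.
    # attn_mask: (H, W) binarized cross-attention map for that object's token.
    # Penalize stroke coverage that falls outside the object's attention mask.
    outside = rendered_alpha * (1.0 - attn_mask)
    return outside.mean()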
Abstract: Exemplar-based sketch-to-photo synthesis allows users to generate photo-realistic images based on sketches. Recently, diffusion-based methods have achieved impressive performance on image generation tasks, enabling highly flexible control through text-driven generation or energy functions. However, generating photo-realistic images with color and texture from sketch images remains challenging for diffusion models. Sketches typically consist of only a few strokes, with most regions left blank, making it difficult for diffusion-based methods to produce photo-realistic images. In this work, we propose a two-stage method named ``Inversion-by-Inversion'' for exemplar-based sketch-to-photo synthesis. This approach includes shape-enhancing inversion and full-control inversion. During the shape-enhancing inversion process, an uncolored photo is generated with the guidance of a shape-energy function. This step is essential to ensure control over the shape of the generated photo. In the full-control inversion process, we propose an appearance-energy function to control the color and texture of the final generated photo. Importantly, our Inversion-by-Inversion pipeline is training-free and can accept different types of exemplars for color and texture control. We conducted extensive experiments to evaluate our proposed method, and the results demonstrate its effectiveness.
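As a generic, hypothetical illustration of energy-guided denoising (not the paper's exact pipeline), each reverse-diffusion step can be nudged by the gradient of an energy function, with a shape-matching or appearance-matching term plugged in as energy_fn; all names and the scalar guidance schedule below are illustrative:

import torch

def guided_step(x_t, denoise_step, energy_fn, guidance_scale):
    # denoise_step: one reverse-diffusion update x_t -> x_{t-1} (frozen model)
    # energy_fn: maps a latent to a scalar energy to be minimized
    #            (e.g. a shape-matching or appearance-matching term)
    x_t = x_t.detach().requires_grad_(True)
    grad = torch.autograd.grad(energy_fn(x_t), x_t)[0]
    x_prev = denoise_step(x_t.detach())
    # Nudge the denoised latent downhill on the energy surface.
    return x_prev - guidance_scale * grad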
Abstract: We discover that pretrained diffusion models, even though they are trained mainly on images, show impressive power in guiding sketch synthesis. In this paper, we present DiffSketcher, an innovative algorithm that creates vectorized free-hand sketches from natural language input. DiffSketcher is built on a pre-trained text-to-image diffusion model. It performs the task by directly optimizing a set of Bezier curves with an extended version of the score distillation sampling (SDS) loss, which allows us to use a raster-level diffusion model as a prior for optimizing a parametric vectorized sketch generator. Furthermore, we explore the attention maps embedded in the diffusion model for effective stroke initialization to speed up the generation process. The generated sketches demonstrate multiple levels of abstraction while maintaining recognizability, underlying structure, and the essential visual details of the subject drawn. Our experiments show that DiffSketcher achieves higher quality than prior work.
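To make the SDS-based optimization concrete, here is a minimal, assumption-laden sketch of one update step for stroke parameters; render_strokes stands in for a differentiable rasterizer (e.g. diffvg), and diffusion.add_noise / diffusion.predict_noise are hypothetical interface names for a pretrained text-to-image model, none of which is taken from DiffSketcher's code:

import torch

def sds_update(stroke_params, render_strokes, diffusion, text_emb, optimizer):
    # render_strokes: differentiable rasterizer mapping Bezier control points,
    #   widths, and opacities to an RGB image (gradients flow back to them).
    # diffusion: frozen text-to-image model (hypothetical add_noise /
    #   predict_noise interface).
    image = render_strokes(stroke_params)            # (1, 3, H, W)
    t = torch.randint(50, 950, (1,))                 # random diffusion timestep
    noise = torch.randn_like(image)
    noised = diffusion.add_noise(image, noise, t)    # forward-diffuse the render
    with torch.no_grad():
        noise_pred = diffusion.predict_noise(noised, t, text_emb)
    grad = noise_pred - noise                        # SDS direction in image space
    loss = (grad * image).sum()                      # d(loss)/d(image) equals grad
    optimizer.zero_grad()
    loss.backward()                                  # pushes grad through the rasterizer
    optimizer.step()

In this formulation the diffusion model is never differentiated through; the (noise_pred - noise) residual is treated as a constant gradient on the rendered image and backpropagated only through the rasterizer to the curve parameters.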