Yuren Cong

HiStream: Efficient High-Resolution Video Generation via Redundancy-Eliminated Streaming

Dec 24, 2025

Mixture of States: Routing Token-Level Dynamics for Multimodal Generation

Nov 15, 2025

Learning Flow Fields in Attention for Controllable Person Image Generation

Dec 12, 2024

WorldAfford: Affordance Grounding based on Natural Language Instructions

May 21, 2024

Segment Any Object Model: Real-to-Simulation Fine-Tuning Strategy for Multi-Class Multi-Instance Segmentation

Mar 16, 2024

GenTron: Delving Deep into Diffusion Transformers for Image and Video Generation

Dec 07, 2023

FLATTEN: optical FLow-guided ATTENtion for consistent text-to-video editing

Oct 09, 2023

Learning Similarity between Scene Graphs and Images with Transformers

Apr 02, 2023

Attribute-Centric Compositional Text-to-Image Generation

Jan 04, 2023

SSGVS: Semantic Scene Graph-to-Video Synthesis

Nov 17, 2022