Zhenheng Yang

InstanceCap: Improving Text-to-Video Generation via Instance-aware Structured Caption

Dec 12, 2024

Show-o: One Single Transformer to Unify Multimodal Understanding and Generation

Aug 22, 2024

OpenVid-1M: A Large-Scale High-Quality Dataset for Text-to-video Generation

Jul 02, 2024

Weakly Supervised Instance Segmentation for Videos with Temporal Mask Consistency

Mar 23, 2021

SPAN: Spatial Pyramid Attention Network for Image Manipulation Localization

Sep 01, 2020

Activity Driven Weakly Supervised Object Detection

Apr 02, 2019

Every Pixel Counts ++: Joint Learning of Geometry and Motion with 3D Holistic Understanding

Oct 14, 2018

Joint Unsupervised Learning of Optical Flow and Depth by Watching Stereo Videos

Oct 08, 2018

Every Pixel Counts: Unsupervised Geometry Learning with Holistic 3D Motion Understanding

Aug 15, 2018

Occlusion Aware Unsupervised Learning of Optical Flow

Apr 04, 2018