Fengyuan Shi

Open-MAGVIT2: An Open-Source Project Toward Democratizing Auto-regressive Visual Generation

Sep 06, 2024
Via arXiv

BIVDiff: A Training-Free Framework for General-Purpose Video Synthesis via Bridging Image and Video Diffusion Models

Dec 05, 2023
Via arXiv

Bridging The Gaps Between Token Pruning and Full Pre-training via Masked Fine-tuning

Oct 26, 2023
Via arXiv

Progressive Visual Prompt Learning with Contrastive Feature Re-formation

Apr 17, 2023
Via arXiv

Dynamic MDETR: A Dynamic Multimodal Transformer Decoder for Visual Grounding

Sep 28, 2022
Via arXiv

End-to-End Dense Video Grounding via Parallel Regression

Sep 23, 2021
Via arXiv