
Ziyun Zeng

MMCOMPOSITION: Revisiting the Compositionality of Pre-trained Vision-Language Models

Oct 13, 2024
Via arXiv

PromptFix: You Prompt and We Fix the Photo

May 27, 2024
Via arXiv

GMMFormer: Gaussian-Mixture-Model based Transformer for Efficient Partially Relevant Video Retrieval

Oct 08, 2023
Via arXiv

Making LLaMA SEE and Draw with SEED Tokenizer

Oct 02, 2023
Via arXiv

VideoCutLER: Surprisingly Simple Unsupervised Video Instance Segmentation

Aug 28, 2023
Via arXiv

MISSRec: Pre-training and Transferring Multi-modal Interest-aware Sequence Representation for Recommendation

Aug 22, 2023
Via arXiv

Planting a SEED of Vision in Large Language Model

Jul 16, 2023
Via arXiv

TVTSv2: Learning Out-of-the-box Spatiotemporal Visual Representations at Scale

May 23, 2023
Via arXiv

Contrastive Masked Autoencoders for Self-Supervised Video Hashing

Nov 23, 2022
Via arXiv

Learning Transferable Spatiotemporal Representations from Natural Script Knowledge

Sep 30, 2022
Via arXiv