Zineng Tang

Evaluating Model Perception of Color Illusions in Photorealistic Scenes

Dec 09, 2024
Via arXiv

Grounding Language in Multi-Perspective Referential Communication

Oct 04, 2024
Via arXiv

AMONGAGENTS: Evaluating Large Language Models in the Interactive Text-Based Social Deduction Game

Jul 24, 2024
Via arXiv

CoDi-2: In-Context, Interleaved, and Interactive Any-to-Any Generation

Nov 30, 2023
Via arXiv

Paxion: Patching Action Knowledge in Video-Language Foundation Models

May 26, 2023
Via arXiv

Any-to-Any Generation via Composable Diffusion

May 19, 2023
Via arXiv

Unifying Vision, Text, and Layout for Universal Document Processing

Dec 20, 2022
Via arXiv

Perceiver-VL: Efficient Vision-and-Language Modeling with Iterative Latent Attention

Nov 21, 2022
Via arXiv

TVLT: Textless Vision-Language Transformer

Sep 28, 2022
Via arXiv

VidLanKD: Improving Language Understanding via Video-Distilled Knowledge Transfer

Jul 06, 2021
Via arXiv