Zitian Tang

Goal Force: Teaching Video Models To Accomplish Physics-Conditioned Goals

Jan 09, 2026

How Can Objects Help Video-Language Understanding?

Apr 10, 2025

Crystals with Transformers on Graphs, for Prediction of Unconventional Crystal Material Properties and the Benchmark

Jul 23, 2024

Spacewalk-18: A Benchmark for Multimodal and Long-form Procedural Video Understanding in Novel Domains

Nov 30, 2023

What Happened 3 Seconds Ago? Inferring the Past with Thermal Imaging

Apr 26, 2023