Zitian Tang

How Can Objects Help Video-Language Understanding?

Apr 10, 2025
Via arXiv

Crystals with Transformers on Graphs, for Prediction of Unconventional Crystal Material Properties and the Benchmark

Jul 23, 2024
Via arXiv

Spacewalk-18: A Benchmark for Multimodal and Long-form Procedural Video Understanding in Novel Domains

Nov 30, 2023
Via arXiv

What Happened 3 Seconds Ago? Inferring the Past with Thermal Imaging

Apr 26, 2023
Via arXiv