Look Every Frame All at Once: Video-Ma$^2$mba for Efficient Long-form Video Understanding with Multi-Axis Gradient Checkpointing

Add code
Nov 29, 2024

Share this with someone who'll enjoy it:

View paper onarxiv icon

Share this with someone who'll enjoy it: