Gedas Bertasius

ARCADE: Scalable Demonstration Collection and Generation via Augmented Reality for Imitation Learning

Oct 21, 2024
Via arXiv

Propose, Assess, Search: Harnessing LLMs for Goal-Oriented Planning in Instructional Videos

Sep 30, 2024
Via arXiv

VMAS: Video-to-Music Generation via Semantic Alignment in Web Music Videos

Sep 11, 2024
Via arXiv

VideoTree: Adaptive Tree-based Video Representation for LLM Reasoning on Long Videos

May 29, 2024
Via arXiv

Siamese Vision Transformers are Scalable Audio-visual Learners

Mar 28, 2024
Via arXiv

Augmented Reality Demonstrations for Scalable Robot Imitation Learning

Mar 20, 2024
Via arXiv

DAM: Dynamic Adapter Merging for Continual Video QA Learning

Mar 13, 2024
Via arXiv

Video ReCap: Recursive Captioning of Hour-Long Videos

Feb 28, 2024
Via arXiv

Mementos: A Comprehensive Benchmark for Multimodal Large Language Model Reasoning over Image Sequences

Jan 25, 2024
Via arXiv

A Simple LLM Framework for Long-Range Video Question-Answering

Dec 28, 2023
Via arXiv