Anna Rohrbach

Object-based (yet Class-agnostic) Video Domain Adaptation

Nov 29, 2023
Via arXiv

MammalNet: A Large-scale Video Benchmark for Mammal Recognition and Behavior Understanding

Jun 01, 2023
Via arXiv

Simple Token-Level Confidence Improves Caption Correctness

May 11, 2023
Via arXiv

Focus! Relevant and Sufficient Context Selection for News Image Captioning

Dec 01, 2022
Via arXiv

Shape-Guided Diffusion with Inside-Outside Attention

Dec 01, 2022
Via arXiv

G^3: Geolocation via Guidebook Grounding

Nov 28, 2022
Via arXiv

TL;DW? Summarizing Instructional Videos with Task Relevance & Cross-Modal Saliency

Aug 14, 2022
Via arXiv

Structured Video Tokens @ Ego4D PNR Temporal Localization Challenge 2022

Jun 15, 2022
Via arXiv

Bringing Image Scene Structure to Video via Frame-Clip Consistency of Object Tokens

Jun 15, 2022
Via arXiv

Reliable Visual Question Answering: Abstain Rather Than Answer Incorrectly

Apr 28, 2022
Via arXiv