Anna Rohrbach

DEFAME: Dynamic Evidence-based FAct-checking with Multimodal Experts

Dec 13, 2024

Object-based (yet Class-agnostic) Video Domain Adaptation

Nov 29, 2023

MammalNet: A Large-scale Video Benchmark for Mammal Recognition and Behavior Understanding

Jun 01, 2023

Simple Token-Level Confidence Improves Caption Correctness

May 11, 2023

Focus! Relevant and Sufficient Context Selection for News Image Captioning

Dec 01, 2022

Shape-Guided Diffusion with Inside-Outside Attention

Dec 01, 2022

G^3: Geolocation via Guidebook Grounding

Nov 28, 2022

TL;DW? Summarizing Instructional Videos with Task Relevance & Cross-Modal Saliency

Aug 14, 2022

Structured Video Tokens @ Ego4D PNR Temporal Localization Challenge 2022

Jun 15, 2022

Bringing Image Scene Structure to Video via Frame-Clip Consistency of Object Tokens

Jun 15, 2022