Florian Schroff

Imagen 3

Aug 13, 2024

VideoPrism: A Foundational Visual Encoder for Video Understanding

Feb 20, 2024

Distilling Vision-Language Models on Millions of Videos

Jan 11, 2024

VideoGLUE: Video General Understanding Evaluation of Foundation Models

Jul 06, 2023

Spatiotemporally Discriminative Video-Language Pre-Training with Text Grounding

Mar 28, 2023

Unified Visual Relationship Detection with Vision and Language Models

Mar 16, 2023

Learning to Generate Image Embeddings with User-level Differential Privacy

Nov 20, 2022

Contextualized Spatio-Temporal Contrastive Learning with Self-Supervision

Dec 09, 2021

DeepLab2: A TensorFlow Library for Deep Labeling

Jun 17, 2021

Learning View-Disentangled Human Pose Representation by Contrastive Cross-View Mutual Information Maximization

Dec 02, 2020