Bryan Seybold

CamViG: Camera Aware Image-to-Video Generation with Multimodal Transformers

May 21, 2024
Via arXiv

VideoPoet: A Large Language Model for Zero-Shot Video Generation

Dec 21, 2023
Via arXiv

Open-Vocabulary Temporal Action Detection with Off-the-Shelf Image-Text Features

Dec 20, 2022
Via arXiv

What's in a Caption? Dataset-Specific Linguistic Diversity and Its Effect on Visual Description Models and Metrics

May 12, 2022
Via arXiv

Learning Audio-Video Modalities from Image Captions

Apr 01, 2022
Via arXiv

Optical Mouse: 3D Mouse Pose From Single-View Video

Jun 17, 2021
Via arXiv

Dueling Decoders: Regularizing Variational Autoencoder Latent Spaces

May 17, 2019
Via arXiv

Rethinking the Faster R-CNN Architecture for Temporal Action Localization

Apr 20, 2018
Via arXiv

Instance Embedding Transfer to Unsupervised Video Object Segmentation

Feb 27, 2018
Via arXiv

CNN Architectures for Large-Scale Audio Classification

Jan 10, 2017
Via arXiv