Tim K. Marks

Disentangled Acoustic Fields For Multimodal Physical Scene Understanding

Jul 16, 2024

TI2V-Zero: Zero-Shot Image Conditioning for Text-to-Video Diffusion Models

Apr 25, 2024

Steered Diffusion: A Generalized Framework for Plug-and-Play Conditional Image Synthesis

Sep 30, 2023

H-SAUR: Hypothesize, Simulate, Act, Update, and Repeat for Understanding Object Articulations from Interactions

Oct 22, 2022

(2.5+1)D Spatio-Temporal Scene Graphs for Video Question Answering

Feb 18, 2022

MOST-GAN: 3D Morphable StyleGAN for Disentangled Face Image Manipulation

Nov 01, 2021

Audio-Visual Scene-Aware Dialog and Reasoning using Audio-Visual Transformers with Joint Student-Teacher Learning

Oct 13, 2021

InSeGAN: A Generative Approach to Segmenting Identical Instances in Depth Images

Aug 31, 2021

LUVLi Face Alignment: Estimating Landmarks' Location, Uncertainty, and Visibility Likelihood

Apr 06, 2020

Spatio-Temporal Ranked-Attention Networks for Video Captioning

Jan 17, 2020