Vladimir Iashin

Temporally Aligned Audio for Video with Autoregression

Sep 20, 2024
Via arXiv

Synchformer: Efficient Synchronization from Sparse Cues

Jan 29, 2024
Via arXiv

Sparse in Space and Time: Audio-visual Synchronisation with Trainable Selectors

Oct 13, 2022
Via arXiv

Taming Visually Guided Sound Generation

Oct 17, 2021
Via arXiv

Multi-modal estimation of the properties of containers and their content: survey and evaluation

Jul 27, 2021
Via arXiv

Top-1 CORSMAL Challenge 2020 Submission: Filling Mass Estimation Using Multi-modal Observations of Human-robot Handovers

Dec 02, 2020
Via arXiv

A Better Use of Audio-Visual Cues: Dense Video Captioning with Bi-modal Transformer

May 17, 2020
Via arXiv

Multi-modal Dense Video Captioning

Mar 17, 2020
Via arXiv