SouYoung Jin

Multi-layer Learnable Attention Mask for Multimodal Tasks

Jun 04, 2024
Via arXiv

FT2TF: First-Person Statement Text-To-Talking Face Generation

Dec 09, 2023
Via arXiv

Learning Human Action Recognition Representations Without Real Humans

Nov 10, 2023
Via arXiv

LangNav: Language as a Perceptual Representation for Navigation

Oct 11, 2023
Via arXiv

Cross-Modal Discrete Representation Learning

Jun 10, 2021
Via arXiv

Spoken Moments: Learning Joint Audio-Visual Representations from Video Descriptions

May 10, 2021
Via arXiv

Automatic adaptation of object detectors to new domains using self-training

Apr 15, 2019
Via arXiv

Unsupervised Hard Example Mining from Videos for Improved Object Detection

Aug 13, 2018
Via arXiv

End-to-end Face Detection and Cast Grouping in Movies Using Erdős-Rényi Clustering

Sep 07, 2017
Via arXiv