Maksim Dzabraev

VLRM: Vision-Language Models act as Reward Models for Image Captioning

Apr 02, 2024
Via arXiv

MDMMT-2: Multidomain Multimodal Transformer for Video Retrieval, One More Step Towards Generalization

Mar 14, 2022
Via arXiv

MDMMT: Multidomain Multimodal Transformer for Video Retrieval

Mar 19, 2021
Via arXiv

Mutual Modality Learning for Video Action Classification

Nov 04, 2020
Via arXiv