Nina Shvetsova

VideoGEM: Training-free Action Grounding in Videos

Mar 26, 2025
Via arXiv

Unbiasing through Textual Descriptions: Mitigating Representation Bias in Video Benchmarks

Mar 24, 2025
Via arXiv

HowToCaption: Prompting LLMs to Transform Video Annotations at Scale

Oct 07, 2023
Via arXiv

In-Style: Bridging Text and Uncurated Videos with Style Transfer for Text-Video Retrieval

Sep 16, 2023
Via arXiv

Preserving Modality Structure Improves Multi-Modal Learning

Aug 24, 2023
Via arXiv

What, when, and where? -- Self-Supervised Spatio-Temporal Grounding in Untrimmed Multi-Action Videos from Narrated Instructions

Mar 29, 2023
Via arXiv

MAtch, eXpand and Improve: Unsupervised Finetuning for Zero-Shot Action Recognition with Language Knowledge

Mar 15, 2023
Via arXiv

Learning by Sorting: Self-supervised Learning with Group Ordering Constraints

Jan 05, 2023
Via arXiv

C2KD: Cross-Lingual Cross-Modal Knowledge Distillation for Multilingual Text-Video Retrieval

Oct 07, 2022
Via arXiv

VL-Taboo: An Analysis of Attribute-based Zero-shot Capabilities of Vision-Language Models

Sep 12, 2022
Via arXiv