Masayoshi Kondo

DETECLAP: Enhancing Audio-Visual Representation Learning with Object Information

Sep 18, 2024
Via arXiv

Data Collection-free Masked Video Modeling

Sep 10, 2024
Via arXiv

On the Audio Hallucinations in Large Audio-Video Language Models

Jan 18, 2024
Via arXiv

Large-scale Vision-Language Models Learn Super Images for Efficient and High-Performance Partially Relevant Video Retrieval

Dec 01, 2023
Via arXiv

Leveraging Image-Text Similarity and Caption Modification for the DataComp Challenge: Filtering Track and BYOD Track

Oct 23, 2023
Via arXiv