Wonjae Kim

Emergence of Text Readability in Vision Language Models

Jun 24, 2025
Via arXiv

Extract Free Dense Misalignment from CLIP

Dec 24, 2024
Via arXiv

Probabilistic Language-Image Pre-Training

Oct 24, 2024
Via arXiv

Reducing Task Discrepancy of Text Encoders for Zero-Shot Composed Image Retrieval

Jun 13, 2024
Via arXiv

HYPE: Hyperbolic Entailment Filtering for Underspecified Images and Texts

Apr 26, 2024
Via arXiv

Language-only Efficient Training of Zero-shot Composed Image Retrieval

Dec 04, 2023
Via arXiv

Lifelong Audio-video Masked Autoencoder with Forget-robust Localized Alignments

Oct 12, 2023
Via arXiv

Computational Approaches for App-to-App Retrieval and Design Consistency Check

Sep 19, 2023
Via arXiv

What Do Self-Supervised Vision Transformers Learn?

May 01, 2023
Via arXiv

CompoDiff: Versatile Composed Image Retrieval With Latent Diffusion

Mar 21, 2023
Via arXiv