Xindi Wu

ICONS: Influence Consensus for Vision-Language Data Selection

Jan 06, 2025

SWE-bench Multimodal: Do AI Systems Generalize to Visual Software Domains?

Oct 04, 2024

ConceptMix: A Compositional Image Generation Benchmark with Controllable Difficulty

Aug 26, 2024

CharXiv: Charting Gaps in Realistic Chart Understanding in Multimodal LLMs

Jun 26, 2024

Language Models as Science Tutors

Feb 16, 2024

Multimodal Dataset Distillation for Image-Text Retrieval

Aug 15, 2023

Pix2Map: Cross-modal Retrieval for Inferring Street Maps from Images

Jan 10, 2023

Toward Learning Robust and Invariant Representations with Alignment Regularization and Data Augmentation

Jun 04, 2022

Ego4D: Around the World in 3,000 Hours of Egocentric Video

Oct 13, 2021

Squared $\ell_2$ Norm as Consistency Loss for Leveraging Augmented Data to Learn Robust and Invariant Representations

Nov 25, 2020