Anas Awadalla

BLIP3-KALE: Knowledge Augmented Large-Scale Dense Captions

Nov 12, 2024
Via arXiv

xGen-VideoSyn-1: High-fidelity Text-to-Video Synthesis with Compressed Representations

Aug 22, 2024
Via arXiv

xGen-MM (BLIP-3): A Family of Open Large Multimodal Models

Aug 16, 2024
Via arXiv

Certainly Uncertain: A Benchmark and Metric for Multimodal Epistemic and Aleatoric Awareness

Jul 02, 2024
Via arXiv

MINT-1T: Scaling Open-Source Multimodal Data by 10x: A Multimodal Dataset with One Trillion Tokens

Jun 17, 2024
Via arXiv

Catwalk: A Unified Language Model Evaluation Framework for Many Datasets

Dec 15, 2023
Via arXiv

VisIT-Bench: A Benchmark for Vision-Language Instruction Following Inspired by Real-World Use

Aug 12, 2023
Via arXiv

OpenFlamingo: An Open-Source Framework for Training Large Autoregressive Vision-Language Models

Aug 07, 2023
Via arXiv

Are aligned neural networks adversarially aligned?

Jun 26, 2023
Via arXiv

Multimodal C4: An Open, Billion-scale Corpus of Images Interleaved With Text

Apr 14, 2023
Via arXiv