Nicholas Moratelli

Positive-Augmented Contrastive Learning for Vision-and-Language Evaluation and Training

Oct 09, 2024
Via arXiv

Fluent and Accurate Image Captioning with a Self-Trained Reward Model

Aug 29, 2024
Via arXiv

Revisiting Image Captioning Training Paradigm via Direct CLIP-based Optimization

Aug 26, 2024
Via arXiv

Wiki-LLaVA: Hierarchical Retrieval-Augmented Generation for Multimodal LLMs

Apr 23, 2024
Via arXiv

The (R)Evolution of Multimodal Large Language Models: A Survey

Feb 19, 2024
Via arXiv