Enrico Fini

Scaling Laws for Native Multimodal Models

Apr 10, 2025

On Large Multimodal Models as Open-World Image Classifiers

Mar 27, 2025

FlexTok: Resampling Images into 1D Token Sequences of Flexible Length

Feb 19, 2025

Multimodal Autoregressive Pre-training of Large Vision Encoders

Nov 21, 2024

Retrieval-enriched zero-shot image classification in low-resource domains

Nov 01, 2024

Visual Scratchpads: Enabling Global Reasoning in Vision

Oct 10, 2024

Automatic benchmarking of large multimodal models via iterative experiment programming

Jun 18, 2024

Vocabulary-free Image Classification and Semantic Segmentation

Apr 16, 2024

Continual Contrastive Spoken Language Understanding

Oct 04, 2023

Semi-supervised learning made simple with self-supervised clustering

Jun 13, 2023