Marco Bertini

Media Integration and Communication Center, UNIFI, Department of Information Engineering

ComiCap: A VLMs pipeline for dense captioning of Comic Panels

Sep 24, 2024

Garment Attribute Manipulation with Multi-level Attention

Sep 16, 2024

One missing piece in Vision and Language: A Survey on Comics Understanding

Sep 14, 2024

Prompt and Prejudice

Aug 07, 2024

CoMix: A Comprehensive Benchmark for Multi-Task Comic Understanding

Jul 04, 2024

Comics Datasets Framework: Mix of Comics datasets for detection benchmarking

Jul 03, 2024

Improving Zero-shot Generalization of Learned Prompts via Unsupervised Knowledge Distillation

Jul 03, 2024

iSEARLE: Improving Textual Inversion for Zero-Shot Composed Image Retrieval

May 05, 2024

Multimodal-Conditioned Latent Diffusion Models for Fashion Image Editing

Mar 25, 2024

Quality-Aware Image-Text Alignment for Real-World Image Quality Assessment

Mar 17, 2024