Maitreya Patel

Precision or Recall? An Analysis of Image Captions for Training Text-to-Image Generation Model

Nov 07, 2024
Via arXiv

TripletCLIP: Improving Compositional Reasoning of CLIP via Synthetic Vision-Language Negatives

Nov 04, 2024

Help Me Identify: Is an LLM+VQA System All We Need to Identify Visual Concepts?

Oct 17, 2024

$λ$-ECLIPSE: Multi-Concept Personalized Text-to-Image Diffusion Models by Leveraging CLIP Latent Space

Feb 07, 2024

ECLIPSE: A Resource-Efficient Text-to-Image Prior for Image Generations

Dec 07, 2023

WOUAF: Weight Modulation for User Attribution and Fingerprinting in Text-to-Image Diffusion Models

Jun 07, 2023

ConceptBed: Evaluating Concept Learning Abilities of Text-to-Image Diffusion Models

Jun 07, 2023

CRIPP-VQA: Counterfactual Reasoning about Implicit Physical Properties via Video Question Answering

Nov 07, 2022

Reasoning about Actions over Visual and Linguistic Modalities: A Survey

Jul 15, 2022

Benchmarking Generalization via In-Context Instructions on 1,600+ Language Tasks

Apr 16, 2022