Abhay Zala

Ctrl-Adapter: An Efficient and Versatile Framework for Adapting Diverse Controls to Any Diffusion Model

Apr 15, 2024
Via arXiv

EnvGen: Generating and Adapting Environments via LLMs for Training Embodied Agents

Mar 18, 2024
Via arXiv

DiagrammerGPT: Generating Open-Domain, Open-Platform Diagrams via LLM Planning

Oct 18, 2023
Via arXiv

VideoDirectorGPT: Consistent Multi-scene Video Generation via LLM-Guided Planning

Sep 26, 2023
Via arXiv

Visual Programming for Text-to-Image Generation and Evaluation

May 24, 2023
Via arXiv

Hierarchical Video-Moment Retrieval and Step-Captioning

Mar 29, 2023
Via arXiv

CoSIm: Commonsense Reasoning for Counterfactual Scene Imagination

Jul 08, 2022
Via arXiv

DALL-Eval: Probing the Reasoning Skills and Social Biases of Text-to-Image Generative Transformers

Feb 08, 2022
Via arXiv

FixMyPose: Pose Correctional Captioning and Retrieval

Apr 04, 2021
Via arXiv

ArraMon: A Joint Navigation-Assembly Instruction Interpretation Task in Dynamic Environments

Nov 15, 2020
Via arXiv