Yezhou Yang

Arizona State University

Steering Rectified Flow Models in the Vector Field for Controlled Image Generation

Nov 27, 2024
Via arXiv

Precision or Recall? An Analysis of Image Captions for Training Text-to-Image Generation Model

Nov 07, 2024
Via arXiv

TripletCLIP: Improving Compositional Reasoning of CLIP via Synthetic Vision-Language Negatives

Nov 04, 2024
Via arXiv

VL-GLUE: A Suite of Fundamental yet Challenging Visuo-Linguistic Reasoning Tasks

Oct 17, 2024
Via arXiv

Help Me Identify: Is an LLM+VQA System All We Need to Identify Visual Concepts?

Oct 17, 2024
Via arXiv

ActionCOMET: A Zero-shot Approach to Learn Image-specific Commonsense Concepts about Actions

Oct 17, 2024
Via arXiv

TROPE: TRaining-Free Object-Part Enhancement for Seamlessly Improving Fine-Grained Zero-Shot Image Captioning

Sep 30, 2024
Via arXiv

Latent Space Energy-based Neural ODEs

Sep 05, 2024
Via arXiv

Roundabout Dilemma Zone Data Mining and Forecasting with Trajectory Prediction and Graph Neural Networks

Sep 01, 2024
Via arXiv

Recent Event Camera Innovations: A Survey

Aug 27, 2024
Via arXiv