Yezhou Yang

Arizona State University

Enhanced Cooperative Perception Through Asynchronous Vehicle to Infrastructure Framework with Delay Mitigation for Connected and Automated Vehicles

Apr 10, 2025

TextInVision: Text and Prompt Complexity Driven Visual Text Generation Benchmark

Mar 17, 2025

Generative AI in Transportation Planning: A Survey

Mar 10, 2025

Biomedical Foundation Model: A Survey

Mar 03, 2025

Dual Caption Preference Optimization for Diffusion Models

Feb 09, 2025

Steering Rectified Flow Models in the Vector Field for Controlled Image Generation

Nov 27, 2024

Precision or Recall? An Analysis of Image Captions for Training Text-to-Image Generation Model

Nov 07, 2024

TripletCLIP: Improving Compositional Reasoning of CLIP via Synthetic Vision-Language Negatives

Nov 04, 2024

VL-GLUE: A Suite of Fundamental yet Challenging Visuo-Linguistic Reasoning Tasks

Oct 17, 2024

ActionCOMET: A Zero-shot Approach to Learn Image-specific Commonsense Concepts about Actions

Oct 17, 2024