Yuxuan Cai

FLUX-Reason-6M & PRISM-Bench: A Million-Scale Text-to-Image Reasoning Dataset and Comprehensive Benchmark

Sep 11, 2025
Via arXiv

Building Self-Evolving Agents via Experience-Driven Lifelong Learning: A Framework and Benchmark

Aug 26, 2025
Via arXiv

Omni-AdaVideoRAG: Omni-Contextual Adaptive Retrieval-Augmented for Efficient Long Video Understanding

Jun 16, 2025
Via arXiv

UltraVideo: High-Quality UHD Video Dataset with Comprehensive Captions

Jun 16, 2025
Via arXiv

Task-Core Memory Management and Consolidation for Long-term Continual Learning

May 15, 2025
Via arXiv

Omni-AD: Learning to Reconstruct Global and Local Features for Multi-class Anomaly Detection

Mar 27, 2025
Via arXiv

Towards Robust and Reliable Concept Representations: Reliability-Enhanced Concept Embedding Model

Feb 03, 2025
Via arXiv

Long Video Diffusion Generation with Segmented Cross-Attention and Content-Rich Video Data Curation

Dec 02, 2024
Via arXiv

Fleximo: Towards Flexible Text-to-Human Motion Video Generation

Nov 29, 2024
Via arXiv

Improving Multi-Subject Consistency in Open-Domain Image Generation with Isolation and Reposition Attention

Nov 28, 2024
Via arXiv