Jiebo Luo

Caption Anything in Video: Fine-grained Object-centric Captioning via Spatiotemporal Multimodal Prompting

Apr 09, 2025 (via arXiv)

Why Reasoning Matters? A Survey of Advancements in Multimodal Reasoning (v1)

Apr 04, 2025 (via arXiv)

HOIGen-1M: A Large-scale Dataset for Human-Object Interaction Video Generation

Mar 31, 2025 (via arXiv)

JavisDiT: Joint Audio-Video Diffusion Transformer with Hierarchical Spatio-Temporal Prior Synchronization

Mar 30, 2025 (via arXiv)

Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey

Mar 16, 2025 (via arXiv)

OmniPaint: Mastering Object-Oriented Editing via Disentangled Insertion-Removal Inpainting

Mar 12, 2025 (via arXiv)

QuoTA: Query-oriented Token Assignment via CoT Query Decouple for Long Video Comprehension

Mar 11, 2025 (via arXiv)

Code to Think, Think to Code: A Survey on Code-Enhanced Reasoning and Reasoning-Driven Code Intelligence in LLMs

Feb 26, 2025 (via arXiv)

Facilitating Long Context Understanding via Supervised Chain-of-Thought Reasoning

Feb 18, 2025 (via arXiv)

Can LLMs Simulate Social Media Engagement? A Study on Action-Guided Response Generation

Feb 17, 2025 (via arXiv)