Xu Jia

EvMic: Event-based Non-contact sound recovery from effective spatial-temporal modeling

Apr 03, 2025

Towards Physically Plausible Video Generation via VLM Planning

Mar 30, 2025

GeoRSMLLM: A Multimodal Large Language Model for Vision-Language Tasks in Geoscience and Remote Sensing

Mar 16, 2025

One Size doesn't Fit All: A Personalized Conversational Tutoring Agent for Mathematics Instruction

Feb 19, 2025

CineMaster: A 3D-Aware and Controllable Framework for Cinematic Text-to-Video Generation

Feb 12, 2025

Baichuan-Omni-1.5 Technical Report

Jan 26, 2025

ReNeg: Learning Negative Embedding with Reward Guidance

Dec 27, 2024

TINQ: Temporal Inconsistency Guided Blind Video Quality Assessment

Dec 25, 2024

MoTrans: Customized Motion Transfer with Text-driven Video Diffusion Models

Dec 02, 2024

OASIS: Open Agent Social Interaction Simulations with One Million Agents

Nov 26, 2024