Xiaojun Chang

Dense Audio-Visual Event Localization under Cross-Modal Consistency and Multi-Temporal Granularity Collaboration

Dec 17, 2024
Via arXiv

HC-LLM: Historical-Constrained Large Language Models for Radiology Report Generation

Dec 15, 2024
Via arXiv

RoomTour3D: Geometry-Aware Video-Instruction Tuning for Embodied Navigation

Dec 11, 2024
Via arXiv

GATE OpenING: A Comprehensive Benchmark for Judging Open-ended Interleaved Image-Text Generation

Dec 01, 2024
Via arXiv

Towards Open-Vocabulary Audio-Visual Event Localization

Nov 18, 2024
Via arXiv

StoryAgent: Customized Storytelling Video Generation via Multi-Agent Collaboration

Nov 07, 2024
Via arXiv

Dual Conditional Diffusion Models for Sequential Recommendation

Oct 29, 2024
Via arXiv

ContextDet: Temporal Action Detection with Adaptive Context Aggregation

Oct 20, 2024
Via arXiv

Sitcom-Crafter: A Plot-Driven Human Motion Generation System in 3D Scenes

Oct 14, 2024
Via arXiv

Flexiffusion: Segment-wise Neural Architecture Search for Flexible Denoising Schedule

Sep 26, 2024
Via arXiv