Xun Yang

Visual-Oriented Fine-Grained Knowledge Editing for MultiModal Large Language Models

Nov 19, 2024
Via arXiv

Grounding is All You Need? Dual Temporal Grounding for Video Dialog

Oct 08, 2024
Via arXiv

Scene-Text Grounding for Text-Based Video Question Answering

Sep 22, 2024
Via arXiv

Dual-stream Feature Augmentation for Domain Generalization

Sep 07, 2024
Via arXiv

GRPose: Learning Graph Relations for Human Image Generation with Pose Priors

Aug 29, 2024
Via arXiv

Vulnerabilities in AI-generated Image Detection: The Challenge of Adversarial Attacks

Jul 30, 2024
Via arXiv

Advancing Prompt Learning through an External Layer

Jul 29, 2024
Via arXiv

Towards Scale-Aware Full Surround Monodepth with Transformers

Jul 15, 2024
Via arXiv

Boosting Adversarial Transferability for Skeleton-based Action Recognition via Exploring the Model Posterior Space

Jul 11, 2024
Via arXiv

TrAME: Trajectory-Anchored Multi-View Editing for Text-Guided 3D Gaussian Splatting Manipulation

Jul 02, 2024
Via arXiv