Jingkuan Song

SeMv-3D: Towards Semantic and Mutil-view Consistency simultaneously for General Text-to-3D Generation with Triplane Priors

Oct 10, 2024

BadCM: Invisible Backdoor Attack Against Cross-Modal Learning

Oct 03, 2024

One-step Noisy Label Mitigation

Oct 02, 2024

MMEvol: Empowering Multimodal Large Language Models with Evol-Instruct

Sep 09, 2024

VQ-Flow: Taming Normalizing Flows for Multi-Class Anomaly Detection via Hierarchical Vector Quantization

Sep 02, 2024

Any Target Can be Offense: Adversarial Example Generation via Generalized Latent Infection

Jul 17, 2024

Alleviating Hallucinations in Large Vision-Language Models through Hallucination-Induced Optimization

May 24, 2024

Text-Video Retrieval with Global-Local Semantic Consistent Learning

May 21, 2024

RoScenes: A Large-scale Multi-view 3D Dataset for Roadside Perception

May 17, 2024

EchoReel: Enhancing Action Generation of Existing Video Diffusion Models

Mar 18, 2024