Mingyuan Zhang

Distorted or Fabricated? A Survey on Hallucination in Video LLMs

Apr 14, 2026
Via arXiv

Bridging Semantic and Kinematic Conditions with Diffusion-based Discrete Motion Tokenizer

Mar 19, 2026
Via arXiv

InfiniteDance: Scalable 3D Dance Generation Towards in-the-wild Generalization

Mar 10, 2026
Via arXiv

NECromancer: Breathing Life into Skeletons via BVH Animation

Feb 06, 2026
Via arXiv

DiMo: Discrete Diffusion Modeling for Motion Generation and Understanding

Feb 04, 2026
Via arXiv

Rethinking Fine-Tuning: Unlocking Hidden Capabilities in Vision-Language Models

Dec 28, 2025
Via arXiv

SWiT-4D: Sliding-Window Transformer for Lossless and Parameter-Free Temporal 4D Generation

Dec 11, 2025
Via arXiv

MoCapAnything: Unified 3D Motion Capture for Arbitrary Skeletons from Monocular Videos

Dec 11, 2025
Via arXiv

Landmark Guided Visual Feature Extractor for Visual Speech Recognition with Limited Resource

Aug 10, 2025
Via arXiv

Semantics-Aware Human Motion Generation from Audio Instructions

May 29, 2025
Via arXiv