Jingdong Chen

ARGenSeg: Image Segmentation with Autoregressive Image Generation Model

Oct 23, 2025

PruneHal: Reducing Hallucinations in Multi-modal Large Language Models through Adaptive KV Cache Pruning

Oct 22, 2025

CasP: Improving Semi-Dense Feature Matching Pipeline Leveraging Cascaded Correspondence Priors for Guidance

Jul 23, 2025

VideoMAR: Autoregressive Video Generation with Continuous Tokens

Jun 18, 2025

Ming-Omni: A Unified Multimodal Model for Perception and Generation

Jun 11, 2025

Active-O3: Empowering Multimodal Large Language Models with Active Perception via GRPO

May 27, 2025

Ming-Lite-Uni: Advancements in Unified Architecture for Natural Multimodal Interaction

May 05, 2025

Determined blind source separation via modeling adjacent frequency band correlations in speech signals

Apr 05, 2025

Spatial-Filter-Bank-Based Neural Method for Multichannel Speech Enhancement

Apr 02, 2025

When Large Vision-Language Model Meets Large Remote Sensing Imagery: Coarse-to-Fine Text-Guided Token Pruning

Mar 10, 2025