Mingrui Chen

Step-Audio-AQAA: a Fully End-to-End Expressive Large Audio Language Model

Jun 10, 2025
Via arXiv

HAD: Hybrid Architecture Distillation Outperforms Teacher in Genomic Sequence Modeling

May 27, 2025
Via arXiv

Unlocking the Potential of Difficulty Prior in RL-based Multimodal Reasoning

May 19, 2025
Via arXiv

The Binary and Ternary Quantization Can Improve Feature Discrimination

Apr 18, 2025
Via arXiv

Step-Audio: Unified Understanding and Generation in Intelligent Speech Interaction

Feb 18, 2025
Via arXiv

Semantic Equitable Clustering: A Simple, Fast and Effective Strategy for Vision Transformer

May 22, 2024
Via arXiv

Vision Transformer with Sparse Scan Prior

May 22, 2024
Via arXiv

RMT: Retentive Networks Meet Vision Transformers

Sep 20, 2023
Via arXiv

Occ$^2$Net: Robust Image Matching Based on 3D Occupancy Estimation for Occluded Regions

Aug 14, 2023
Via arXiv

ICDAR 2023 Competition on Structured Text Extraction from Visually-Rich Document Images

Jun 05, 2023
Via arXiv