Boyu Chen

Seeking Necessary and Sufficient Information from Multimodal Medical Data

Feb 27, 2026

Step 3.5 Flash: Open Frontier-Level Intelligence with 11B Active Parameters

Feb 11, 2026

Multimodal Generative Retrieval Model with Staged Pretraining for Food Delivery on Meituan

Feb 06, 2026

Super Encoding Network: Recursive Association of Multi-Modal Encoders for Video Understanding

Jun 09, 2025

VideoChat-A1: Thinking with Long Videos by Chain-of-Shot Reasoning

Jun 06, 2025

Unveiling Hidden Vulnerabilities in Digital Human Generation via Adversarial Attacks

Apr 24, 2025

Blend the Separated: Mixture of Synergistic Experts for Data-Scarcity Drug-Target Interaction Prediction

Mar 20, 2025

Minuscule Cell Detection in AS-OCT Images with Progressive Field-of-View Focusing

Mar 15, 2025

LVAgent: Long Video Understanding by Multi-Round Dynamical Collaboration of MLLM Agents

Mar 13, 2025

PathRAG: Pruning Graph-based Retrieval Augmented Generation with Relational Paths

Feb 18, 2025