Shuang Chen

Vision-DeepResearch Benchmark: Rethinking Visual and Textual Search for Multimodal Large Language Models

Feb 02, 2026
Via arXiv

Exploring Reasoning Reward Model for Agents

Jan 29, 2026
Via arXiv

Vision-DeepResearch: Incentivizing DeepResearch Capability in Multimodal Large Language Models

Jan 29, 2026
Via arXiv

Innovator-VL: A Multimodal Large Language Model for Scientific Discovery

Jan 27, 2026
Via arXiv

Democratizing planetary-scale analysis: An ultra-lightweight Earth embedding database for accurate and flexible global land monitoring

Jan 16, 2026
Via arXiv

Wetland mapping from sparse annotations with satellite image time series and temporal-aware segment anything model

Jan 16, 2026
Via arXiv

IPCV: Information-Preserving Compression for MLLM Visual Encoders

Dec 21, 2025
Via arXiv

Harli: SLO-Aware Co-location of LLM Inference and PEFT-based Finetuning on Model-as-a-Service Platforms

Nov 19, 2025
Via arXiv

BcQLM: Efficient Vision-Language Understanding with Distilled Q-Gated Cross-Modal Fusion

Sep 10, 2025
Via arXiv

Interleaving Reasoning for Better Text-to-Image Generation

Sep 09, 2025
Via arXiv