Xin Chen

University of California, Santa Barbara

Q-Mask: Query-driven Causal Masks for Text Anchoring in OCR-Oriented Vision-Language Models

Mar 31, 2026
Via arXiv

Project Imaging-X: A Survey of 1000+ Open-Access Medical Imaging Datasets for Foundation Model Development

Mar 29, 2026
Via arXiv

Intern-S1-Pro: Scientific Multimodal Foundation Model at Trillion Scale

Mar 26, 2026
Via arXiv

Neuron-Aware Data Selection In Instruction Tuning For Large Language Models

Mar 13, 2026
Via arXiv

Fish Audio S2 Technical Report

Mar 11, 2026
Via arXiv

MASQuant: Modality-Aware Smoothing Quantization for Multimodal Large Language Models

Mar 05, 2026
Via arXiv

Fusion4CA: Boosting 3D Object Detection via Comprehensive Image Exploitation

Mar 05, 2026
Via arXiv

Coordinated Semantic Alignment and Evidence Constraints for Retrieval-Augmented Generation with Large Language Models

Mar 04, 2026
Via arXiv

UETrack: A Unified and Efficient Framework for Single Object Tracking

Mar 03, 2026
Via arXiv

PseudoAct: Leveraging Pseudocode Synthesis for Flexible Planning and Action Control in Large Language Model Agents

Feb 27, 2026
Via arXiv