Na Su

From Visuals to Vocabulary: Establishing Equivalence Between Image and Text Token Through Autoregressive Pre-training in MLLMs

Feb 13, 2025
Via arXiv

MM-Retinal V2: Transfer an Elite Knowledge Spark into Fundus Vision-Language Pretraining

Jan 27, 2025
Via arXiv

Memory-efficient High-resolution OCT Volume Synthesis with Cascaded Amortized Latent Diffusion Models

May 26, 2024
Via arXiv

Adjustable Robust Transformer for High Myopia Screening in Optical Coherence Tomography

Dec 12, 2023
Via arXiv