Dong Yi

UniSurg: A Video-Native Foundation Model for Universal Understanding of Surgical Videos

Feb 05, 2026
Via arXiv

BREATH-VL: Vision-Language-Guided 6-DoF Bronchoscopy Localization via Semantic-Geometric Fusion

Jan 07, 2026
Via arXiv

Anatomy-R1: Enhancing Anatomy Reasoning in Multimodal Large Language Models via Anatomical Similarity Curriculum and Group Diversity Augmentation

Dec 24, 2025
Via arXiv

Multimodal Causal-Driven Representation Learning for Generalizable Medical Image Segmentation

Aug 07, 2025
Via arXiv

A Benchmark for Crime Surveillance Video Analysis with Large Models

Feb 13, 2025
Via arXiv

MME-Industry: A Cross-Industry Multimodal Evaluation Benchmark

Jan 28, 2025
Via arXiv

AnyDesign: Versatile Area Fashion Editing via Mask-Free Diffusion

Aug 21, 2024
Via arXiv

Recurrent Context Compression: Efficiently Expanding the Context Window of LLM

Jun 10, 2024
Via arXiv

PFDM: Parser-Free Virtual Try-on via Diffusion Model

Feb 05, 2024
Via arXiv

Large-scale Bisample Learning on ID vs. Spot Face Recognition

Jun 11, 2018
Via arXiv