Jianwei Yin

Video-QTR: Query-Driven Temporal Reasoning Framework for Lightweight Video Understanding

Dec 10, 2025 (via arXiv)

Walking the Schrödinger Bridge: A Direct Trajectory for Text-to-3D Generation

Nov 06, 2025 (via arXiv)

Geo-R1: Improving Few-Shot Geospatial Referring Expression Understanding with Reinforcement Fine-Tuning

Sep 26, 2025 (via arXiv)

TK-Mamba: Marrying KAN with Mamba for Text-Driven 3D Medical Image Segmentation

May 24, 2025 (via arXiv)

LightRouter: Towards Efficient LLM Collaboration with Minimal Overhead

May 22, 2025 (via arXiv)

Scalable Multi-Stage Influence Function for Large Language Models via Eigenvalue-Corrected Kronecker-Factored Parameterization

May 08, 2025 (via arXiv)

SRMF: A Data Augmentation and Multimodal Fusion Approach for Long-Tail UHR Satellite Image Segmentation

Apr 28, 2025 (via arXiv)

Distilling Transitional Pattern to Large Language Models for Multimodal Session-based Recommendation

Apr 13, 2025 (via arXiv)

ZeroED: Hybrid Zero-shot Error Detection through Large Language Model Reasoning

Apr 06, 2025 (via arXiv)

GeoRSMLLM: A Multimodal Large Language Model for Vision-Language Tasks in Geoscience and Remote Sensing

Mar 16, 2025 (via arXiv)