Bin Li

Member, IEEE

Multi-modal Multi-platform Person Re-Identification: Benchmark and Method

Mar 21, 2025
Via arXiv

UMIT: Unifying Medical Imaging Tasks via Vision-Language Models

Mar 20, 2025
Via arXiv

DRoPE: Directional Rotary Position Embedding for Efficient Agent Interaction Modeling

Mar 19, 2025
Via arXiv

Robot Skin with Touch and Bend Sensing using Electrical Impedance Tomography

Mar 17, 2025
Via arXiv

Proxy-Tuning: Tailoring Multimodal Autoregressive Models for Subject-Driven Image Generation

Mar 13, 2025
Via arXiv

VLForgery Face Triad: Detection, Localization and Attribution via Multimodal Large Language Models

Mar 08, 2025
Via arXiv

EVE: Towards End-to-End Video Subtitle Extraction with Vision-Language Models

Mar 06, 2025
Via arXiv

Small but Mighty: Enhancing Time Series Forecasting with Lightweight LLMs

Mar 05, 2025
Via arXiv

DLF: Extreme Image Compression with Dual-generative Latent Fusion

Mar 03, 2025
Via arXiv

Towards Practical Real-Time Neural Video Compression

Feb 28, 2025
Via arXiv