Fan Zhang

University of Bristol

EndoChat: Grounded Multimodal Large Language Model for Endoscopic Surgery

Jan 20, 2025

UltraFusion: Ultra High Dynamic Imaging using Exposure Fusion

Jan 20, 2025

X-LeBench: A Benchmark for Extremely Long Egocentric Video Understanding

Jan 12, 2025

GeoPix: Multi-Modal Large Language Model for Pixel-level Image Understanding in Remote Sensing

Jan 12, 2025

FlipedRAG: Black-Box Opinion Manipulation Attacks to Retrieval-Augmented Generation of Large Language Models

Jan 06, 2025

Artificial Intelligence in Creative Industries: Advances Prior to 2025

Jan 06, 2025

SGTC: Semantic-Guided Triplet Co-training for Sparsely Annotated Semi-Supervised Medical Image Segmentation

Dec 20, 2024

Efficient Quantization-Aware Training on Segment Anything Model in Medical Images and Its Deployment

Dec 15, 2024

Continuous Gaussian Process Pre-Optimization for Asynchronous Event-Inertial Odometry

Dec 12, 2024

Efficient Gravitational Wave Parameter Estimation via Knowledge Distillation: A ResNet1D-IAF Approach

Dec 11, 2024