Haoyu Zhang

ChatGPT Encounters Morphing Attack Detection: Zero-Shot MAD with Multi-Modal Large Language Models and General Vision Models

Mar 13, 2025
Via arXiv

TIME: Temporal-sensitive Multi-dimensional Instruction Tuning and Benchmarking for Video-LLMs

Mar 13, 2025
Via arXiv

Exo2Ego: Exocentric Knowledge Guided MLLM for Egocentric Video Understanding

Mar 12, 2025
Via arXiv

Towards Universal Learning-based Model for Cardiac Image Reconstruction: Summary of the CMRxRecon2024 Challenge

Mar 05, 2025
Via arXiv

Beyond In-Distribution Success: Scaling Curves of CoT Granularity for Language Model Generalization

Feb 25, 2025
Via arXiv

Unified Stochastic Framework for Neural Network Quantization and Pruning

Dec 24, 2024
Via arXiv

HammerBench: Fine-Grained Function-Calling Evaluation in Real Mobile Device Scenarios

Dec 21, 2024
Via arXiv

All-in-One: Transferring Vision Foundation Models into Stereo Matching

Dec 13, 2024
Via arXiv

Depth-PC: A Visual Servo Framework Integrated with Cross-Modality Fusion for Sim2Real Transfer

Nov 26, 2024
Via arXiv

Joint Vision-Language Social Bias Removal for CLIP

Nov 19, 2024
Via arXiv