Jian-Fang Hu

PVUW 2025 Challenge Report: Advances in Pixel-level Understanding of Complex Videos in the Wild

Apr 15, 2025

Modeling Multiple Normal Action Representations for Error Detection in Procedural Tasks

Apr 02, 2025

ReferDINO-Plus: 2nd Solution for 4th PVUW MeViS Challenge at CVPR 2025

Mar 30, 2025

Progressive Human Motion Generation Based on Text and Few Motion Frames

Mar 17, 2025

ViSpeak: Visual Instruction Feedback in Streaming Videos

Mar 17, 2025

AdvAD: Exploring Non-Parametric Diffusion for Imperceptible Adversarial Attacks

Mar 12, 2025

ReferDINO: Referring Video Object Segmentation with Visual Grounding Foundations

Jan 24, 2025

SAUGE: Taming SAM for Uncertainty-Aligned Multi-Granularity Edge Detection

Dec 17, 2024

TechCoach: Towards Technical Keypoint-Aware Descriptive Action Coaching

Nov 26, 2024

SynopGround: A Large-Scale Dataset for Multi-Paragraph Video Grounding from TV Dramas and Synopses

Aug 07, 2024