Yapeng Tian

Explainable AI-Generated Image Detection RewardBench

Nov 15, 2025
Via arXiv

Toward Gaze Target Detection of Young Autistic Children

Nov 14, 2025
Via arXiv

High-Quality Sound Separation Across Diverse Categories via Visually-Guided Generative Modeling

Sep 26, 2025
Via arXiv

ANNIE: Be Careful of Your Robots

Sep 03, 2025
Via arXiv

FreSca: Unveiling the Scaling Space in Diffusion Models

Apr 02, 2025
Via arXiv

Towards Online Multi-Modal Social Interaction Understanding

Mar 25, 2025
Via arXiv

PRVQL: Progressive Knowledge-guided Refinement for Robust Egocentric Visual Query Localization

Feb 11, 2025
Via arXiv

Joint Co-Speech Gesture and Expressive Talking Face Generation using Diffusion with Adapters

Dec 18, 2024
Via arXiv

Modality-Inconsistent Continual Learning of Multimodal Large Language Models

Dec 17, 2024
Via arXiv

VinTAGe: Joint Video and Text Conditioning for Holistic Audio Generation

Dec 14, 2024
Via arXiv