
Wenguan Wang

Human-Object Interaction Detection Collaborated with Large Relation-driven Diffusion Models

Oct 26, 2024
Via arXiv

Scene Graph Generation with Role-Playing Large Language Models

Oct 20, 2024
Via arXiv

Vision-Language Navigation with Energy-Based Policy

Oct 18, 2024
Via arXiv

Hydra-SGG: Hybrid Relation Assignment for One-stage Scene Graph Generation

Sep 16, 2024
Via arXiv

Image Segmentation in Foundation Model Era: A Survey

Aug 23, 2024
Via arXiv

Navigation Instruction Generation with BEV Perception and Large Language Models

Jul 21, 2024
Via arXiv

Mutual Learning for Acoustic Matching and Dereverberation via Visual Scene-driven Diffusion

Jul 15, 2024
Via arXiv

Shape2Scene: 3D Scene Representation Learning Through Pre-training on Shape Data

Jul 14, 2024
Via arXiv

Nonverbal Interaction Detection

Jul 11, 2024
Via arXiv

Controllable Navigation Instruction Generation with Chain of Thought Prompting

Jul 10, 2024
Via arXiv