Zhihao Yuan

Generative Semantic Communication for Text-to-Speech Synthesis

Oct 04, 2024
Via arXiv

Instance-free Text to Point Cloud Localization with Relative Position Awareness

Apr 27, 2024
Via arXiv

GSmoothFace: Generalized Smooth Talking Face Generation via Fine Grained 3D Face Guidance

Dec 12, 2023
Via arXiv

Visual Programming for Zero-shot Open-Vocabulary 3D Visual Grounding

Nov 26, 2023
Via arXiv

Toward Explainable and Fine-Grained 3D Grounding through Referring Textual Phrases

Jul 05, 2022
Via arXiv

X-Trans2Cap: Cross-Modal Knowledge Transfer using Transformer for 3D Dense Captioning

Apr 06, 2022
Via arXiv

CLEVR3D: Compositional Language and Elementary Visual Reasoning for Question Answering in 3D Real-World Scenes

Dec 31, 2021
Via arXiv

InstanceRefer: Cooperative Holistic Understanding for Visual Grounding on Point Clouds through Instance Multi-level Contextual Referring

Mar 01, 2021
Via arXiv