Yuhang Yang

TableGPT2: A Large Multimodal Model with Tabular Data Integration

Nov 04, 2024
Via arXiv

The Dawn of Video Generation: Preliminary Explorations with SORA-like Models

Oct 07, 2024
Via arXiv

Grounding 3D Scene Affordance From Egocentric Interactions

Sep 29, 2024
Via arXiv

EgoChoir: Capturing 3D Human-Object Interaction Regions from Egocentric Views

May 22, 2024
Via arXiv

LEMON: Learning 3D Human-Object Interaction Relation from 2D Images

Dec 14, 2023
Via arXiv

Adapting OpenAI's Whisper for Speech Recognition on Code-Switch Mandarin-English SEAME and ASRU2019 Datasets

Nov 29, 2023
Via arXiv

Suicidal Pedestrian: Generation of Safety-Critical Scenarios for Autonomous Vehicles

Sep 01, 2023
Via arXiv

Grounding 3D Object Affordance from 2D Interactions in Images

Mar 18, 2023
Via arXiv

Speech-Text Based Multi-Modal Training with Bidirectional Attention for Improved Speech Recognition

Nov 01, 2022
Via arXiv

Hierarchical Softmax for End-to-End Low-resource Multilingual Speech Recognition

Apr 08, 2022
Via arXiv