Jinsong Zhang

SpeechAct: Towards Generating Whole-body Motion from Speech

Nov 29, 2023
Via arXiv

High-Quality Animatable Dynamic Garment Reconstruction from Monocular Videos

Nov 02, 2023
Via arXiv

Towards Grouping in Large Scenes with Occlusion-aware Spatio-temporal Transformers

Oct 30, 2023
Via arXiv

Narrator: Towards Natural Control of Human-Scene Interaction Generation via Relationship Reasoning

Mar 16, 2023
Via arXiv

DiaASQ : A Benchmark of Conversational Aspect-based Sentiment Quadruple Analysis

Nov 20, 2022
Via arXiv

Text-Aware End-to-end Mispronunciation Detection and Diagnosis

Jun 15, 2022
Via arXiv

Advancing CTC-CRF Based End-to-End Speech Recognition with Wordpieces and Conformers

Jul 08, 2021
Via arXiv

Speech Enhancement using Separable Polling Attention and Global Layer Normalization followed with PReLU

May 06, 2021
Via arXiv

A Full Text-Dependent End to End Mispronunciation Detection and Diagnosis with Easy Data Augmentation Techniques

Apr 17, 2021
Via arXiv

PISE: Person Image Synthesis and Editing with Decoupled GAN

Mar 31, 2021
Via arXiv