Sizhou Chen

DualSpec: Text-to-spatial-audio Generation via Dual-Spectrogram Guided Diffusion Model

Feb 26, 2025
Via arXiv

Bridging the Gap between Text, Audio, Image, and Any Sequence: A Novel Approach using Gloss-based Annotation

Oct 04, 2024
Via arXiv

Echotune: A Modular Extractor Leveraging the Variable-Length Nature of Speech in ASR Tasks

Sep 14, 2023
Via arXiv