Ge Zhu

Presto! Distilling Steps and Layers for Accelerating Music Generation

Oct 07, 2024
Via arXiv

Style-Talker: Finetuning Audio Language Model and Style-Based Text-to-Speech Model for Fast Spoken Dialogue Generation

Aug 13, 2024
Via arXiv

MusicHiFi: Fast High-Fidelity Stereo Vocoding

Mar 20, 2024
Via arXiv

Cacophony: An Improved Contrastive Audio-Text Model

Feb 10, 2024
Via arXiv

EDMSound: Spectrogram Based Diffusion Models for Efficient and High-Quality Audio Synthesis

Nov 18, 2023
Via arXiv

Transcription free filler word detection with Neural semi-CRFs

Mar 11, 2023
Via arXiv

Sharp Eyes: A Salient Object Detector Working The Same Way as Human Visual Characteristics

Jan 18, 2023
Via arXiv

Music Source Separation with Generative Flow

Apr 26, 2022
Via arXiv

Filler Word Detection and Classification: A Dataset and Benchmark

Mar 28, 2022
Via arXiv

A Probabilistic Fusion Framework for Spoofing Aware Speaker Verification

Mar 02, 2022
Via arXiv