Hiroshi Saruwatari

DNN-based ensemble singing voice synthesis with interactions between singers

Sep 16, 2024

The T05 System for The VoiceMOS Challenge 2024: Transfer Learning from Deep Image Classifier to Naturalness MOS Prediction of High-Quality Synthetic Speech

Sep 14, 2024

Cross-Dialect Text-To-Speech in Pitch-Accent Language Incorporating Multi-Dialect Phoneme-Level BERT

Sep 11, 2024

BigCodec: Pushing the Limits of Low-Bitrate Neural Speech Codec

Sep 09, 2024

SaSLaW: Dialogue Speech Corpus with Audio-visual Egocentric Information Toward Environment-adaptive Dialogue Speech Synthesis

Aug 13, 2024

J-CHAT: Japanese Large-scale Spoken Dialogue Corpus for Spoken Dialogue Language Modeling

Jul 22, 2024

Spatial Voice Conversion: Voice Conversion Preserving Spatial Information and Non-target Signals

Jun 25, 2024

Noise-Robust Voice Conversion by Conditional Denoising Training Using Latent Variables of Recording Quality and Environment

Jun 11, 2024

SRC4VC: Smartphone-Recorded Corpus for Voice Conversion Benchmark

Jun 11, 2024

RALL-E: Robust Codec Language Modeling with Chain-of-Thought Prompting for Text-to-Speech Synthesis

Apr 06, 2024