Heiga Zen

Improving Dynamic Object Interactions in Text-to-Video Generation with AI Feedback

Dec 03, 2024

Geometric-Averaged Preference Optimization for Soft Preference Labels

Sep 10, 2024

FLEURS-R: A Restored Multilingual Speech Corpus for Generation Tasks

Aug 12, 2024

SimulTron: On-Device Simultaneous Speech to Speech Translation

Jun 04, 2024

Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

Mar 08, 2024

Extending Multilingual Speech Synthesis to 100+ Languages without Transcribed Data

Feb 29, 2024

SayTap: Language to Quadrupedal Locomotion

Jun 14, 2023

Translatotron 3: Speech to Speech Translation with Monolingual Data

Jun 01, 2023

LibriTTS-R: A Restored Multi-Speaker Text-to-Speech Corpus

May 30, 2023

Miipher: A Robust Speech Restoration Model Integrating Self-Supervised Speech and Text Representations

Mar 03, 2023