Tomoki Toda

Analysis and Extension of Noisy-target Training for Unsupervised Target Signal Enhancement

Mar 19, 2025
Via arXiv

Serenade: A Singing Style Conversion Framework Based On Audio Infilling

Mar 16, 2025
Via arXiv

Handling Domain Shifts for Anomalous Sound Detection: A Review of DCASE-Related Work

Mar 13, 2025
Via arXiv

Investigation of perceptual music similarity focusing on each instrumental part

Feb 04, 2025
Via arXiv

Wavehax: Aliasing-Free Neural Waveform Synthesis Based on 2D Convolution and Harmonic Prior for Reliable Complex Spectrogram Estimation

Nov 11, 2024
Via arXiv

MOS-Bench: Benchmarking Generalization Abilities of Subjective Speech Quality Assessment Models

Nov 06, 2024
Via arXiv

Improved Architecture for High-resolution Piano Transcription to Efficiently Capture Acoustic Characteristics of Music Signals

Sep 29, 2024
Via arXiv

Two-stage Framework for Robust Speech Emotion Recognition Using Target Speaker Extraction in Human Speech Noise Conditions

Sep 29, 2024
Via arXiv

Improvements of Discriminative Feature Space Training for Anomalous Sound Detection in Unlabeled Conditions

Sep 14, 2024
Via arXiv

The VoiceMOS Challenge 2024: Beyond Speech Quality Prediction

Sep 11, 2024
Via arXiv