Picture for Yoshiki Masuyama

Yoshiki Masuyama

Mel-Spectrogram Inversion via Alternating Direction Method of Multipliers

Add code
Jan 09, 2025
Viaarxiv icon

Mamba-based Decoder-Only Approach with Bidirectional Speech Modeling for Speech Recognition

Add code
Nov 11, 2024
Viaarxiv icon

ESPnet-Codec: Comprehensive Training and Evaluation of Neural Codecs for Audio, Music, and Speech

Add code
Sep 24, 2024
Viaarxiv icon

Exploring the Capability of Mamba in Speech Applications

Add code
Jun 24, 2024
Viaarxiv icon

NIIRF: Neural IIR Filter Field for HRTF Upsampling and Personalization

Add code
Feb 27, 2024
Viaarxiv icon

Scenario-Aware Audio-Visual TF-GridNet for Target Speech Extraction

Add code
Oct 30, 2023
Viaarxiv icon

Signal Reconstruction from Mel-spectrogram Based on Bi-level Consistency of Full-band Magnitude and Phase

Add code
Jul 23, 2023
Viaarxiv icon

Exploring the Integration of Speech Separation and Recognition with Self-Supervised Learning Representation

Add code
Jul 23, 2023
Viaarxiv icon

The CHiME-7 DASR Challenge: Distant Meeting Transcription with Multiple Devices in Diverse Scenarios

Add code
Jul 14, 2023
Figure 1 for The CHiME-7 DASR Challenge: Distant Meeting Transcription with Multiple Devices in Diverse Scenarios
Figure 2 for The CHiME-7 DASR Challenge: Distant Meeting Transcription with Multiple Devices in Diverse Scenarios
Figure 3 for The CHiME-7 DASR Challenge: Distant Meeting Transcription with Multiple Devices in Diverse Scenarios
Figure 4 for The CHiME-7 DASR Challenge: Distant Meeting Transcription with Multiple Devices in Diverse Scenarios
Viaarxiv icon

Neural Fast Full-Rank Spatial Covariance Analysis for Blind Source Separation

Add code
Jun 17, 2023
Viaarxiv icon