Yoshiki Masuyama

ESPnet-SpeechLM: An Open Speech Language Model Toolkit

Feb 21, 2025
Via arXiv

Mel-Spectrogram Inversion via Alternating Direction Method of Multipliers

Jan 09, 2025
Via arXiv

Mamba-based Decoder-Only Approach with Bidirectional Speech Modeling for Speech Recognition

Nov 11, 2024
Via arXiv

ESPnet-Codec: Comprehensive Training and Evaluation of Neural Codecs for Audio, Music, and Speech

Sep 24, 2024
Via arXiv

Exploring the Capability of Mamba in Speech Applications

Jun 24, 2024
Via arXiv

NIIRF: Neural IIR Filter Field for HRTF Upsampling and Personalization

Feb 27, 2024
Via arXiv

Scenario-Aware Audio-Visual TF-GridNet for Target Speech Extraction

Oct 30, 2023
Via arXiv

Signal Reconstruction from Mel-spectrogram Based on Bi-level Consistency of Full-band Magnitude and Phase

Jul 23, 2023
Via arXiv

Exploring the Integration of Speech Separation and Recognition with Self-Supervised Learning Representation

Jul 23, 2023
Via arXiv

The CHiME-7 DASR Challenge: Distant Meeting Transcription with Multiple Devices in Diverse Scenarios

Jul 14, 2023
Via arXiv