Yoshiki Masuyama

Mamba-based Decoder-Only Approach with Bidirectional Speech Modeling for Speech Recognition

Nov 11, 2024
Via arXiv

ESPnet-Codec: Comprehensive Training and Evaluation of Neural Codecs for Audio, Music, and Speech

Sep 24, 2024
Via arXiv

Exploring the Capability of Mamba in Speech Applications

Jun 24, 2024
Via arXiv

NIIRF: Neural IIR Filter Field for HRTF Upsampling and Personalization

Feb 27, 2024
Via arXiv

Scenario-Aware Audio-Visual TF-GridNet for Target Speech Extraction

Oct 30, 2023
Via arXiv

Signal Reconstruction from Mel-spectrogram Based on Bi-level Consistency of Full-band Magnitude and Phase

Jul 23, 2023
Via arXiv

Exploring the Integration of Speech Separation and Recognition with Self-Supervised Learning Representation

Jul 23, 2023
Via arXiv

The CHiME-7 DASR Challenge: Distant Meeting Transcription with Multiple Devices in Diverse Scenarios

Jul 14, 2023
Via arXiv

Neural Fast Full-Rank Spatial Covariance Analysis for Blind Source Separation

Jun 17, 2023
Via arXiv

Multi-Channel Target Speaker Extraction with Refinement: The WavLab Submission to the Second Clarity Enhancement Challenge

Feb 15, 2023
Via arXiv