Tomohiro Nakatani

NTT Multi-Speaker ASR System for the DASR Task of CHiME-8 Challenge

Sep 09, 2024

Interaural time difference loss for binaural target sound extraction

Aug 01, 2024

Array Geometry-Robust Attention-Based Neural Beamformer for Moving Speakers

Feb 05, 2024

Neural network-based virtual microphone estimation with virtual microphone and beamformer-level multi-task loss

Nov 20, 2023

Target Speech Extraction with Conditional Diffusion Model

Aug 17, 2023

Modified Parametric Multichannel Wiener Filter for Low-latency Enhancement of Speech Mixtures with Unknown Number of Speakers

Jun 29, 2023

NoisyILRMA: Diffuse-Noise-Aware Independent Low-Rank Matrix Analysis for Fast Blind Source Extraction

Jun 22, 2023

Multi-Stream Extension of Variational Bayesian HMM Clustering (MS-VBx) for Combined End-to-End and Vector Clustering-based Diarization

May 23, 2023

Mask-based Neural Beamforming for Moving Speakers with Self-Attention-based Tracking

May 07, 2022

Listen only to me! How well can target speech extraction handle false alarms?

Apr 11, 2022