Tomohiro Nakatani

Joint Enhancement and Classification using Coupled Diffusion Models of Signals and Logits

Feb 17, 2026
Via arXiv

Loose coupling of spectral and spatial models for multi-channel diarization and enhancement of meetings in dynamic environments

Jan 22, 2026
Via arXiv

Reference Microphone Selection for Guided Source Separation based on the Normalized L-p Norm

Oct 31, 2025
Via arXiv

MOVER: Combining Multiple Meeting Recognition Systems

Aug 07, 2025
Via arXiv

Description and Discussion on DCASE 2025 Challenge Task 4: Spatial Semantic Segmentation of Sound Scenes

Jun 12, 2025
Via arXiv

Microphone Array Signal Processing and Deep Learning for Speech Enhancement

Jan 13, 2025
Via arXiv

NTT Multi-Speaker ASR System for the DASR Task of CHiME-8 Challenge

Sep 09, 2024
Via arXiv

Interaural time difference loss for binaural target sound extraction

Aug 01, 2024
Via arXiv

Array Geometry-Robust Attention-Based Neural Beamformer for Moving Speakers

Feb 05, 2024
Via arXiv

Neural network-based virtual microphone estimation with virtual microphone and beamformer-level multi-task loss

Nov 20, 2023
Via arXiv