Hiroshi Sato

Variable-Speed Teaching-Playback as Real-World Data Augmentation for Imitation Learning

Dec 04, 2024

Error-Feedback Model for Output Correction in Bilateral Control-Based Imitation Learning

Nov 19, 2024

Guided Speaker Embedding

Oct 16, 2024

Investigation of Speaker Representation for Target-Speaker Speech Processing

Oct 15, 2024

Alignment-Free Training for Transducer-based Multi-Talker ASR

Sep 30, 2024

Boosting Hybrid Autoregressive Transducer-based ASR with Internal Acoustic Model Training and Dual Blank Thresholding

Sep 30, 2024

NTT Multi-Speaker ASR System for the DASR Task of CHiME-8 Challenge

Sep 09, 2024

Recursive Attentive Pooling for Extracting Speaker Embeddings from Multi-Speaker Recordings

Aug 30, 2024

SpeakerBeam-SS: Real-time Target Speaker Extraction with Lightweight Conv-TasNet and State Space Modeling

Jul 01, 2024

Rethinking Processing Distortions: Disentangling the Impact of Speech Enhancement Errors on Speech Recognition Performance

Apr 23, 2024