Hisashi Kawai

WaveNeXt 2: ConvNeXt-Based Fast Neural Vocoders With Residual Denoising and Sub-Modeling for GAN and Diffusion Models

May 25, 2026
Via arXiv

Layer-wise Analysis for Quality of Multilingual Synthesized Speech

Sep 05, 2025
Via arXiv

Cross-modal Knowledge Transfer Learning as Graph Matching Based on Optimal Transport for ASR

May 19, 2025
Via arXiv

Retrieval-Augmented Speech Recognition Approach for Domain Challenges

Feb 21, 2025
Via arXiv

Temporal Order Preserved Optimal Transport-based Cross-modal Knowledge Transfer Learning for ASR

Sep 03, 2024
Via arXiv

Generative linguistic representation for spoken language identification

Dec 18, 2023
Via arXiv

Speaker Mask Transformer for Multi-talker Overlapped Speech Recognition

Dec 18, 2023
Via arXiv

Neural domain alignment for spoken language recognition based on optimal transport

Oct 20, 2023
Via arXiv

Hierarchical Cross-Modality Knowledge Transfer with Sinkhorn Attention for CTC-based ASR

Sep 28, 2023
Via arXiv

Cross-modal Alignment with Optimal Transport for CTC-based ASR

Sep 24, 2023
Via arXiv