Tatsuya Kawahara

Human-Like Embodied AI Interviewer: Employing Android ERICA in Real International Conference

Dec 13, 2024

Yeah, Un, Oh: Continuous and Real-time Backchannel Prediction with Fine-tuning of Voice Activity Projection

Oct 21, 2024

Efficient and Robust Long-Form Speech Recognition with Hybrid H3-Conformer

Oct 05, 2024

Analysis and Detection of Differences in Spoken User Behaviors between Autonomous and Wizard-of-Oz Systems

Oct 04, 2024

Robotic Backchanneling in Online Conversation Facilitation: A Cross-Generational Study

Sep 25, 2024

Serialized Speech Information Guidance with Overlapped Encoding Separation for Multi-Speaker Automatic Speech Recognition

Sep 01, 2024

Benchmarking Japanese Speech Recognition on ASR-LLM Setups with Multi-Pass Augmented Generative Error Correction

Aug 29, 2024

StyEmp: Stylizing Empathetic Response Generation via Multi-Grained Prefix Encoder and Personality Reinforcement

Aug 05, 2024

Multilingual Turn-taking Prediction Using Voice Activity Projection

Mar 14, 2024

Investigation of Adapter for Automatic Speech Recognition in Noisy Environment

Feb 29, 2024