Tatsuya Kawahara

Why Do We Laugh? Annotation and Taxonomy Generation for Laughable Contexts in Spontaneous Text Conversation

Jan 28, 2025

An LLM Benchmark for Addressee Recognition in Multi-modal Multi-party Dialogue

Jan 28, 2025

Human-Like Embodied AI Interviewer: Employing Android ERICA in Real International Conference

Dec 13, 2024

Yeah, Un, Oh: Continuous and Real-time Backchannel Prediction with Fine-tuning of Voice Activity Projection

Oct 21, 2024

Efficient and Robust Long-Form Speech Recognition with Hybrid H3-Conformer

Oct 05, 2024

Analysis and Detection of Differences in Spoken User Behaviors between Autonomous and Wizard-of-Oz Systems

Oct 04, 2024

Robotic Backchanneling in Online Conversation Facilitation: A Cross-Generational Study

Sep 25, 2024

Serialized Speech Information Guidance with Overlapped Encoding Separation for Multi-Speaker Automatic Speech Recognition

Sep 01, 2024

Benchmarking Japanese Speech Recognition on ASR-LLM Setups with Multi-Pass Augmented Generative Error Correction

Aug 29, 2024

StyEmp: Stylizing Empathetic Response Generation via Multi-Grained Prefix Encoder and Personality Reinforcement

Aug 05, 2024