Yudong Yang

An End-To-End Stuttering Detection Method Based On Conformer And BILSTM

Nov 14, 2024

Enhancing Multimodal LLM for Detailed and Accurate Video Captioning using Multi-Round Preference Optimization

Oct 09, 2024

Enabling Auditory Large Language Models for Automatic Speech Quality Evaluation

Sep 25, 2024

Extract and Diffuse: Latent Integration for Improved Diffusion-based Speech and Vocal Enhancement

Sep 15, 2024

Exploring Knowledge Transfer in Evolutionary Many-task Optimization: A Complex Network Perspective

Jul 12, 2024

Automatic Assessment of Dysarthria Using Audio-visual Vowel Graph Attention Network

May 07, 2024

An Audio-textual Diffusion Model For Converting Speech Signals Into Ultrasound Tongue Imaging Data

Mar 12, 2024