Xunying Liu

Efficient Adapter Tuning for Joint Singing Voice Beat and Downbeat Tracking with Self-supervised Learning Features

Mar 13, 2025

Phone-purity Guided Discrete Tokens for Dysarthric Speech Recognition

Jan 08, 2025

Effective and Efficient Mixed Precision Quantization of Speech Foundation Models

Jan 07, 2025

Detecting Neurocognitive Disorders through Analyses of Topic Evolution and Cross-modal Consistency in Visual-Stimulated Narratives

Jan 07, 2025

Structured Speaker-Deficiency Adaptation of Foundation Models for Dysarthric and Elderly Speech Recognition

Dec 25, 2024

Improving Grapheme-to-Phoneme Conversion through In-Context Knowledge Retrieval with Large Language Models

Nov 12, 2024

Exploring SSL Discrete Speech Features for Zipformer-based Contextual ASR

Sep 13, 2024

Large Language Model Can Transcribe Speech in Multi-Talker Scenarios with Versatile Instructions

Sep 13, 2024

Exploring SSL Discrete Tokens for Multilingual ASR

Sep 13, 2024

Empowering Whisper as a Joint Multi-Talker and Target-Talker Speech Recognition System

Jul 13, 2024