Kai Yu

Sherman

Unified Pathological Speech Analysis with Prompt Tuning

Nov 05, 2024

Fast and High-Quality Auto-Regressive Speech Synthesis via Speculative Decoding

Oct 29, 2024

A Survey on Speech Large Language Models

Oct 24, 2024

LSCodec: Low-Bitrate and Speaker-Decoupled Discrete Speech Codec

Oct 21, 2024

MobA: A Two-Level Agent System for Efficient Mobile Task Automation

Oct 17, 2024

Converging to a Lingua Franca: Evolution of Linguistic Regions and Semantics Alignment in Multilingual Large Language Models

Oct 15, 2024

SLAM-AAC: Enhancing Audio Captioning with Paraphrasing Augmentation and CLAP-Refine through LLMs

Oct 12, 2024

F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching

Oct 09, 2024

AlignSum: Data Pyramid Hierarchical Fine-tuning for Aligning with Human Summarization Preference

Oct 01, 2024

TRANSAGENT: An LLM-Based Multi-Agent System for Code Translation

Sep 30, 2024