Yu Gu

CSSinger: End-to-End Chunkwise Streaming Singing Voice Synthesis System Based on Conditional Variational Autoencoder

Dec 12, 2024

XKV: Personalized KV Cache Memory Reduction for Long-Context LLM Inference

Dec 08, 2024

Improving Accuracy and Generalization for Efficient Visual Tracking

Nov 28, 2024

Is Your LLM Secretly a World Model of the Internet? Model-Based Planning for Web Agents

Nov 10, 2024

Building A Coding Assistant via the Retrieval-Augmented Language Model

Oct 21, 2024

DurIAN-E 2: Duration Informed Attention Network with Adaptive Variational Autoencoder and Adversarial Learning for Expressive Text-to-Speech Synthesis

Oct 17, 2024

SiFiSinger: A High-Fidelity End-to-End Singing Voice Synthesizer based on Source-filter Model

Oct 16, 2024

MedImageInsight: An Open-Source Embedding Model for General Domain Medical Imaging

Oct 09, 2024

STA-V2A: Video-to-Audio Generation with Semantic and Temporal Alignment

Sep 13, 2024

LCM-SVC: Latent Diffusion Model Based Singing Voice Conversion with Inference Acceleration via Latent Consistency Distillation

Aug 22, 2024