Yu Gu

Building A Coding Assistant via the Retrieval-Augmented Language Model

Oct 21, 2024
Via arXiv

DurIAN-E 2: Duration Informed Attention Network with Adaptive Variational Autoencoder and Adversarial Learning for Expressive Text-to-Speech Synthesis

Oct 17, 2024
Via arXiv

SiFiSinger: A High-Fidelity End-to-End Singing Voice Synthesizer based on Source-filter Model

Oct 16, 2024
Via arXiv

MedImageInsight: An Open-Source Embedding Model for General Domain Medical Imaging

Oct 09, 2024
Via arXiv

STA-V2A: Video-to-Audio Generation with Semantic and Temporal Alignment

Sep 13, 2024
Via arXiv

LCM-SVC: Latent Diffusion Model Based Singing Voice Conversion with Inference Acceleration via Latent Consistency Distillation

Aug 22, 2024
Via arXiv

VisualAgentBench: Towards Large Multimodal Models as Visual Foundation Agents

Aug 12, 2024
Via arXiv

Cellular Plasticity Model for Bottom-Up Robotic Design

Aug 10, 2024
Via arXiv

Enhancing the Code Debugging Ability of LLMs via Communicative Agent Based Data Refinement

Aug 09, 2024
Via arXiv

Video-to-Audio Generation with Hidden Alignment

Jul 10, 2024
Via arXiv