Picture for Zhiyong Wu

Zhiyong Wu

TouchTTS: An Embarrassingly Simple TTS Framework that Everyone Can Touch

Add code
Dec 12, 2024
Viaarxiv icon

Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling

Add code
Dec 06, 2024
Viaarxiv icon

The Codec Language Model-based Zero-Shot Spontaneous Style TTS System for CoVoC Challenge 2024

Add code
Dec 02, 2024
Viaarxiv icon

OS-ATLAS: A Foundation Action Model for Generalist GUI Agents

Add code
Oct 30, 2024
Figure 1 for OS-ATLAS: A Foundation Action Model for Generalist GUI Agents
Figure 2 for OS-ATLAS: A Foundation Action Model for Generalist GUI Agents
Figure 3 for OS-ATLAS: A Foundation Action Model for Generalist GUI Agents
Figure 4 for OS-ATLAS: A Foundation Action Model for Generalist GUI Agents
Viaarxiv icon

AgentStore: Scalable Integration of Heterogeneous Agents As Specialized Generalist Computer Assistant

Add code
Oct 24, 2024
Viaarxiv icon

A Controlled Study on Long Context Extension and Generalization in LLMs

Add code
Sep 18, 2024
Figure 1 for A Controlled Study on Long Context Extension and Generalization in LLMs
Figure 2 for A Controlled Study on Long Context Extension and Generalization in LLMs
Figure 3 for A Controlled Study on Long Context Extension and Generalization in LLMs
Figure 4 for A Controlled Study on Long Context Extension and Generalization in LLMs
Viaarxiv icon

Rhythmic Foley: A Framework For Seamless Audio-Visual Alignment In Video-to-Audio Synthesis

Add code
Sep 13, 2024
Viaarxiv icon

RobustSVC: HuBERT-based Melody Extractor and Adversarial Learning for Robust Singing Voice Conversion

Add code
Sep 10, 2024
Figure 1 for RobustSVC: HuBERT-based Melody Extractor and Adversarial Learning for Robust Singing Voice Conversion
Figure 2 for RobustSVC: HuBERT-based Melody Extractor and Adversarial Learning for Robust Singing Voice Conversion
Figure 3 for RobustSVC: HuBERT-based Melody Extractor and Adversarial Learning for Robust Singing Voice Conversion
Figure 4 for RobustSVC: HuBERT-based Melody Extractor and Adversarial Learning for Robust Singing Voice Conversion
Viaarxiv icon

An End-to-End Approach for Chord-Conditioned Song Generation

Add code
Sep 10, 2024
Figure 1 for An End-to-End Approach for Chord-Conditioned Song Generation
Figure 2 for An End-to-End Approach for Chord-Conditioned Song Generation
Figure 3 for An End-to-End Approach for Chord-Conditioned Song Generation
Viaarxiv icon

SongCreator: Lyrics-based Universal Song Generation

Add code
Sep 09, 2024
Figure 1 for SongCreator: Lyrics-based Universal Song Generation
Figure 2 for SongCreator: Lyrics-based Universal Song Generation
Figure 3 for SongCreator: Lyrics-based Universal Song Generation
Figure 4 for SongCreator: Lyrics-based Universal Song Generation
Viaarxiv icon