Zhiyong Wu

OS-ATLAS: A Foundation Action Model for Generalist GUI Agents

Oct 30, 2024
Via arXiv

AgentStore: Scalable Integration of Heterogeneous Agents As Specialized Generalist Computer Assistant

Oct 24, 2024
Via arXiv

A Controlled Study on Long Context Extension and Generalization in LLMs

Sep 18, 2024
Via arXiv

Rhythmic Foley: A Framework For Seamless Audio-Visual Alignment In Video-to-Audio Synthesis

Sep 13, 2024
Via arXiv

An End-to-End Approach for Chord-Conditioned Song Generation

Sep 10, 2024
Via arXiv

RobustSVC: HuBERT-based Melody Extractor and Adversarial Learning for Robust Singing Voice Conversion

Sep 10, 2024
Via arXiv

SongCreator: Lyrics-based Universal Song Generation

Sep 09, 2024
Via arXiv

Comparing Discrete and Continuous Space LLMs for Speech Recognition

Sep 01, 2024
Via arXiv

VoxInstruct: Expressive Human Instruction-to-Speech Generation with Unified Multilingual Codec Language Modelling

Aug 28, 2024
Via arXiv

Foundation Models for Music: A Survey

Aug 27, 2024
Via arXiv