Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Yuxing Wang

SongDriver: Real-time Music Accompaniment Generation without Logical Latency nor Exposure Bias

Sep 13, 2022

Zihao Wang, Kejun Zhang, Yuxing Wang, Chen Zhang, Qihao Liang, Pengfei Yu, Yongsheng Feng, Wenbo Liu, Yikai Wang, Yuntai Bao(+1 more)

Figure 1 for SongDriver: Real-time Music Accompaniment Generation without Logical Latency nor Exposure Bias

Figure 2 for SongDriver: Real-time Music Accompaniment Generation without Logical Latency nor Exposure Bias

Figure 3 for SongDriver: Real-time Music Accompaniment Generation without Logical Latency nor Exposure Bias

Figure 4 for SongDriver: Real-time Music Accompaniment Generation without Logical Latency nor Exposure Bias

Abstract:Real-time music accompaniment generation has a wide range of applications in the music industry, such as music education and live performances. However, automatic real-time music accompaniment generation is still understudied and often faces a trade-off between logical latency and exposure bias. In this paper, we propose SongDriver, a real-time music accompaniment generation system without logical latency nor exposure bias. Specifically, SongDriver divides one accompaniment generation task into two phases: 1) The arrangement phase, where a Transformer model first arranges chords for input melodies in real-time, and caches the chords for the next phase instead of playing them out. 2) The prediction phase, where a CRF model generates playable multi-track accompaniments for the coming melodies based on previously cached chords. With this two-phase strategy, SongDriver directly generates the accompaniment for the upcoming melody, achieving zero logical latency. Furthermore, when predicting chords for a timestep, SongDriver refers to the cached chords from the first phase rather than its previous predictions, which avoids the exposure bias problem. Since the input length is often constrained under real-time conditions, another potential problem is the loss of long-term sequential information. To make up for this disadvantage, we extract four musical features from a long-term music piece before the current time step as global information. In the experiment, we train SongDriver on some open-source datasets and an original \`aiSong Dataset built from Chinese-style modern pop music scores. The results show that SongDriver outperforms existing SOTA (state-of-the-art) models on both objective and subjective metrics, meanwhile significantly reducing the physical latency.

* *Both Zihao Wang and Qihao Liang contribute equally to the paper and share the co-first authorship. This paper has been accepted by ACM Multimedia 2022 for oral presentation

Via

Access Paper or Ask Questions

What Dense Graph Do You Need for Self-Attention?

Jun 09, 2022

Yuxing Wang, Chu-Tak Lee, Qipeng Guo, Zhangyue Yin, Yunhua Zhou, Xuanjing Huang, Xipeng Qiu

Figure 1 for What Dense Graph Do You Need for Self-Attention?

Figure 2 for What Dense Graph Do You Need for Self-Attention?

Figure 3 for What Dense Graph Do You Need for Self-Attention?

Figure 4 for What Dense Graph Do You Need for Self-Attention?

Abstract:Transformers have made progress in miscellaneous tasks, but suffer from quadratic computational and memory complexities. Recent works propose sparse Transformers with attention on sparse graphs to reduce complexity and remain strong performance. While effective, the crucial parts of how dense a graph needs to be to perform well are not fully explored. In this paper, we propose Normalized Information Payload (NIP), a graph scoring function measuring information transfer on graph, which provides an analysis tool for trade-offs between performance and complexity. Guided by this theoretical analysis, we present Hypercube Transformer, a sparse Transformer that models token interactions in a hypercube and shows comparable or even better results with vanilla Transformer while yielding $O(N\log N)$ complexity with sequence length $N$. Experiments on tasks requiring various sequence lengths lay validation for our graph function well.

* Accepted by ICML 2022. Code is available at https://github.com/yxzwang/Normalized-Information-Payload

Via

Access Paper or Ask Questions

A Surrogate-Assisted Controller for Expensive Evolutionary Reinforcement Learning

Jan 01, 2022

Yuxing Wang, Tiantian Zhang, Yongzhe Chang, Bin Liang, Xueqian Wang, Bo Yuan

Figure 1 for A Surrogate-Assisted Controller for Expensive Evolutionary Reinforcement Learning

Figure 2 for A Surrogate-Assisted Controller for Expensive Evolutionary Reinforcement Learning

Figure 3 for A Surrogate-Assisted Controller for Expensive Evolutionary Reinforcement Learning

Figure 4 for A Surrogate-Assisted Controller for Expensive Evolutionary Reinforcement Learning

Abstract:The integration of Reinforcement Learning (RL) and Evolutionary Algorithms (EAs) aims at simultaneously exploiting the sample efficiency as well as the diversity and robustness of the two paradigms. Recently, hybrid learning frameworks based on this principle have achieved great success in various challenging robot control tasks. However, in these methods, policies from the genetic population are evaluated via interactions with the real environments, limiting their applicability in computationally expensive problems. In this work, we propose Surrogate-assisted Controller (SC), a novel and efficient module that can be integrated into existing frameworks to alleviate the computational burden of EAs by partially replacing the expensive policy evaluation. The key challenge in applying this module is to prevent the optimization process from being misled by the possible false minima introduced by the surrogate. To address this issue, we present two strategies for SC to control the workflow of hybrid frameworks. Experiments on six continuous control tasks from the OpenAI Gym platform show that SC can not only significantly reduce the cost of fitness evaluations, but also boost the performance of the original hybrid frameworks with collaborative learning and evolutionary processes.

Via

Access Paper or Ask Questions