Abstract: Music scores are written representations of music and contain rich information about musical components. The visual information on a score includes notes, rests, staff lines, clefs, dynamics, and articulations, and it carries more semantic information than audio or symbolic representations of music. Previous music score datasets are limited in size and are designed mainly for optical music recognition (OMR); little work has addressed a large-scale benchmark dataset for music modeling and generation. In this work, we propose MusicScore, a large-scale music score dataset collected and processed from the International Music Score Library Project (IMSLP). MusicScore consists of image-text pairs, where the image is a page of a music score and the text is the metadata of the music. The metadata is extracted from the general information section of the IMSLP pages and includes the composer, instrument, piece style, and genre of each piece. MusicScore is curated at small, medium, and large scales of 400, 14k, and 200k image-text pairs, respectively, with varying diversity. To benchmark MusicScore for music score generation, we build a score generation system based on a UNet diffusion model that generates visually readable music scores conditioned on text descriptions. MusicScore is released to the public at https://huggingface.co/datasets/ZheqiDAI/MusicScore.
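To make the image-text pairing concrete, the following is a minimal sketch of how one pair might be represented and how its metadata could be flattened into a text condition. The class name `ScorePage`, its field names, the caption format, and the sample values are all assumptions for illustration; the abstract does not specify the dataset's actual schema.

```python
from dataclasses import dataclass


@dataclass
class ScorePage:
    """One image-text pair: a score page plus its metadata (hypothetical schema)."""
    image_path: str
    composer: str
    instrument: str
    piece_style: str
    genre: str

    def caption(self) -> str:
        # Flatten the metadata fields into one text string that a
        # text-conditioned generator could consume.
        return (
            f"composer: {self.composer}; instrument: {self.instrument}; "
            f"style: {self.piece_style}; genre: {self.genre}"
        )


# Placeholder values only; none of these come from the real dataset.
page = ScorePage(
    image_path="page_0001.png",
    composer="Example Composer",
    instrument="Piano",
    piece_style="Romantic",
    genre="Sonata",
)
print(page.caption())
```

Keeping the metadata as separate fields, and only joining them at caption time, would make it easy to condition on a subset of fields.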
Abstract: Swarm aerial robots must maintain close proximity to traverse narrow areas in cluttered environments. However, this movement is affected by the downwash generated by the other quadrotors in the swarm. This aerodynamic effect is highly nonlinear and hard to model with classical mathematical methods. In addition, the motor speeds of the quadrotors risk reaching their limits when resisting the effect. To solve these problems, we propose a trajectory-tracking approach that integrates a Neural network Downwash Predictor with Nonlinear Model Predictive Control (NDP-NMPC). The network is trained with spectral normalization to ensure robustness and safety in cases not covered by the collected data. The predicted disturbances are then incorporated into the NMPC optimization scheme, which handles constraints to keep the motor speeds within safe limits. We also design a quadrotor system, identify its parameters, and implement the proposed method onboard. Finally, we conduct an open-loop prediction experiment to verify the safety and effectiveness of the network, and a real-time closed-loop trajectory-tracking experiment that demonstrates a 75.37% reduction in height tracking error under the downwash effect.
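As a rough illustration of spectral normalization, the sketch below estimates the largest singular value of a weight matrix by power iteration and divides the matrix by it, so the resulting linear map has spectral norm close to 1. This is a generic pure-Python sketch, not the authors' training code; the function names and the example matrix are assumptions chosen for illustration.

```python
import math
import random


def matvec(W, v):
    # Multiply matrix W (list of rows) by vector v.
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def transpose(W):
    return [list(col) for col in zip(*W)]


def normalize(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def spectral_norm(W, n_iters=100, seed=0):
    """Estimate the largest singular value of W by power iteration."""
    rng = random.Random(seed)
    v = normalize([rng.gauss(0.0, 1.0) for _ in W[0]])
    Wt = transpose(W)
    for _ in range(n_iters):
        u = normalize(matvec(W, v))   # left singular vector estimate
        v = normalize(matvec(Wt, u))  # right singular vector estimate
    # With v of unit length, ||W v|| approximates the top singular value.
    return math.sqrt(sum(x * x for x in matvec(W, v)))


def spectrally_normalize(W, n_iters=100):
    """Rescale W so that its largest singular value is approximately 1."""
    sigma = spectral_norm(W, n_iters)
    return [[w / sigma for w in row] for row in W]


# Illustrative matrix with singular values 3 and 1.
W = [[3.0, 0.0], [0.0, 1.0]]
W_sn = spectrally_normalize(W)
print(spectral_norm(W), W_sn)
```

Bounding the spectral norm of each layer in this way bounds how much the layer can amplify a change in its input, which is one common reason for using the technique when a network must behave predictably on inputs outside its training data.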