In recent years, there has been growing focus on the study of automated recommender systems. Music recommendation systems serve as a prominent domain for such works, both from an academic and a commercial perspective. A fundamental aspect of music perception is that music is experienced in temporal context and in sequence. In this work we present DJ-MC, a novel reinforcement-learning framework for music recommendation that does not recommend songs individually but rather song sequences, or playlists, based on a model of preferences for both songs and song transitions. The model is learned online and is uniquely adapted for each listener. To reduce exploration time, DJ-MC exploits user feedback to initialize a model, which it subsequently updates by reinforcement. We evaluate our framework with human participants using both real song and playlist data. Our results indicate that DJ-MC's ability to recommend sequences of songs provides a significant improvement over more straightforward approaches, which do not take transitions into account.