Music listening preferences at a given time depend on a wide range of contextual factors, such as the user's emotional state, their location and activity at listening time, the day of the week, and the time of day. It is therefore of great importance to take these factors into account when recommending music. However, developing context-aware recommender systems that consider them is very difficult, both because some factors, such as emotional state, are hard to detect, and because including many factors introduces drawbacks of its own, such as sparsity problems in contextual pre-filtering. This work proposes a method for detecting the user's contextual state while listening to music, based on the social tags of music items. Because social tagging intrinsically describes items along multiple dimensions, it can be exploited to capture many contextual dimensions of a listening session. The embeddings of the tags of the first items played in each session are used to represent the context of that session. Recommendations are then generated from both the user's preferences and item similarities computed from the tag embeddings. Although social tags have been used extensively in recommender systems, to our knowledge they have rarely been used to dynamically infer contextual states.
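
As a rough illustration of this idea, the following Python sketch averages the tag embeddings of a session's first items into a context vector, then blends cosine similarity to that vector with long-term user preference scores to rank candidates. The toy embeddings, item catalogue, hypothetical function names, and the blend weight `alpha` are all illustrative assumptions, not the actual model, data, or parameters used in this work.

```python
import numpy as np

# Toy tag embeddings; in practice these would be learned vectors
# (e.g. trained on tag co-occurrence), not hand-written values.
TAG_EMB = {
    "chill":      np.array([0.9, 0.1, 0.0]),
    "workout":    np.array([0.0, 0.9, 0.3]),
    "acoustic":   np.array([0.8, 0.0, 0.2]),
    "electronic": np.array([0.1, 0.8, 0.5]),
}

# Hypothetical item catalogue mapping each track to its social tags.
ITEM_TAGS = {
    "track_a": ["chill", "acoustic"],
    "track_b": ["workout", "electronic"],
    "track_c": ["chill", "electronic"],
}

def item_vector(item: str) -> np.ndarray:
    """Represent an item as the mean of its tag embeddings."""
    return np.mean([TAG_EMB[t] for t in ITEM_TAGS[item]], axis=0)

def session_context(first_items: list) -> np.ndarray:
    """Context vector: mean embedding of the first items played in a session."""
    return np.mean([item_vector(i) for i in first_items], axis=0)

def cosine(u: np.ndarray, v: np.ndarray) -> float:
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def recommend(first_items: list, user_pref: dict, alpha: float = 0.5) -> list:
    """Rank candidates by a blend of long-term preference and contextual
    similarity; the linear blend and alpha=0.5 are assumptions."""
    ctx = session_context(first_items)
    candidates = [i for i in ITEM_TAGS if i not in first_items]
    score = {i: alpha * user_pref.get(i, 0.0)
                + (1 - alpha) * cosine(ctx, item_vector(i))
             for i in candidates}
    return sorted(score, key=score.get, reverse=True)

# Example: a session opening with a chill acoustic track favours track_c,
# whose tags place it closer to the inferred context than track_b.
print(recommend(["track_a"], {"track_b": 0.2, "track_c": 0.6}))
```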