Network densification and millimeter-wave technologies are key enablers for meeting the capacity and data rate requirements of the fifth generation (5G) of mobile networks. In this context, designing low-complexity user association policies that rely only on local observations, yet adapt to the global network state and its dynamics, is a challenge. In fact, the frameworks proposed in the literature require continuous access to global network information and must recompute the association whenever the radio environment changes. Given the complexity of such an approach, these solutions are not well suited to dense 5G networks. In this paper, we address this issue by designing a scalable and flexible user association algorithm based on multi-agent reinforcement learning. In this approach, users act as independent agents that, based only on their local observations, learn to autonomously coordinate their actions in order to optimize the network sum-rate. Since there is no direct information exchange among the agents, the signaling overhead is also limited. Simulation results show that the proposed algorithm adapts to (fast) changes of the radio environment, providing large sum-rate gains compared to state-of-the-art solutions.
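For illustration only, the sketch below shows one common way such independent learners can be realized, with tabular Q-learning standing in as a generic multi-agent reinforcement learning scheme rather than the paper's actual algorithm; all names, the state encoding, and the parameter values are assumptions introduced for the example. Each user agent maps a quantized local signal-quality observation to a discrete state, selects a serving base station epsilon-greedily, and updates its own Q-table from a locally observed reward (its achieved rate, as a proxy for the network sum-rate).

```python
import numpy as np

# Minimal sketch (hypothetical names and parameters): one independent
# Q-learning agent per user, acting on local observations only.

N_USERS, N_BS, N_STATES = 4, 3, 8          # illustrative problem sizes
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1      # learning rate, discount, exploration

rng = np.random.default_rng(0)
q_tables = np.zeros((N_USERS, N_STATES, N_BS))   # one Q-table per agent


def local_state(user_rssi, n_states=N_STATES):
    """Quantize a user's local RSSI vector (one value per BS) into a state index."""
    best_bs = int(np.argmax(user_rssi))
    levels_per_bs = n_states // N_BS
    level = int(np.clip(user_rssi[best_bs] // 10, 0, levels_per_bs - 1))
    return best_bs * levels_per_bs + level


def choose_bs(user, state):
    """Epsilon-greedy base-station selection from the user's own Q-table."""
    if rng.random() < EPSILON:
        return int(rng.integers(N_BS))
    return int(np.argmax(q_tables[user, state]))


def update(user, state, action, reward, next_state):
    """Standard Q-learning update using only the user's local information."""
    td_target = reward + GAMMA * np.max(q_tables[user, next_state])
    q_tables[user, state, action] += ALPHA * (td_target - q_tables[user, state, action])
```

Because each agent updates its own table from purely local observations and rewards, no inter-agent message exchange is needed, which mirrors the low signaling overhead highlighted in the abstract.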