As the number of Human-Centered Internet of Things (HCIoT) applications increases, the self-adaptation of their services and devices is becoming a fundamental requirement for addressing environmental uncertainty in decision-making processes. Self-adaptation in HCIoT aims to manage run-time changes in a dynamic environment and to adjust the functionality of IoT objects so that they achieve their desired goals during execution. SMASH is a semantic-enabled multi-agent system for self-adaptation of HCIoT that autonomously adapts IoT objects to the uncertainties of their environment. However, SMASH adapts IoT applications only according to the human values of users and does not take user behavior into account. This article presents Q-SMASH, a multi-agent reinforcement learning-based approach for self-adaptation of IoT objects in human-centered environments. Q-SMASH aims to learn the behaviors of users while respecting their human values. Its learning ability allows Q-SMASH to adapt to changes in user behavior and to make more accurate decisions in different states and situations.