Abstract:Research has shown that personalization of health interventions can contribute to an improved effectiveness. Reinforcement learning algorithms can be used to perform such tailoring using data that is collected about users. Learning is however very fragile for health interventions as only limited time is available to learn from the user before disengagement takes place, or before the opportunity to intervene passes. In this paper, we present a cluster-based reinforcement learning approach which learns across groups of users. Such an approach can speed up the learning process while still giving a level of personalization. The clustering algorithm uses a distance metric over traces of states and rewards. We apply both online and batch learning to learn policies over the clusters and introduce a publicly available simulator which we have developed to evaluate the approach. The results show batch learning clearly outperforms online learning. Furthermore, clustering can be beneficial provided that a proper clustering is found.