Abstract: Social media have become a rich source of data, particularly in health research. Yet the use of such data raises significant ethical questions about the need for the informed consent of those being studied. Consent mechanisms, where they are used at all, are typically broad and inflexible, or place a significant burden on the participant. Machine learning algorithms show much promise for facilitating a 'middle ground' approach: using trained models to predict and automate granular consent decisions. Such techniques, however, raise a myriad of follow-on ethical and technical considerations. In this paper, we present an exploratory user study (n = 67) in which we find that we can predict the appropriate flow of health-related social media data with reasonable accuracy, while minimising undesired data leaks. We then attempt to deconstruct the findings of this study, identifying and discussing a number of real-world implications if such a technique were put into practice.