Abstract: Social media have become a rich source of data, particularly in health research. Yet, the use of such data raises significant ethical questions about the need for the informed consent of those being studied. Consent mechanisms, where they are obtained at all, are typically broad and inflexible, or place a significant burden on the participant. Machine learning algorithms show much promise for facilitating a 'middle ground' approach: using trained models to predict and automate granular consent decisions. Such techniques, however, raise a myriad of follow-on ethical and technical considerations. In this paper, we present an exploratory user study (n = 67) in which we find that we can predict the appropriate flow of health-related social media data with reasonable accuracy, while minimising undesired data leaks. We then attempt to deconstruct the findings of this study, identifying and discussing a number of real-world implications if such a technique were put into practice.
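The idea of predicting granular consent decisions from a participant's past choices can be sketched in a few lines. The feature names, contexts, and deny-by-default rule below are illustrative assumptions, not the study's actual model; the deny-on-uncertainty behaviour mirrors the abstract's emphasis on minimising undesired data leaks.

```python
# Hypothetical sketch: learning per-context consent decisions from history.
# Contexts, categories, and the tie-breaking rule are illustrative assumptions.
from collections import Counter

# Each record: ((data_type, recipient), decision) from a participant's history.
history = [
    (("fitness", "friends"), "allow"),
    (("fitness", "public"), "deny"),
    (("diagnosis", "friends"), "deny"),
    (("diagnosis", "clinician"), "allow"),
    (("fitness", "friends"), "allow"),
]

def train(history):
    """Tally past decisions per (data_type, recipient) context."""
    model = {}
    for ctx, decision in history:
        model.setdefault(ctx, Counter())[decision] += 1
    return model

def predict(model, ctx):
    """Default to 'deny' for unseen or ambiguous contexts, erring on the
    side of preventing undesired data leaks."""
    counts = model.get(ctx)
    if not counts or counts["allow"] <= counts["deny"]:
        return "deny"
    return "allow"

model = train(history)
print(predict(model, ("fitness", "friends")))   # seen, allow-dominated context
print(predict(model, ("diagnosis", "public")))  # unseen context: deny
```

A real system would use richer features and a trained classifier, but the asymmetric default (deny when uncertain) is the key design choice for leak minimisation.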
Abstract: Demand is growing for more accountability in the technological systems that increasingly occupy our world. However, the complexity of many of these systems - often systems of systems - poses accountability challenges. This is because the details and nature of the data flows that interconnect and drive systems, which often occur across technical and organisational boundaries, tend to be opaque. This paper argues that data provenance methods show much promise as a technical means for increasing the transparency of these interconnected systems. Given concerns with the ever-increasing levels of automated and algorithmic decision-making, we make the case for decision provenance. This involves exposing the 'decision pipeline' by tracking the chain of inputs to, and flow-on effects from, the decisions and actions taken within these systems. This paper proposes decision provenance as a means to assist in raising levels of accountability, discusses relevant legal conceptions, and indicates some practical considerations for moving forward.
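The notion of a 'decision pipeline' can be illustrated as a chain of provenance records, where each decision logs the data and upstream decisions that fed it. The entities and fields below are illustrative assumptions, not a scheme proposed by the paper; the point is that accountability questions ("what fed into this outcome?") become answerable by walking the chain.

```python
# Hypothetical sketch of decision provenance: each record notes its inputs,
# so the full chain behind a decision can be traced across system boundaries.
# Record names and fields are illustrative assumptions.
from dataclasses import dataclass, field

@dataclass
class Record:
    id: str
    kind: str                                   # "data" or "decision"
    inputs: list = field(default_factory=list)  # ids of upstream records

def lineage(store, rid, seen=None):
    """Recursively collect every record that fed into the given decision."""
    seen = set() if seen is None else seen
    for parent in store[rid].inputs:
        if parent not in seen:
            seen.add(parent)
            lineage(store, parent, seen)
    return seen

store = {
    "sensor_reading": Record("sensor_reading", "data"),
    "profile":        Record("profile", "data"),
    "risk_score":     Record("risk_score", "decision",
                             ["sensor_reading", "profile"]),
    "final_decision": Record("final_decision", "decision", ["risk_score"]),
}

print(sorted(lineage(store, "final_decision")))
# → ['profile', 'risk_score', 'sensor_reading']
```

Tracing forward from a record (its flow-on effects) is the same walk over the inverted graph; in practice such records would be captured automatically at system boundaries rather than hand-built.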