Humans have the fascinating capacity to process non-verbal visual cues to understand and anticipate the actions of other humans. This "intention reading" ability is underpinned by shared motor repertoires and action models, which we use to interpret the intentions of others as if they were our own. We investigate how different non-verbal cues contribute to the legibility of human actions during interpersonal interactions. Our first contribution is a publicly available dataset with recordings of human body motion and eye gaze, acquired in an experimental scenario with an actor interacting with three subjects. Using these data, we conducted a human study to analyse the importance of the different non-verbal cues for action perception. As our second contribution, we used the motion/gaze recordings to build a computational model describing the interaction between two persons. As our third contribution, we embedded this model in the controller of an iCub humanoid robot and conducted a second human study, in the same scenario with the robot as the actor, to validate the model's "intention reading" capability. Our results show that it is possible to model the non-verbal signals exchanged by humans during interaction, and show how to incorporate such a mechanism in robotic systems with the twin goals of (i) being able to "read" human action intentions, and (ii) acting in a way that is legible to humans.