Evaluating human-robot social interactions rigorously is notoriously difficult. Studies are conducted either in labs, where constrained protocols allow for robust measurements and a degree of replicability at the cost of ecological validity, or in the wild, which yields greater experimental realism, but often at the expense of replicability and rigorous interaction metrics. We introduce a novel interaction paradigm designed to elicit rich and varied social interactions while retaining desirable scientific properties: replicability, clear metrics, and the possibility of either autonomous or Wizard-of-Oz robot behaviours. The paradigm focuses on child-robot interactions and builds on a sandboxed free-play environment. We present the rationale and design of the interaction paradigm, its methodological and technical aspects (including the open-source implementation of the software platform), and two large open datasets acquired with this paradigm, which are meant to act as experimental baselines for future research.