This paper discusses experimental design for inference on and estimation of individualized treatment allocation rules in the presence of unknown interference, with units organized into large independent clusters. The contribution is two-fold. First, we design a short pilot study with few clusters to test whether baseline interventions are welfare-maximizing; rejection of this hypothesis motivates larger-scale experimentation. Second, we introduce an adaptive randomization procedure to estimate welfare-maximizing individualized treatment allocation rules that remain valid under unobserved interference. We propose non-parametric estimators of direct treatment effects and marginal spillover effects, which serve for hypothesis testing and policy design. We discuss the asymptotic properties of the estimators and small-sample regret guarantees for the estimated policy. Finally, we illustrate the method's advantages in simulations calibrated to an existing experiment on information diffusion.