We cover how to determine a sufficiently large sample size for a $K$-armed randomized experiment in order to estimate conditional counterfactual expectations in data-driven subgroups. The sub-groups can be output by any feature space partitioning algorithm, including as defined by binning users having similar predictive scores or as defined by a learned policy tree. After carefully specifying the inference target, a minimum confidence level, and a maximum margin of error, the key is to turn the original goal into a simultaneous inference problem where the recommended sample size to offset an increased possibility of estimation error is directly related to the number of inferences to be conducted. Given a fixed sample size budget, our result allows us to invert the question to one about the feasible number of treatment arms or partition complexity (e.g. number of decision tree leaves). Using policy trees to learn sub-groups, we evaluate our nominal guarantees on a large publicly-available randomized experiment test data set.