We present a novel vision-based control method that drives a team of ground mobile robots to a specified formation shape of unspecified size. Our approach uses multiple aerial control units equipped with downward-facing cameras, each observing a subset of the multirobot team. The units compute the control commands from the ground robots' image projections, using neither camera calibration nor scene scale information, and transmit them to the robots. The control strategy relies on the calculation of image similarity transformations, and we show it to be asymptotically stable if the overlaps between the subsets of controlled robots satisfy certain conditions. The presence of the supervisory units, which coordinate their motions to guarantee correct control performance, gives rise to a hybrid system topology. The proposed system offers practical advantages in simplicity and flexibility. Within the problem of controlling the shape of a team, our contribution lies in addressing several challenges simultaneously: the controller needs only partial information about the robotic group, does not use distance measurements or global reference frames, is designed for unicycle agents, and can accommodate topology changes. We present illustrative simulation results.
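To make the notion of an image similarity transformation concrete, the following minimal sketch fits a 2D similarity (rotation, uniform scale, translation) between a desired formation shape and the robots' observed image positions, and returns the image positions the robots would occupy in the best-fitting formation. It is an illustration under stated assumptions, not the controller described here: the least-squares criterion, the known robot-to-shape correspondence, and the function names `fit_similarity` and `desired_image_positions` are all hypothetical choices made for this example.

```python
import numpy as np

def fit_similarity(desired, observed):
    """Least-squares 2D similarity mapping `desired` onto `observed`.

    Both inputs are (N, 2) arrays with matching row order (assumed known
    correspondence). Points are treated as complex numbers, so the map is
    z -> a * z + b, where abs(a) is the scale and angle(a) the rotation.
    """
    desired = np.asarray(desired, dtype=float)
    observed = np.asarray(observed, dtype=float)
    c = desired[:, 0] + 1j * desired[:, 1]
    q = observed[:, 0] + 1j * observed[:, 1]
    # Center both point sets so translation drops out of the fit.
    c0 = c - c.mean()
    q0 = q - q.mean()
    # np.vdot conjugates its first argument: a = sum(conj(c0) * q0) / sum(|c0|^2).
    a = np.vdot(c0, q0) / np.vdot(c0, c0)
    b = q.mean() - a * c.mean()
    return a, b

def desired_image_positions(desired, observed):
    """Image positions of the desired shape after the best-fit similarity."""
    desired = np.asarray(desired, dtype=float)
    a, b = fit_similarity(desired, observed)
    c = desired[:, 0] + 1j * desired[:, 1]
    target = a * c + b
    return np.column_stack([target.real, target.imag])

if __name__ == "__main__":
    # Hypothetical data: a unit-square formation and four perturbed image points.
    shape = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
    image_pts = np.array([[10.0, 5.0], [14.0, 6.0], [13.0, 10.5], [9.0, 9.0]])
    targets = desired_image_positions(shape, image_pts)
    # Per-robot image-plane error that a controller could try to reduce.
    print(targets - image_pts)
```

Because the fit leaves scale and rotation free, the resulting targets define the formation shape only, which matches the idea of an unspecified formation size; no metric scale or calibration parameter appears anywhere in the computation.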