Unmanned aerial vehicle (UAV) swarms must exploit machine learning (ML) to execute tasks ranging from coordinated trajectory planning to cooperative target recognition. However, because the connection between a UAV swarm and ground base stations (BSs) is not continuous, centralized ML is challenging, particularly when a large volume of data is involved. In this paper, a novel framework is proposed to implement distributed federated learning (FL) algorithms within a UAV swarm that consists of a leading UAV and several following UAVs. Each following UAV trains a local FL model on its collected data and sends the trained local model to the leading UAV, which aggregates the received models, generates a global FL model, and transmits it back to the followers over the intra-swarm network. To identify how wireless factors, such as fading, transmission delay, and UAV antenna angle deviations resulting from wind and mechanical vibrations, affect the performance of FL, a rigorous convergence analysis for FL is performed. Then, a joint power allocation and scheduling design is proposed to optimize the convergence rate of FL while accounting for the energy consumed during convergence and the delay requirement imposed by the swarm's control system. Simulation results validate the FL convergence analysis and show that the joint design can reduce the number of communication rounds needed for convergence by as much as 35% compared with the baseline design.
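To make the leader-follower training loop concrete, the following is a minimal sketch of one way such a loop could look. It is an illustration under stated assumptions, not the framework's actual algorithm: the aggregation rule is assumed to be a sample-count-weighted average (FedAvg-style), the local model is a toy linear regressor, and the names `local_update`, `aggregate`, and `run_rounds` are hypothetical. Fading, transmission delay, antenna angle deviation, power allocation, and scheduling are not modeled here; every local model is assumed to reach the leader in every round.

```python
from typing import List, Sequence, Tuple

Model = List[float]                  # toy parameter vector [w, b] for y = w*x + b
Dataset = Sequence[Tuple[float, float]]


def local_update(model: Model, data: Dataset,
                 lr: float = 0.05, epochs: int = 5) -> Model:
    """Follower side: a few gradient-descent steps on the local mean squared error."""
    w, b = model
    n = len(data)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return [w, b]


def aggregate(models: Sequence[Model], sizes: Sequence[int]) -> Model:
    """Leader side: average the local models, weighted by local sample count (assumed rule)."""
    total = sum(sizes)
    dim = len(models[0])
    return [sum(m[i] * s for m, s in zip(models, sizes)) / total
            for i in range(dim)]


def run_rounds(follower_data: Sequence[Dataset], rounds: int = 200) -> Model:
    """Repeat: broadcast the global model, train locally, aggregate at the leader."""
    global_model: Model = [0.0, 0.0]
    sizes = [len(d) for d in follower_data]
    for _ in range(rounds):
        local_models = [local_update(global_model, d) for d in follower_data]
        global_model = aggregate(local_models, sizes)
    return global_model


if __name__ == "__main__":
    # Two hypothetical followers observing the same relation y = 2x + 1.
    follower_a = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]
    follower_b = [(-1.0, -1.0), (0.5, 2.0), (1.5, 4.0)]
    print(run_rounds([follower_a, follower_b]))
```

In this sketch, each call to `aggregate` corresponds to one communication round. A lossy intra-swarm link could be imitated by dropping some entries of `local_models` before aggregation, which is the kind of effect the convergence analysis is concerned with.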