Over-the-air federated learning (OTA-FL) unifies communication and model aggregation by leveraging the inherent superposition property of the wireless medium. This strategy can enable scalable and bandwidth-efficient learning through the simultaneous transmission of model updates over the same frequency resources, provided the physical layer is designed jointly with the learning task. In this paper, we consider a federated learning system operating over a heterogeneous edge-intelligent network, in which the edge users (clients) have differing resources and non-i.i.d. local dataset distributions, and the model training task has a general non-convex learning objective. We augment the network with Reconfigurable Intelligent Surfaces (RIS) to enhance the learning system, and we propose a cross-layer algorithm that jointly allocates communication, computation, and learning resources. In particular, we adaptively adjust the number of local steps in conjunction with the RIS configuration to boost learning performance. Our system model accounts for channel noise and channel estimation errors on both the uplink (model updates) and the downlink (global model broadcast), and employs dynamic power control on both. We provide convergence analyses for the proposed algorithms and extend the frameworks to personalized learning. Our experimental results demonstrate that the proposed algorithms outperform state-of-the-art joint communication and learning baselines.
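To make the superposition-based aggregation idea concrete, the following is a minimal numerical sketch of one OTA-FL round. It is not the algorithm proposed in the paper: it assumes a single-antenna setup without RIS, real-valued flat-fading channels, perfect synchronization, truncated channel-inversion power control, a simple least-squares objective, and illustrative constants, all of which are assumptions introduced here for illustration only. It shows how simultaneously transmitted, heterogeneous client updates (with client-specific numbers of local steps) add up over the channel and are rescaled at the server.

```python
# Minimal sketch of one over-the-air federated learning (OTA-FL) round.
# Assumptions (not from the paper): real-valued flat-fading channels,
# perfect synchronization, truncated channel-inversion power control,
# a simple least-squares objective, and illustrative constants.
import numpy as np

rng = np.random.default_rng(0)
K, d, n_local = 10, 5, 32          # clients, model dimension, samples per client

# Non-i.i.d. toy data: each client draws targets from a slightly shifted model.
true_w = rng.normal(size=d)
clients = []
for k in range(K):
    Xk = rng.normal(size=(n_local, d))
    yk = Xk @ (true_w + 0.3 * rng.normal(size=d))   # client-specific shift
    clients.append((Xk, yk))

def local_sgd(w, Xk, yk, steps, lr=0.05, batch=8):
    """Run `steps` local SGD steps on a least-squares loss; return the update."""
    w_local = w.copy()
    for _ in range(steps):
        idx = rng.choice(len(yk), size=batch, replace=False)
        grad = Xk[idx].T @ (Xk[idx] @ w_local - yk[idx]) / batch
        w_local -= lr * grad
    return w_local - w                               # model update (delta)

w_global = np.zeros(d)
local_steps = rng.integers(2, 8, size=K)             # heterogeneous local steps
for rnd in range(20):
    deltas = [local_sgd(w_global, Xk, yk, E)
              for (Xk, yk), E in zip(clients, local_steps)]

    # Over-the-air aggregation: all clients transmit at once and the channel
    # adds their analog signals. With truncated channel inversion, client k
    # pre-scales its update by alpha / h_k so the received superposition
    # approximates the sum of the active clients' updates.
    h = np.abs(rng.normal(1.0, 0.3, size=K))          # flat-fading channel gains
    alpha, h_min = 0.5, 0.2
    active = h >= h_min                               # clients in deep fade stay silent
    tx = sum((alpha / h[k]) * h[k] * deltas[k] for k in range(K) if active[k])
    noise = 0.01 * rng.normal(size=d)                 # uplink receiver noise
    y_rx = tx + noise

    # Server rescales the superimposed signal to recover the average update.
    w_global += y_rx / (alpha * max(active.sum(), 1))

print("distance to reference model:", np.linalg.norm(w_global - true_w))
```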