Federated learning refers to a distributed learning scenario in which users/nodes keep their data private but only share intermediate locally computed iterates with the master node. The master, in turn, shares a global aggregate of these iterates with all the nodes at each iteration. In this work, we consider a wireless federated learning scenario where the nodes communicate to and from the master node via a wireless channel. Current and upcoming technologies such as 5G (and beyond) will operate mostly in a non-orthogonal multiple access (NOMA) mode where transmissions from the users occupy the same bandwidth and interfere at the access point. These technologies naturally lend themselves to an "over-the-air" superposition whereby information received from the user nodes can be directly summed at the master node. However, over-the-air aggregation also means that the channel noise can corrupt the algorithm iterates at the time of aggregation at the master. This iteration noise introduces a novel set of challenges that have not been previously studied in the literature. It needs to be treated differently from the well-studied setting of noise or corruption in the dataset itself. In this work, we first study the subspace learning problem in a federated over-the-air setting. Subspace learning involves computing the subspace spanned by the top $r$ singular vectors of a given matrix. We develop a federated over-the-air version of the power method (FedPM) and show that its iterates converge as long as (i) the channel noise is very small compared to the $r$-th singular value of the matrix; and (ii) the ratio between its $(r+1)$-th and $r$-th singular value is smaller than a constant less than one. The second important contribution of this work is to show how over-the-air FedPM can be used to obtain a provably accurate federated solution for subspace tracking in the presence of missing data.