In this paper, we propose leveraging the active reconfigurable intelligence surface (RIS) to support reliable gradient aggregation for over-the-air computation (AirComp) enabled federated learning (FL) systems. An analysis of the FL convergence property reveals that minimizing gradient aggregation errors in each training round is crucial for narrowing the convergence gap. As such, we formulate an optimization problem, aiming to minimize these errors by jointly optimizing the transceiver design and RIS configuration. To handle the formulated highly non-convex problem, we devise a two-layer alternative optimization framework to decompose it into several convex subproblems, each solvable optimally. Simulation results demonstrate the superiority of the active RIS in reducing gradient aggregation errors compared to its passive counterpart.