In recent years, power grids have become a valuable target for cyber-attacks. The attacks on the Ukrainian power grid in particular have sparked numerous research efforts into possible attack vectors, their extent, and possible mitigations. However, many of these studies fail to consider realistic scenarios in which time series are incorporated into simulations to reflect the transient behaviour of independent generators and consumers. Moreover, very few consider the limited sensory input available to a potential attacker. In this paper, we describe a reactive power attack based on a well-understood scenario. We show that independent agents can learn to use the dynamics of the power grid against it, and that the attack works even when other generator and consumer nodes act independently.