This paper proposes a cascading failure mitigation strategy based on Reinforcement Learning (RL) method. Firstly, the principles of RL are introduced. Then, the Multi-Stage Cascading Failure (MSCF) problem is presented and its challenges are investigated. The problem is then tackled by the RL based on DC-OPF (Optimal Power Flow). Designs of the key elements of the RL framework (rewards, states, etc.) are also discussed in detail. Experiments on the IEEE 118-bus system by both shallow and deep neural networks demonstrate promising results in terms of reduced system collapse rates.