This work applies reinforcement learning to improve the performance of a highly dynamic hopping system with a parallel mechanism. Unlike serial mechanisms, parallel mechanisms cannot be accurately simulated because of the complexity of their kinematic constraints and closed-loop structures. In addition, learning to hop is hampered by the prolonged aerial phase and by the sparse nature of the rewards. To address these challenges, we propose a learning framework that encodes long-history feedback to account for the under-actuation caused by the prolonged aerial phase. The framework also introduces a simplified serial configuration of the parallel design, which avoids simulating the parallel structure directly during training. A torque-level conversion between the serial and parallel configurations is designed to handle the resulting sim-to-real issue. Simulation and hardware experiments have been conducted to validate the framework.