Grasping has long been considered an important and practical task in robotic manipulation. Yet achieving robust and efficient grasps of diverse objects is challenging, since it involves gripper design, perception, control and learning, etc. Recent learning-based approaches have shown excellent performance in grasping a variety of novel objects. However, these methods either are typically limited to one single grasping mode, or else more end effectors are needed to grasp various objects. In addition, gripper design and learning methods are commonly developed separately, which may not adequately explore the ability of a multimodal gripper. In this paper, we present a deep reinforcement learning (DRL) framework to achieve multistage hybrid robotic grasping with a new soft multimodal gripper. A soft gripper with three grasping modes (i.e., \textit{enveloping}, \textit{sucking}, and \textit{enveloping\_then\_sucking}) can both deal with objects of different shapes and grasp more than one object simultaneously. We propose a novel hybrid grasping method integrated with the multimodal gripper to optimize the number of grasping actions. We evaluate the DRL framework under different scenarios (i.e., with different ratios of objects of two grasp types). The proposed algorithm is shown to reduce the number of grasping actions (i.e., enlarge the grasping efficiency, with maximum values of 161\% in simulations and 154\% in real-world experiments) compared to single grasping modes.