Graph kernels are a powerful tool for measuring the similarity between graphs. Most existing graph kernels focus on node labels or attributes and ignore the hierarchical structure of graphs. To exploit this hierarchical structure effectively, we propose a pyramid graph kernel based on optimal transport (OT). Each graph is embedded into the hierarchical structures of a pyramid, and the OT distance is then used to measure the similarity between graphs within these hierarchical structures. We also use the OT distance to measure the similarity between subgraphs and propose a subgraph kernel based on OT. Graph kernels based on the OT distance are not guaranteed to be positive semidefinite (p.s.d.). We therefore propose a regularized graph kernel based on OT, in which a kernel regularization term is added to the original OT distance to obtain a p.s.d. kernel matrix. We evaluate the proposed graph kernels on several benchmark classification tasks and compare their performance with existing state-of-the-art graph kernels. In most cases, the proposed graph kernels outperform the competing methods.
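To illustrate the general idea of turning an OT distance between graphs into a kernel value, the following is a minimal sketch and not the method proposed here. It assumes each graph is already represented as a small set of node embedding vectors with uniform weights, approximates the OT cost with entropic-regularized Sinkhorn iterations, and maps the cost to a similarity through exp(-gamma * d). The function names, the toy embeddings, and the parameters `eps` and `gamma` are all hypothetical; the pyramid embedding and the kernel regularization are not reproduced.

```python
import math


def sinkhorn_cost(X, Y, eps=0.5, n_iter=200):
    """Approximate OT cost between two point sets with uniform weights.

    Uses entropic-regularized Sinkhorn iterations (an approximation of the
    exact OT distance). Assumes costs are small enough relative to eps that
    exp(-cost / eps) does not underflow to zero.
    """
    n, m = len(X), len(Y)
    a = [1.0 / n] * n  # uniform mass on the nodes of the first graph
    b = [1.0 / m] * m  # uniform mass on the nodes of the second graph
    # Squared Euclidean ground cost between node embeddings.
    C = [[sum((p - q) ** 2 for p, q in zip(x, y)) for y in Y] for x in X]
    K = [[math.exp(-c / eps) for c in row] for row in C]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iter):
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    # Transport plan P[i][j] = u[i] * K[i][j] * v[j]; cost = sum of P * C.
    return sum(u[i] * K[i][j] * v[j] * C[i][j]
               for i in range(n) for j in range(m))


def ot_gram_matrix(graphs, gamma=1.0, eps=0.5):
    """Pairwise similarities exp(-gamma * d_OT); not guaranteed to be p.s.d."""
    n = len(graphs)
    G = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            d = sinkhorn_cost(graphs[i], graphs[j], eps=eps)
            G[i][j] = G[j][i] = math.exp(-gamma * d)
    return G


# Toy node embeddings (hypothetical): A and B are close, C is far from both.
graph_a = [[0.0, 0.0], [1.0, 0.0]]
graph_b = [[0.0, 0.1], [1.0, 0.1]]
graph_c = [[3.0, 3.0], [4.0, 3.0]]

gram = ot_gram_matrix([graph_a, graph_b, graph_c])
for row in gram:
    print(["%.4f" % value for value in row])
```

In this sketch the similarity between the two nearby graphs is much larger than the similarity between either of them and the distant graph. A matrix built this way can have negative eigenvalues, which is the issue the regularized kernel is meant to address.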