Graph neural networks (GNNs) have been widely used to learn vector representation of graph-structured data and achieved better task performance than conventional methods. The foundation of GNNs is the message passing procedure, which propagates the information in a node to its neighbors. Since this procedure proceeds one step per layer, the scope of the information propagation among nodes is small in the early layers, and it expands toward the later layers. The problem here is that the model performances degrade as the number of layers increases. A potential cause is that deep GNN models tend to lose the nodes' local information, which would be essential for good model performances, through many message passing steps. To solve this so-called oversmoothing problem, we propose a multi-level attention pooling (MLAP) architecture. It has an attention pooling layer for each message passing step and computes the final graph representation by unifying the layer-wise graph representations. The MLAP architecture allows models to utilize the structural information of graphs with multiple levels of localities because it preserves layer-wise information before losing them due to oversmoothing. Results of our experiments show that the MLAP architecture improves deeper models' performance in graph classification tasks compared to the baseline architectures. In addition, analyses on the layer-wise graph representations suggest that MLAP has the potential to learn graph representations with improved class discriminability by aggregating information with multiple levels of localities.