Multi-view time series classification aims to fuse the distinctive temporal information from different views to enhance classification performance. Existing methods mainly focus on fusing multi-view features at an early stage (e.g., learning a common representation shared by multiple views). However, these early fusion methods may not fully exploit the view-specific distinctive patterns in high-dimensional time series data. Moreover, the intra-view and inter-view label correlations, which are critical for multi-view classification, are usually ignored in previous works. In this paper, we propose a Global-Local Correlative Channel-Aware Fusion (GLCCF) model to address these issues. Specifically, our model extracts global and local temporal patterns with a two-stream encoder, captures the intra-view and inter-view label correlations by constructing a graph-based correlation matrix, and extracts the cross-view global patterns via a learnable channel-aware late fusion mechanism, which can be effectively implemented with a convolutional neural network. Extensive experiments on two real-world datasets demonstrate the superiority of our approach over state-of-the-art methods. An ablation study is further provided to show the effectiveness of each model component.
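
To make the late fusion idea concrete, the following is a minimal sketch and not the GLCCF implementation. It assumes each view has already produced a class-probability vector, builds one label correlation matrix per ordered view pair as an outer product (an assumption; the abstract does not specify how the matrix is constructed), stacks these matrices as channels, and fuses them with per-channel weights, which is what a 1x1 convolution over channels computes. All function names, the fixed weights, and the row-sum readout are hypothetical; in a trained model the weights would be learned.

```python
from itertools import product


def outer(p, q):
    """Outer product of two class-probability vectors, giving a C x C matrix."""
    return [[pi * qj for qj in q] for pi in p]


def correlation_channels(view_probs):
    """Build one C x C channel per ordered view pair (u, v).

    u == v yields an intra-view channel, u != v an inter-view channel.
    Assumption: correlations are modeled as outer products of predictions.
    """
    n_views = len(view_probs)
    return [
        outer(view_probs[u], view_probs[v])
        for u, v in product(range(n_views), repeat=2)
    ]


def channel_fusion(channels, weights):
    """Weighted sum over channels, equivalent to a 1x1 convolution
    with a single output channel and no bias."""
    n_classes = len(channels[0])
    fused = [[0.0] * n_classes for _ in range(n_classes)]
    for w, channel in zip(weights, channels):
        for i in range(n_classes):
            for j in range(n_classes):
                fused[i][j] += w * channel[i][j]
    return fused


def class_scores(fused):
    """Hypothetical readout: sum each row of the fused matrix."""
    return [sum(row) for row in fused]


# Two views, three classes; the probabilities are illustrative only.
view_probs = [[0.7, 0.2, 0.1], [0.6, 0.3, 0.1]]
channels = correlation_channels(view_probs)
weights = [0.25] * len(channels)  # fixed here; learned in a real model
fused = channel_fusion(channels, weights)
scores = class_scores(fused)
predicted = max(range(len(scores)), key=scores.__getitem__)
```

With uniform weights and a row-sum readout the fusion reduces to averaging the per-view predictions; the point of learnable channel weights, followed by further convolutional layers, is to let the model weight intra-view and inter-view channels differently.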