Extracting spatial-temporal features is a crucial research topic in transportation studies. Current methods typically rely on a fixed spatial graph and a unified temporal modeling mechanism. However, a fixed spatial graph restricts the extraction of spatial features between nodes that are similar but not directly connected, while a unified temporal modeling mechanism overlooks the heterogeneity of temporal variation across nodes. To address these challenges, a multi-view fusion neural network (MVFN) is proposed. In this approach, a graph convolutional network (GCN) extracts local spatial features, and a cosine re-weighting linear attention mechanism (CLA) extracts global spatial features. The GCN and CLA are combined into a graph-cosine module (GCM) that extracts the overall spatial features. For temporal modeling, a multi-channel separable temporal convolutional network (MSTCN) uses, at each layer, a multi-channel temporal convolutional network (MTCN) to extract unified temporal features and a separable temporal convolutional network (STCN) to extract independent temporal features. Finally, the extracted spatial-temporal features are fed into a prediction layer to obtain the final result. The model was validated on two traffic demand datasets, where it achieved the best prediction accuracy.
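To make the local/global split in the spatial branch concrete, the following is a minimal NumPy sketch, not the authors' implementation. It rests on several assumptions that the text above does not specify: the GCN uses the common symmetric normalization with self-loops; the CLA takes the form of a ReLU feature map with a cosine re-weighting over node indices, decomposed so that the cost stays linear in the number of nodes; and the two branches are fused by simple addition. All function and parameter names (`gcn_layer`, `cosine_linear_attention`, `graph_cosine_module`, `W_g`, `W_q`, `W_k`, `W_v`) are hypothetical.

```python
import numpy as np

def gcn_layer(A, X, W):
    """Local view: ReLU(D^-1/2 (A + I) D^-1/2 X W) on a fixed adjacency A."""
    A_hat = A + np.eye(A.shape[0])                 # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))  # degrees are >= 1
    A_norm = A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(A_norm @ X @ W, 0.0)

def cosine_linear_attention(Q, K, V, eps=1e-6):
    """Global view: attention weights relu(Q_i).relu(K_j) * cos(pi/2 * (i-j)/n).

    Assumed form. cos(a - b) = cos a cos b + sin a sin b lets the weights be
    applied without ever building the n x n attention matrix.
    """
    n = Q.shape[0]
    Q, K = np.maximum(Q, 0.0), np.maximum(K, 0.0)  # non-negative feature map
    theta = np.pi * np.arange(n) / (2 * n)         # per-node angle
    cos, sin = np.cos(theta)[:, None], np.sin(theta)[:, None]
    Qc, Qs, Kc, Ks = Q * cos, Q * sin, K * cos, K * sin
    num = Qc @ (Kc.T @ V) + Qs @ (Ks.T @ V)        # (n, d), linear in n
    den = Qc @ Kc.sum(axis=0) + Qs @ Ks.sum(axis=0)
    return num / (den[:, None] + eps)              # row-normalize

def graph_cosine_module(A, X, W_g, W_q, W_k, W_v):
    """Hypothetical fusion: sum of the local (GCN) and global (CLA) outputs."""
    local_feat = gcn_layer(A, X, W_g)
    global_feat = cosine_linear_attention(X @ W_q, X @ W_k, X @ W_v)
    return local_feat + global_feat

# Toy usage: 6 nodes on a ring, 3 input features, 4 output features.
rng = np.random.default_rng(0)
n, f, d = 6, 3, 4
A = np.zeros((n, n))
for i in range(n):
    A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1.0
X = rng.normal(size=(n, f))
W_g, W_q, W_k, W_v = (rng.normal(size=(f, d)) for _ in range(4))
H = graph_cosine_module(A, X, W_g, W_q, W_k, W_v)
print(H.shape)
```

In this sketch the GCN branch only mixes information along edges of `A`, whereas the attention branch lets every node attend to every other node, which is how features could be shared between similar but unconnected nodes. How the actual GCM weights or gates the two branches is not stated here, so the additive fusion should be read as a placeholder.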