Methods based on deep convolutional networks have brought major breakthroughs in image classification and provide an end-to-end solution to the handwritten Chinese character recognition (HCCR) problem by learning discriminative features automatically. Nevertheless, state-of-the-art CNNs incur a huge computation cost and require the storage of a large number of parameters, especially in the fully connected layers, which makes such networks difficult to deploy on alternative hardware devices with limited computational capacity. To solve the storage problem, we propose a novel technique called Global Weighted Average Pooling, which reduces the number of parameters in the fully connected layers without loss of accuracy. In addition, we implement a cascaded model within a single CNN by adding a mid-level output layer so that recognition can be completed as early as possible, which significantly reduces the average inference time. Experiments on the ICDAR-2013 offline HCCR dataset show that the proposed approach needs only 6.9 ms on average to classify a character image and achieves state-of-the-art accuracy of 97.1% while requiring only 3.3 MB of storage.
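
As a rough illustration of the two ideas named above, the following minimal Python sketch shows one plausible reading of them; it is not the implementation evaluated in the experiments. It assumes that the pooling layer keeps one learnable weight per channel and spatial position, and that the mid-level output triggers an early exit when its top softmax probability reaches a confidence threshold. The function names, the threshold value of 0.9, and the nested-list tensor layout are all hypothetical choices made for the example.

```python
import math


def global_weighted_average_pool(feature_maps, weights):
    """Collapse each HxW channel to a single scalar.

    Both arguments are nested lists shaped [channels][height][width].
    Assumption: one weight per channel and position; with every weight
    equal to 1 / (H * W) this reduces to plain global average pooling.
    """
    pooled = []
    for channel, channel_weights in zip(feature_maps, weights):
        total = 0.0
        for row, weight_row in zip(channel, channel_weights):
            for value, weight in zip(row, weight_row):
                total += value * weight
        pooled.append(total)
    return pooled


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    norm = sum(exps)
    return [e / norm for e in exps]


def argmax(values):
    """Index of the largest element."""
    return max(range(len(values)), key=values.__getitem__)


def cascaded_predict(mid_logits, final_logits_fn, threshold=0.9):
    """Return (class_index, exited_early).

    final_logits_fn is a hypothetical callable standing in for the
    remaining layers; it is only invoked when the mid-level output is
    not confident enough, which is where the time saving would come from.
    """
    mid_probs = softmax(mid_logits)
    best = argmax(mid_probs)
    if mid_probs[best] >= threshold:
        return best, True
    return argmax(softmax(final_logits_fn())), False
```

Under these assumptions, the pooled vector has one entry per channel, so a classifier placed after it needs C inputs rather than C × H × W, which is the kind of parameter reduction in the fully connected layers that the text describes. Easy samples return from the mid-level branch without running the later layers, so the average inference time depends on how often that branch is confident.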