The recent state-of-the-art deep learning methods have significantly improved brain tumor segmentation. However, fully supervised training requires a large amount of manually labeled masks, which is highly time-consuming and needs domain expertise. Weakly supervised learning with scribbles provides a good trade-off between model accuracy and the effort of manual labeling. However, for segmenting the hierarchical brain tumor structures, manually labeling scribbles for each substructure could still be demanding. In this paper, we use only two kinds of weak labels, i.e., scribbles on whole tumor and healthy brain tissue, and global labels for the presence of each substructure, to train a deep learning model to segment all the sub-regions. Specifically, we train two networks in two phases: first, we only use whole tumor scribbles to train a whole tumor (WT) segmentation network, which roughly recovers the WT mask of training data; then we cluster the WT region with the guide of global labels. The rough substructure segmentation from clustering is used as weak labels to train the second network. The dense CRF loss is used to refine the weakly supervised segmentation. We evaluate our approach on the BraTS2017 dataset and achieve competitive WT dice score as well as comparable scores on substructure segmentation compared to an upper bound when trained with fully annotated masks.