Balanced order batching problem (BOBP) arises from the process of warehouse picking in Cainiao, the largest logistics platform in China. Batching orders together in the picking process to form a single picking route, reduces travel distance. The reason for its importance is that order picking is a labor intensive process and, by using good batching methods, substantial savings can be obtained. The BOBP is a NP-hard combinational optimization problem and designing a good problem-specific heuristic under the quasi-real-time system response requirement is non-trivial. In this paper, rather than designing heuristics, we propose an end-to-end learning and optimization framework named Balanced Task-orientated Graph Clustering Network (BTOGCN) to solve the BOBP by reducing it to balanced graph clustering optimization problem. In BTOGCN, a task-oriented estimator network is introduced to guide the type-aware heterogeneous graph clustering networks to find a better clustering result related to the BOBP objective. Through comprehensive experiments on single-graph and multi-graphs, we show: 1) our balanced task-oriented graph clustering network can directly utilize the guidance of target signal and outperforms the two-stage deep embedding and deep clustering method; 2) our method obtains an average 4.57m and 0.13m picking distance ("m" is the abbreviation of the meter (the SI base unit of length)) reduction than the expert-designed algorithm on single and multi-graph set and has a good generalization ability to apply in practical scenario.