Real-world face detection and alignment demand a strongly discriminative model to address the challenges posed by variations in pose, lighting, and expression. Inspired by advances in deep learning, several face detection and alignment methods based on convolutional neural networks (CNNs) have been proposed. Recent studies have exploited the relation between face detection and alignment to make models computationally efficient; however, they ignore the connections between the CNNs in the cascade. In this paper, we propose a structure that supplies higher-quality training data for end-to-end cascade network training, which gives the network more freedom to adjust its weight parameters automatically and accelerates convergence. Experiments demonstrate considerable improvement over existing detection and alignment models.