Recognizing text in the wild is a really challenging task because of complex backgrounds, various illuminations and diverse distortions, even with deep neural networks (convolutional neural networks and recurrent neural networks). In the end-to-end training procedure for scene text recognition, the outputs of deep neural networks at different iterations are always demonstrated with diversity and complementarity for the target object (text). Here, a simple but effective deep learning method, an adaptive ensemble of deep neural networks (AdaDNNs), is proposed to simply select and adaptively combine classifier components at different iterations from the whole learning system. Furthermore, the ensemble is formulated as a Bayesian framework for classifier weighting and combination. A variety of experiments on several typical acknowledged benchmarks, i.e., ICDAR Robust Reading Competition (Challenge 1, 2 and 4) datasets, verify the surprised improvement from the baseline DNNs, and the effectiveness of AdaDNNs compared with the recent state-of-the-art methods.