Abstract:In this paper, we present Deep Extreme Feature Extraction (DEFE), a new ensemble MVA method for searching $\tau^{+}\tau^{-}$ channel of Higgs bosons in high energy physics. DEFE can be viewed as a deep ensemble learning scheme that trains a strongly diverse set of neural feature learners without explicitly encouraging diversity and penalizing correlations. This is achieved by adopting an implicit neural controller (not involved in feedforward compuation) that directly controls and distributes gradient flows from higher level deep prediction network. Such model-independent controller results in that every single local feature learned are used in the feature-to-output mapping stage, avoiding the blind averaging of features. DEFE makes the ensembles 'deep' in the sense that it allows deep post-process of these features that tries to learn to select and abstract the ensemble of neural feature learners. With the application of this model, a selection regions full of signal process can be obtained through the training of a miniature collision events set. In comparison of the Classic Deep Neural Network, DEFE shows a state-of-the-art performance: the error rate has decreased by about 37\%, the accuracy has broken through 90\% for the first time, along with the discovery significance has reached a standard deviation of 6.0 $\sigma$. Experimental data shows that, DEFE is able to train an ensemble of discriminative feature learners that boosts the overperformance of final prediction.