Abstract: Convolutional neural networks (CNNs) have achieved remarkable success in recent years. However, their performance relies heavily on their architecture hyperparameters, and finding proper hyperparameters for a deep CNN is a challenging optimization problem because it is both high-dimensional and computationally expensive. To address these difficulties, this study proposes a surrogate-assisted highly cooperative hyperparameter optimization (SHCHO) algorithm for chain-styled CNNs. To narrow the large search space, SHCHO first decomposes the whole CNN into several overlapping sub-CNNs in accordance with the overlapping interaction structure of the hyperparameters and then cooperatively optimizes the resulting hyperparameter subsets. Two cooperation mechanisms are designed for this process. The first coordinates all the sub-CNNs to reproduce the information flow in the whole CNN, achieving macro cooperation among them. The second handles each overlapping component by simultaneously considering the two sub-CNNs that share it, facilitating micro cooperation between them. As a result, a proper hyperparameter configuration can be located effectively for the whole CNN. In addition, SHCHO employs a well-performing surrogate technique to assist the hyperparameter optimization of each sub-CNN, thereby greatly reducing the computational cost. Extensive experimental results on two widely used image classification datasets indicate that SHCHO can significantly improve the performance of CNNs.
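The overlapping decomposition of a chain-styled CNN can be illustrated with a minimal sketch. It is not the SHCHO procedure itself: the abstract does not state how sub-CNN boundaries are chosen, so the fixed `block_size` and `overlap` parameters and the function names `overlapping_blocks` and `shared_layers` are assumptions made purely for illustration.

```python
def overlapping_blocks(num_layers, block_size, overlap=1):
    """Split layer indices 0..num_layers-1 into consecutive blocks.

    Adjacent blocks share `overlap` layers. The fixed-size windowing is a
    hypothetical choice, not the rule used by SHCHO.
    """
    if num_layers < 1:
        raise ValueError("num_layers must be at least 1")
    if not 0 <= overlap < block_size:
        raise ValueError("overlap must satisfy 0 <= overlap < block_size")
    step = block_size - overlap
    blocks = []
    start = 0
    while True:
        end = min(start + block_size, num_layers)
        blocks.append(list(range(start, end)))
        if end == num_layers:
            break
        start += step
    return blocks


def shared_layers(blocks):
    """Return, for each pair of adjacent blocks, the layers they share."""
    return [sorted(set(a) & set(b)) for a, b in zip(blocks, blocks[1:])]


blocks = overlapping_blocks(num_layers=7, block_size=3, overlap=1)
print(blocks)                 # [[0, 1, 2], [2, 3, 4], [4, 5, 6]]
print(shared_layers(blocks))  # [[2], [4]]
```

In this toy setting, each shared layer is the kind of overlapping component that the abstract says is handled by considering both neighbouring sub-CNNs at once.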
Abstract: Problem decomposition plays a vital role when applying cooperative coevolution (CC) to large-scale global optimization problems. However, most learning-based decomposition algorithms either apply only to additively separable problems or suffer from false separability detections. To address these limitations, this study proposes a novel decomposition algorithm called surrogate-assisted variable grouping (SVG). SVG first designs a general-separability-oriented detection criterion based on whether the optimum of a variable changes with other variables. This criterion is consistent with the definition of separability and thus endows SVG with broad applicability and high accuracy. To reduce the number of fitness evaluations required, SVG seeks the optimum of a variable with the help of a surrogate model rather than the original expensive high-dimensional model. Moreover, it converts the variable grouping process into a dynamic-binary-tree search, which facilitates reusing historical separability detection information and thus reduces the number of detections. To evaluate the performance of SVG, a suite of benchmark functions with up to 2000 dimensions, including additively and non-additively separable ones, was designed. Experimental results on these functions indicate that, compared with six state-of-the-art decomposition algorithms, SVG possesses broader applicability and competitive efficiency. Furthermore, it can significantly enhance the optimization performance of CC.
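The detection criterion described above, namely whether the optimum of one variable moves when another variable changes, can be sketched as follows. This is a simplified illustration under stated assumptions: a plain grid search stands in for the surrogate model that SVG uses, only a single perturbation `delta` is tried, and the names `argmin_along` and `interacts` are hypothetical.

```python
def argmin_along(f, x, i, grid):
    """Return the grid value of x[i] that minimizes f with all other variables fixed."""
    best_value, best_fitness = None, float("inf")
    for value in grid:
        trial = list(x)
        trial[i] = value
        fitness = f(trial)
        if fitness < best_fitness:
            best_value, best_fitness = value, fitness
    return best_value


def interacts(f, base, i, j, grid, delta=1.0, tol=1e-9):
    """Flag variables i and j as interacting if the optimum of x[i] moves
    when x[j] is shifted by delta.

    Grid search replaces SVG's surrogate here; a single shift can miss
    interactions, so this is only a sketch of the criterion.
    """
    optimum_before = argmin_along(f, base, i, grid)
    shifted = list(base)
    shifted[j] += delta
    optimum_after = argmin_along(f, shifted, i, grid)
    return abs(optimum_before - optimum_after) > tol


grid = [k * 0.5 for k in range(-4, 5)]  # -2.0, -1.5, ..., 2.0

# Separable but not additively separable: the optimum of x0 is 0 for any x1.
multiplicative = lambda x: (x[0] ** 2 + 1) * (x[1] ** 2 + 1)
# Nonseparable: the optimum of x0 equals x1.
coupled = lambda x: (x[0] - x[1]) ** 2 + x[1] ** 2

print(interacts(multiplicative, [0.0, 0.0], 0, 1, grid))  # False
print(interacts(coupled, [0.0, 0.0], 0, 1, grid))         # True
```

The multiplicative example shows why a criterion based on the location of the optimum can cover non-additively separable functions: the function value changes with `x[1]`, but the best `x[0]` does not.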
Abstract: Divide-and-conquer-based (DC-based) evolutionary algorithms (EAs) have achieved notable success in dealing with large-scale optimization problems (LSOPs). However, the appealing performance of this type of algorithm generally requires a high-precision decomposition of the optimization problem, which remains a challenging task for existing decomposition methods. This study addresses the issue from a different perspective and proposes an eigenspace divide-and-conquer (EDC) approach. Unlike existing DC-based algorithms, which perform decomposition and optimization in the original decision space, EDC first establishes an eigenspace by conducting singular value decomposition on a set of high-quality solutions selected from recent generations. It then transforms the optimization problem into the eigenspace, which significantly weakens the dependencies among the corresponding eigenvariables. Accordingly, these eigenvariables can be grouped efficiently by a simple random strategy, and each of the resulting subproblems can be addressed more easily by a traditional EA. To verify the efficiency of EDC, comprehensive experimental studies were conducted on two sets of benchmark functions. Experimental results indicate that EDC is robust to its parameters and scales well with the problem dimension. Comparison with several state-of-the-art algorithms further confirms that EDC is highly competitive and performs better on complicated LSOPs.
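The eigenspace construction described above can be sketched with NumPy. This is a rough illustration, not the EDC implementation: centering the elite solutions on their mean before the SVD is an assumption (the abstract does not say whether this is done), and the helper names `build_eigenspace`, `to_eigen`, `from_eigen`, `lift_objective`, and `random_groups` are hypothetical.

```python
import numpy as np


def build_eigenspace(elite):
    """Build an orthonormal basis from elite solutions (rows of `elite`) via SVD.

    Centering on the mean is an assumption made for this sketch.
    """
    mean = elite.mean(axis=0)
    # full_matrices=True yields a square (d x d) orthogonal factor vt.
    _, _, vt = np.linalg.svd(elite - mean, full_matrices=True)
    return mean, vt.T  # columns of the returned matrix span the eigenspace


def to_eigen(x, mean, basis):
    """Map a solution from the decision space to eigenvariables."""
    return (x - mean) @ basis


def from_eigen(y, mean, basis):
    """Map eigenvariables back to the decision space."""
    return y @ basis.T + mean


def lift_objective(f, mean, basis):
    """Return the objective expressed as a function of eigenvariables."""
    return lambda y: f(from_eigen(y, mean, basis))


def random_groups(dim, group_size, rng):
    """Randomly partition eigenvariable indices into groups of at most group_size."""
    perm = rng.permutation(dim)
    return [perm[k:k + group_size] for k in range(0, dim, group_size)]


rng = np.random.default_rng(0)
elite = rng.normal(size=(20, 6))  # stand-in for high-quality solutions
mean, basis = build_eigenspace(elite)
groups = random_groups(6, 2, rng)  # each group would go to a traditional EA
```

Because the basis is orthogonal, `from_eigen(to_eigen(x, mean, basis), mean, basis)` recovers `x` up to floating-point error, so each subproblem can be optimized over its group of eigenvariables and evaluated through `lift_objective`.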