In contrast to deep networks, kernel methods cannot directly take advantage of depth. The deep Restricted Kernel Machine (DRKM) framework addresses this by allowing multiple levels of kernel PCA (KPCA) and Least-Squares Support Vector Machines (LSSVM) to be combined into a deep architecture using visible and hidden units. We propose a new method for DRKM classification that couples the objectives of the KPCA and classification levels, with the hidden feature matrix lying on the Stiefel manifold. The classification level can be formulated as an LSSVM or as an MLP feature map, combining depth in terms of levels and layers. The classification level is expressed in its primal formulation, since the deep KPCA levels can embed the most informative components of the data in a much lower-dimensional space. In experiments on benchmark datasets with few available training points, we show that our deep method improves over the LSSVM/MLP and that models with multiple KPCA levels can outperform models with a single level.
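
As a rough illustration of the ingredients named above, and not of the coupled training objective itself, the following sketch computes hidden features with a single KPCA level and then fits an LSSVM in the primal on those low-dimensional features. It assumes NumPy and an RBF kernel. The function names, the parameters `sigma` and `gamma`, and the decoupled two-stage fitting are assumptions made for illustration only. The hidden feature matrix `H` returned by the KPCA step has orthonormal columns, which is what lying on the Stiefel manifold means.

```python
import numpy as np


def rbf_kernel(X, Y, sigma=1.0):
    """RBF kernel matrix between the rows of X and the rows of Y."""
    sq = (X ** 2).sum(1)[:, None] + (Y ** 2).sum(1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma ** 2))


def kpca_hidden_features(X, s, sigma=1.0):
    """One KPCA level: top-s eigenvectors of the centered kernel matrix.

    Returns H (n x s) with H.T @ H = I, and the matching eigenvalues.
    """
    n = X.shape[0]
    K = rbf_kernel(X, X, sigma)
    C = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    Kc = C @ K @ C
    eigvals, eigvecs = np.linalg.eigh(Kc)    # eigenvalues in ascending order
    H = eigvecs[:, ::-1][:, :s]              # keep the s leading components
    return H, eigvals[::-1][:s]


def lssvm_primal_fit(H, y, gamma=10.0):
    """Primal LSSVM on features H with labels y in {-1, +1}.

    Minimizes 0.5 * w.T @ w + 0.5 * gamma * sum((y - (H @ w + b)) ** 2).
    The bias b is not regularized.
    """
    n, s = H.shape
    A = np.hstack([H, np.ones((n, 1))])
    R = np.eye(s + 1) / gamma
    R[-1, -1] = 0.0                          # no penalty on the bias term
    sol = np.linalg.solve(A.T @ A + R, A.T @ y)
    return sol[:-1], sol[-1]


def lssvm_predict(H, w, b):
    """Sign of the primal decision function."""
    return np.sign(H @ w + b)


# Toy data (hypothetical): two well-separated Gaussian blobs, 20 points each.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2.0, 1.0, (20, 2)), rng.normal(2.0, 1.0, (20, 2))])
y = np.concatenate([-np.ones(20), np.ones(20)])

H, lam = kpca_hidden_features(X, s=2, sigma=2.0)
w, b = lssvm_primal_fit(H, y, gamma=10.0)
print("training accuracy:", np.mean(lssvm_predict(H, w, b) == y))
```

Because the classifier works on `s` hidden features rather than on `n` dual variables, the primal system solved here has size `s + 1`, which is the practical reason a primal formulation becomes attractive once the KPCA levels have compressed the data. The sketch only covers training points; out-of-sample extension and stacking several KPCA levels are left out.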