In classification tasks, raw high-dimensional features often contain redundant information, which increases computational complexity and the risk of overfitting. In this paper, we assume that the data samples lie on a single underlying smooth manifold, and we define intra-class and inter-class similarities using pairwise local kernel distances. We seek a linear projection that simultaneously maximizes the intra-class similarities and minimizes the inter-class similarities, so that the pairwise distances of the projected low-dimensional data are optimized with respect to the label information, making the data better suited for further dimensionality reduction by a Diffusion Map. Numerical experiments on several benchmark datasets show that the proposed approaches extract low-dimensional discriminative features that lead to higher classification accuracy.
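The abstract does not fix a concrete objective, but the projection step it describes can be read as a kernel-weighted scatter problem solved via a generalized eigendecomposition, in the spirit of LDA. The sketch below is a minimal illustration under that assumption; the Gaussian kernel, the function names, and the regularization term are illustrative choices, not the paper's actual formulation:

```python
import numpy as np
from scipy.linalg import eigh

def kernel_similarity(X, sigma=1.0):
    # Pairwise Gaussian kernel similarities (one plausible "local kernel distance").
    d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def discriminative_projection(X, y, dim, sigma=1.0):
    # Hypothetical sketch: accumulate kernel-weighted pairwise scatter separately
    # for same-class and different-class pairs, then solve a generalized
    # eigenproblem to favor inter-class spread over intra-class spread.
    n, D = X.shape
    W = kernel_similarity(X, sigma)
    same = (y[:, None] == y[None, :])
    S_intra = np.zeros((D, D))  # kernel-weighted scatter of same-class pairs
    S_inter = np.zeros((D, D))  # kernel-weighted scatter of different-class pairs
    for i in range(n):
        for j in range(n):
            d = np.outer(X[i] - X[j], X[i] - X[j])
            if same[i, j]:
                S_intra += W[i, j] * d
            else:
                S_inter += W[i, j] * d
    # Maximize inter-class scatter relative to intra-class scatter;
    # small ridge term keeps the right-hand matrix positive definite.
    vals, vecs = eigh(S_inter, S_intra + 1e-6 * np.eye(D))
    return vecs[:, -dim:]  # columns = top generalized eigenvectors

# Usage: project toy data to 2 dimensions before a Diffusion Map step.
rng = np.random.RandomState(0)
X = rng.randn(40, 5)
y = np.array([0] * 20 + [1] * 20)
P = discriminative_projection(X, y, dim=2)
Z = X @ P  # low-dimensional features, shape (40, 2)
```

Treating the trade-off as a generalized eigenproblem is a common design choice because it yields a closed-form projection; the actual paper may instead optimize the two similarity terms jointly in some other way.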