Despite recent progress, existing multi-view unsupervised feature selection methods mostly suffer from two limitations. First, they generally utilize either the cluster structure or the similarity structure to guide feature selection, neglecting the possibility of a joint formulation in which the two benefit each other. Second, they often learn the similarity structure through either global structure learning or local structure learning, and thus lack the capability of graph learning with both global and local structural awareness. To address these limitations, this paper presents a joint multi-view unsupervised feature selection and graph learning (JMVFG) approach. Specifically, we formulate multi-view feature selection with orthogonal decomposition, where each target matrix is decomposed into a view-specific basis matrix and a view-consistent cluster indicator. Cross-space locality preservation is incorporated to bridge the cluster structure learning in the projected space and the similarity learning (i.e., graph learning) in the original space. Further, a unified objective function is presented to enable the simultaneous learning of the cluster structure, the global and local similarity structures, and the multi-view consistency and inconsistency, upon which an alternating optimization algorithm is developed with theoretically proven convergence. Extensive experiments demonstrate the superiority of our approach for both multi-view feature selection and graph learning tasks.
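
For orientation, the orthogonal decomposition and the cross-space locality preservation described above can be sketched in symbols. The notation is illustrative and assumed here rather than taken from the text: the symbols $X^{(v)}$, $W^{(v)}$, $B^{(v)}$, $H$, and $S$, the choice of the projected data as the target matrix, and the placement of the orthogonality constraint on the basis matrix are all assumptions, and the full unified objective may differ.

```latex
% Illustrative notation (assumed, not defined in the abstract):
%   X^{(v)} \in R^{d_v \times n} : data matrix of view v (n samples, d_v features)
%   W^{(v)} \in R^{d_v \times c} : projection (feature selection) matrix of view v
%   B^{(v)} \in R^{c \times c}   : view-specific basis matrix
%   H       \in R^{c \times n}   : view-consistent cluster indicator, columns h_i
%   S       \in R^{n \times n}   : similarity graph learned in the original space

% Orthogonal decomposition of each view's target matrix
% (orthogonality on B^{(v)} is an assumption of this sketch):
\[
  W^{(v)\top} X^{(v)} \approx B^{(v)} H,
  \qquad B^{(v)\top} B^{(v)} = I .
\]

% Cross-space locality preservation: samples that are similar in the
% original space (large S_{ij}) should have close cluster indicators
% in the projected space.
\[
  \min_{H,\,S} \; \sum_{i,j=1}^{n} \lVert h_i - h_j \rVert_2^2 \, S_{ij} .
\]
```

Under this reading, the second term is what couples the two learning problems: $S$ is learned from the original feature space, while $H$ lives in the projected space shared across views, so updating either one constrains the other during alternating optimization.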