Model reduction of Markov processes is a basic problem in modeling state-transition systems. Motivated by the state aggregation approach rooted in control theory, we study the statistical state compression of a finite-state Markov chain from empirical trajectories. Through the lens of spectral decomposition, we examine the rank and features of Markov processes, as well as properties such as representability, aggregatability, and lumpability. We develop a class of spectral state compression methods for three tasks: (1) estimating the transition matrix of a low-rank Markov model, (2) estimating the leading subspace spanned by Markov features, and (3) recovering latent structures of the state space, such as state aggregations and lumpable partitions. The proposed methods provide an unsupervised learning framework for identifying Markov features and clustering states. We provide upper bounds on the estimation errors and nearly matching minimax lower bounds. Numerical studies are performed on synthetic data and a dataset of New York City taxi trips.
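
As a rough illustration of tasks (1) and (2), the following Python sketch counts transitions along a trajectory, truncates the empirical frequency matrix to a given rank by singular value decomposition, and renormalizes the rows. It is a generic spectral recipe under stated assumptions, not necessarily the estimator analyzed here: the function names `transition_counts` and `low_rank_transition_estimate`, the clipping of negative entries, and the uniform fallback for unvisited states are all choices made for illustration only.

```python
import numpy as np


def transition_counts(trajectory, n_states):
    """Count observed transitions i -> j along a single trajectory."""
    traj = np.asarray(trajectory)
    counts = np.zeros((n_states, n_states))
    # np.add.at accumulates correctly even when index pairs repeat.
    np.add.at(counts, (traj[:-1], traj[1:]), 1.0)
    return counts


def low_rank_transition_estimate(counts, rank):
    """Rank-truncated estimate of a transition matrix (illustrative only).

    Returns the row-stochastic estimate and the leading right singular
    vectors of the empirical frequency matrix.
    """
    n = counts.shape[0]
    freq = counts / counts.sum()  # empirical frequency matrix
    U, S, Vt = np.linalg.svd(freq, full_matrices=False)
    # Keep only the top `rank` singular triplets.
    freq_r = (U[:, :rank] * S[:rank]) @ Vt[:rank]
    # Assumption: clip small negative entries introduced by truncation.
    freq_r = np.clip(freq_r, 0.0, None)
    row_sums = freq_r.sum(axis=1, keepdims=True)
    # Assumption: states with no mass fall back to a uniform row.
    P_hat = np.full((n, n), 1.0 / n)
    np.divide(freq_r, row_sums, out=P_hat, where=row_sums > 0)
    return P_hat, Vt[:rank].T


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical rank-2 chain on 4 states: P = A @ B.
    A = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.5, 0.5]])
    B = np.array([[0.1, 0.2, 0.3, 0.4], [0.4, 0.3, 0.2, 0.1]])
    P = A @ B
    state, traj = 0, [0]
    for _ in range(5000):
        state = rng.choice(4, p=P[state])
        traj.append(state)
    P_hat, V = low_rank_transition_estimate(transition_counts(traj, 4), rank=2)
    print(np.round(P_hat, 3))
```

The rows of the returned singular-vector matrix could then be passed to a clustering routine such as k-means to group states, in the spirit of task (3); that step is omitted here.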