Multilayer graphs are appealing mathematical tools for modeling multiple types of relationships in data. In this paper, we aim to analyze multilayer graphs by properly combining the information provided by the individual layers, while preserving the structure that allows us to identify the communities or clusters that are crucial in the analysis of graph data. To do so, we learn a clustered representative graph by solving an optimization problem that involves a data fidelity term with respect to the observed layers and a regularization term that promotes a sparse, community-aware graph. We use the contrastive loss as the data fidelity term in order to properly aggregate the observed layers into a representative graph. The regularization is based on a measure of graph sparsification called "effective resistance", coupled with a penalty on the first few eigenvalues of the Laplacian matrix of the representative graph to favor the formation of communities. The proposed optimization problem is nonconvex but fully differentiable, and can thus be solved via the projected gradient method. Experiments show that our method leads to a significant improvement over state-of-the-art multilayer graph learning algorithms on clustering problems.
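
To make the regularization ingredients concrete, the following is a minimal sketch of the quantities mentioned above, not the paper's actual objective, which is not specified here. It assumes a symmetric weight matrix `W` for a connected graph, takes "first few eigenvalues" to mean the smallest Laplacian eigenvalues, and assumes that the feasible set of the projected gradient method is the set of symmetric, nonnegative, zero-diagonal matrices. All function names are hypothetical.

```python
import numpy as np


def laplacian(W):
    """Combinatorial Laplacian L = D - W of a symmetric weight matrix W."""
    return np.diag(W.sum(axis=1)) - W


def total_effective_resistance(W):
    """Sum of effective resistances over all node pairs of a connected graph.

    Uses the identity R_tot = n * trace(L^+), where L^+ is the
    Moore-Penrose pseudoinverse of the Laplacian.
    """
    n = W.shape[0]
    return n * np.trace(np.linalg.pinv(laplacian(W)))


def smallest_eigenvalue_sum(W, k):
    """Sum of the k smallest Laplacian eigenvalues.

    The multiplicity of the zero eigenvalue equals the number of connected
    components, so a small value suggests about k well-separated communities.
    """
    eigvals = np.linalg.eigvalsh(laplacian(W))  # returned in ascending order
    return eigvals[:k].sum()


def project_weights(W):
    """Euclidean projection onto symmetric, nonnegative, zero-diagonal matrices.

    This feasible set is an assumption made for illustration only.
    """
    W = 0.5 * (W + W.T)        # symmetrize
    W = np.maximum(W, 0.0)     # clip negative weights
    np.fill_diagonal(W, 0.0)   # remove self-loops
    return W


# Path graph on three nodes with unit weights: 0 - 1 - 2
W_path = np.array([[0.0, 1.0, 0.0],
                   [1.0, 0.0, 1.0],
                   [0.0, 1.0, 0.0]])
r_tot = total_effective_resistance(W_path)   # pairwise resistances 1 + 1 + 2
eig_sum = smallest_eigenvalue_sum(W_path, 2)  # Laplacian spectrum is {0, 1, 3}
```

In a projected gradient scheme, each iteration would take a gradient step on the full objective and then apply a projection such as `project_weights` to return to the feasible set. The gradient of the data fidelity term is deliberately left out, since its form is not given in this text.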