Facial expressions play a fundamental role in human communication. Indeed, they typically reveal people's true emotional state beyond what spoken language conveys. Moreover, understanding human affect from visual patterns is a key ingredient of any human-machine interaction system and, for these reasons, the task of Facial Expression Recognition (FER) draws both scientific and industrial interest. In recent years, Deep Learning techniques have reached very high performance on FER by exploiting different architectures and learning paradigms. In this context, we propose a multi-resolution approach to the FER task. We ground our intuition on the observation that face images are often acquired at different resolutions. Thus, directly accounting for this property while training a model can help achieve higher performance in recognizing facial expressions. To this end, we use a ResNet-like architecture, equipped with Squeeze-and-Excitation blocks, trained on the Affect-in-the-Wild 2 dataset. Since no test set is available, we conduct testing and model selection on the validation set only, on which we achieve more than 90\% accuracy in classifying the seven expressions that the dataset comprises.
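The abstract does not specify how multiple resolutions are presented to the model during training. One plausible reading, given here only as a minimal sketch under that assumption, is a data augmentation that downsamples each face image by a randomly chosen factor and then upsamples it back to the original size, so that the network sees the same face at several effective resolutions. The function names, the scale set `(1, 2, 4)`, and the nearest-neighbour interpolation are all hypothetical choices made for illustration, not details taken from the paper.

```python
import random


def resize_nearest(image, new_h, new_w):
    """Resize a 2-D grayscale image (list of rows) with nearest-neighbour sampling."""
    old_h, old_w = len(image), len(image[0])
    return [
        [image[r * old_h // new_h][c * old_w // new_w] for c in range(new_w)]
        for r in range(new_h)
    ]


def simulate_resolution(image, scale):
    """Downsample by `scale`, then upsample back to the original size.

    The output has the same shape as the input but carries only the detail
    of an image acquired at 1/scale of the original resolution.
    """
    h, w = len(image), len(image[0])
    low_h, low_w = max(1, h // scale), max(1, w // scale)
    low = resize_nearest(image, low_h, low_w)
    return resize_nearest(low, h, w)


def multi_resolution_augment(image, scales=(1, 2, 4), rng=random):
    """Pick a scale at random (assumed scale set) and degrade the image accordingly."""
    scale = rng.choice(scales)
    return simulate_resolution(image, scale), scale


if __name__ == "__main__":
    # Toy 4x4 "image" with distinct pixel values.
    image = [[r * 4 + c for c in range(4)] for r in range(4)]
    for row in simulate_resolution(image, 2):
        print(row)
```

In a real pipeline the same idea would typically be applied per sample inside the data loader, with a smoother interpolation than nearest-neighbour; the pure-Python version above only makes the mechanism explicit.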