Current facial expression recognition systems require expensive re-training when deployed to scenarios different from those they were trained for. Biasing them towards learning specific facial characteristics, instead of applying typical transfer learning methods, might help these systems maintain high performance across different tasks with reduced training effort. In this paper, we propose Contrastive Inhibitory AdaptatiOn (CIAO), a mechanism that adapts the last layer of facial encoders to depict specific affective characteristics on different datasets. CIAO improves facial expression recognition performance on six different datasets with highly distinct affective representations, in particular when compared with state-of-the-art models. In our discussion, we analyze in depth how the learned high-level facial features are represented and how they contribute to each individual dataset's characteristics. We conclude by discussing how CIAO positions itself within recent findings on non-universal facial expression perception, and its impact on facial expression recognition research.