Emotion recognition using electroencephalography (EEG) signals has garnered widespread attention in recent years. However, existing studies have struggled to develop a sufficiently generalized model suitable for different datasets without re-training (cross-corpus). This difficulty arises because distribution differences across datasets far exceed the intra-dataset variability. To solve this problem, we propose a novel Soft Contrastive Masked Modeling (SCMM) framework. Inspired by emotional continuity, SCMM integrates soft contrastive learning with a new hybrid masking strategy to effectively mine the "short-term continuity" characteristics inherent in human emotions. During the self-supervised learning process, soft weights are assigned to sample pairs, enabling adaptive learning of similarity relationships across samples. Furthermore, we introduce an aggregator that weightedly aggregates complementary information from multiple close samples based on pairwise similarities among samples to enhance fine-grained feature representation, which is then used for original sample reconstruction. Extensive experiments on the SEED, SEED-IV and DEAP datasets show that SCMM achieves state-of-the-art (SOTA) performance, outperforming the second-best method by an average accuracy of 4.26% under two types of cross-corpus conditions (same-class and different-class) for EEG-based emotion recognition.