High-quality representation of transactional sequences is vital for modern banking applications, including risk management, churn prediction, and personalized customer offers. Different tasks require distinct representation properties: local tasks benefit from capturing the client's current state, while global tasks rely on general behavioral patterns. Previous research has demonstrated that various self-supervised approaches yield representations that better capture either global or local qualities. This study investigates the integration of two self-supervised learning techniques - instance-wise contrastive learning and a generative approach based on restoring masked events in latent space. The combined approach creates representations that balance local and global transactional data characteristics. Experiments conducted on several public datasets, focusing on sequence classification and next-event type prediction, show that the integrated method achieves superior performance compared to individual approaches and demonstrates synergistic effects. These findings suggest that the proposed approach offers a robust framework for advancing event sequences representation learning in the financial sector.