Monitoring optical properties of coastal and open ocean waters is crucial to assessing the health of marine ecosystems. Deep learning offers a promising approach to address these ecosystem dynamics, especially in scenarios where gap-free ground-truth data is lacking, which poses a challenge for designing effective training frameworks. Using an advanced neural variational data assimilation scheme (called 4DVarNet), we introduce a comprehensive training framework designed to effectively train directly on gappy data sets. Using the Mediterranean Sea as a case study, our experiments not only highlight the high performance of the chosen neural network in reconstructing gap-free images from gappy datasets but also demonstrate its superior performance over state-of-the-art algorithms such as DInEOF and Direct Inversion, whether using CNN or UNet architectures.