Magnetic resonance imaging is a fundamental tool to reach a diagnosis of multiple sclerosis and monitoring its progression. Although several attempts have been made to segment multiple sclerosis lesions using artificial intelligence, fully automated analysis is not yet available. State-of-the-art methods rely on slight variations in segmentation architectures (e.g. U-Net, etc.). However, recent research has demonstrated how exploiting temporal-aware features and attention mechanisms can provide a significant boost to traditional architectures. This paper proposes a framework that exploits an augmented U-Net architecture with a convolutional long short-term memory layer and attention mechanism which is able to segment and quantify multiple sclerosis lesions detected in magnetic resonance images. Quantitative and qualitative evaluation on challenging examples demonstrated how the method outperforms previous state-of-the-art approaches, reporting an overall Dice score of 89% and also demonstrating robustness and generalization ability on never seen new test samples of a new dedicated under construction dataset.