Energy disaggregation, also referred to as Non-Intrusive Load Monitoring (NILM), is the task of using an aggregate energy signal, for example one coming from a whole-home power monitor, to make inferences about the individual loads of the system. In this paper, we present a novel approach to NILM based on the encoder-decoder deep learning framework with an attention mechanism. The attention mechanism is inspired by the temporal attention mechanism that has recently achieved state-of-the-art results in neural machine translation, text summarization, and speech recognition. The experiments have been conducted on two publicly available datasets, AMPds and UK-DALE, in seen and unseen conditions. The results show that our proposed deep neural network outperforms the state-of-the-art Denoising Auto-Encoder (DAE) initially proposed by Kelly and Knottenbelt (2015), as well as its extended and improved architecture by Bonfigli et al. (2018), in all the addressed experimental conditions. We also show that modeling attention translates into the ability to correctly detect the state changes of each appliance, which is of great interest in the field of energy disaggregation.
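To make the idea of temporal attention concrete, the following is a minimal sketch of a generic additive (Bahdanau-style) attention step over the time steps of an encoded aggregate window. It is not the architecture evaluated in this paper, whose exact formulation is not given here; the function name `temporal_attention`, the parameter names `W_h`, `W_q`, `v`, and all dimensions are assumptions chosen for illustration, and the weights are random rather than learned.

```python
import numpy as np

def temporal_attention(encoder_states, query, W_h, W_q, v):
    """Generic additive attention over time steps (illustrative only).

    encoder_states: (T, d_h) array, one hidden state per time step
    query:          (d_q,) array, e.g. a decoder state
    W_h, W_q, v:    hypothetical parameters of shapes
                    (d_h, d_a), (d_q, d_a), and (d_a,)
    Returns the context vector (d_h,) and the attention weights (T,).
    """
    # One unnormalized relevance score per time step.
    scores = np.tanh(encoder_states @ W_h + query @ W_q) @ v
    # Softmax over time, shifted by the maximum for numerical stability.
    scores = scores - scores.max()
    weights = np.exp(scores) / np.exp(scores).sum()
    # Context vector: attention-weighted average of the encoder states.
    context = weights @ encoder_states
    return context, weights

# Toy usage with random values standing in for learned quantities.
rng = np.random.default_rng(0)
T, d_h, d_q, d_a = 8, 4, 3, 5
H = rng.normal(size=(T, d_h))
q = rng.normal(size=d_q)
context, weights = temporal_attention(
    H,
    q,
    rng.normal(size=(d_h, d_a)),
    rng.normal(size=(d_q, d_a)),
    rng.normal(size=d_a),
)
```

In a trained model, time steps with large weights would be the ones the network relies on most, which is one plausible way to relate attention to the state changes of an appliance.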