Dynamic Time Warping (DTW) is a widely used algorithm for measuring the similarity between two time series. It is especially valuable in applications such as clustering, anomaly detection, classification, and video segmentation, where the time series have different timescales, are irregularly sampled, or are shifted. However, it is rarely used as a loss function in an end-to-end learning framework because it is non-differentiable and has quadratic time complexity. While differentiable variants of DTW have been introduced by the community, they still have drawbacks: computing the distance remains expensive, and the resulting similarity tends to blur some differences between time series. In this paper, we propose a fast and differentiable approximation of DTW and compare two architectures: the first learns an embedding in which the Euclidean distance mimics the DTW, and the second directly predicts the DTW output using regression. We build the former by training a siamese neural network to regress the DTW value between two time series. Depending on the nature of the activation function, this approximation naturally supports differentiation, and it is efficient to compute. We show, in a time-series retrieval context on EEG datasets, that our methods achieve at least the same level of accuracy as the other main DTW approximations, with higher computational efficiency. We also show that the approximation can be used to learn in an end-to-end setting on long time series by proposing generative models of EEGs.
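To make the two obstacles named above concrete, the following is a minimal sketch of the classical DTW recursion, not the approximation proposed in this paper. It assumes one-dimensional series and an absolute-difference local cost; the function name `dtw_distance` is hypothetical and chosen for illustration only. The nested loops fill an (n+1) x (m+1) table, which gives the quadratic cost, and the hard `min` is the step that prevents gradients from flowing smoothly through the alignment.

```python
import math


def dtw_distance(x, y):
    """Classical DTW between two 1-D sequences (illustrative sketch).

    Assumption: the local cost is the absolute difference |x_i - y_j|.
    """
    n, m = len(x), len(y)
    # acc[i][j] holds the cost of the best alignment of x[:i] with y[:j].
    acc = [[math.inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    # Two nested loops over both series: O(n * m) time and memory.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(x[i - 1] - y[j - 1])
            # Hard minimum over the three admissible predecessors;
            # this selection step is what makes DTW non-differentiable.
            acc[i][j] = cost + min(acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1])
    return acc[n][m]
```

For instance, `dtw_distance([1, 2, 3], [1, 2, 2, 3])` evaluates to zero because the repeated sample is absorbed by the warping, whereas a pointwise Euclidean comparison is not even defined for series of different lengths.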