We propose a generative framework which takes on the video frame interpolation problem. Our framework, which we call Deep Locally Linear Embedding (DeepLLE), is powered by a deep convolutional neural network (CNN) while it can be used instantly like conventional models. DeepLLE fits an auto-encoding CNN to a set of several consecutive frames and embeds a linearity constraint on the latent codes so that new frames can be generated by interpolating new latent codes. Different from the current deep learning paradigm which requires training on large datasets, DeepLLE works in a plug-and-play and unsupervised manner, and is able to generate an arbitrary number of frames. Thorough experiments demonstrate that without bells and whistles, our method is highly competitive among current state-of-the-art models.