Deep learning has recently achieved very promising results in a wide range of areas such as computer vision, speech recognition and natural language processing. It aims to learn hierarchical representations of data by using deep architecture models. In a smart city, a lot of data (e.g. videos captured from many distributed sensors) need to be automatically processed and analyzed. In this paper, we review the deep learning algorithms applied to video analytics of smart city in terms of different research topics: object detection, object tracking, face recognition, image classification and scene labeling.