Abstract:Automatic smoky vehicle detection in videos is a superior solution to the traditional expensive remote sensing one with ultraviolet-infrared light devices for environmental protection agencies. However, it is challenging to distinguish vehicle smoke from shadow and wet regions coming from rear vehicle or clutter roads, and could be worse due to limited annotated data. In this paper, we first introduce a real-world large-scale smoky vehicle dataset with 75,000 annotated smoky vehicle images, facilitating the effective training of advanced deep learning models. To enable fair algorithm comparison, we also build a smoky vehicle video dataset including 163 long videos with segment-level annotations. Moreover, we present a new Coarse-to-fine Deep Smoky vehicle detection (CoDeS) framework for efficient smoky vehicle detection. The CoDeS first leverages a light-weight YOLO detector for fast smoke detection with high recall rate, and then applies a smoke-vehicle matching strategy to eliminate non-vehicle smoke, and finally uses a elaborately-designed 3D model to further refine the results in spatial temporal space. Extensive experiments in four metrics demonstrate that our framework is significantly superior to those hand-crafted feature based methods and recent advanced methods. The code and dataset will be released at https://github.com/pengxj/smokyvehicle.