Deep Learning methods have proven flexible enough to model complex phenomena. This is also the case in Intelligent Transportation Systems (ITS), where several areas such as vehicular perception and traffic analysis have widely embraced Deep Learning as a core modeling technology. In short-term traffic forecasting in particular, the capability of Deep Learning to deliver good results has generated a prevalent inertia towards using Deep Learning models without examining their benefits and downsides in depth. This paper critically analyzes the state of the art in the use of Deep Learning for this particular ITS research area. To this end, we elaborate on the findings distilled from a review of publications from recent years, organized according to two taxonomic criteria. A subsequent critical analysis formulates questions and aims to trigger a necessary debate about the issues of Deep Learning for traffic forecasting. The study is completed with a benchmark of diverse short-term traffic forecasting methods over traffic datasets of different nature, intended to cover a wide spectrum of possible scenarios. Our experimentation reveals that Deep Learning may not be the best modeling technique for every case, which unveils caveats not considered to date that the community should address in prospective studies. These insights reveal new challenges and research opportunities in road traffic forecasting, which are enumerated and discussed thoroughly with the intention of inspiring and guiding future research efforts in this field.