The development of smart cities and their fast-paced deployment is resulting in the generation of large quantities of data at unprecedented rates. Unfortunately, most of the generated data is wasted without extracting potentially useful information and knowledge because of the lack of established mechanisms and standards that benefit from the availability of such data. Moreover, the high dynamical nature of smart cities calls for new generation of machine learning approaches that are flexible and adaptable to cope with the dynamicity of data to perform analytics and learn from real-time data. In this article, we shed the light on the challenge of under utilizing the big data generated by smart cities from a machine learning perspective. Especially, we present the phenomenon of wasting unlabeled data. We argue that semi-supervision is a must for smart city to address this challenge. We also propose a three-level learning framework for smart cities that matches the hierarchical nature of big data generated by smart cities with a goal of providing different levels of knowledge abstractions. The proposed framework is scalable to meet the needs of smart city services. Fundamentally, the framework benefits from semi-supervised deep reinforcement learning where a small amount of data that has users' feedback serves as labeled data while a larger amount is without such users' feedback serves as unlabeled data. This paper also explores how deep reinforcement learning and its shift toward semi-supervision can handle the cognitive side of smart city services and improve their performance by providing several use cases spanning the different domains of smart cities. We also highlight several challenges as well as promising future research directions for incorporating machine learning and high-level intelligence into smart city services.