Data-driven approaches, including deep learning, have shown great promise as surrogate models across many domains. These extend to various areas in sustainability. An interesting direction for which data-driven methods have not been applied much yet is in the quick quantitative evaluation of urban layouts for planning and design. In particular, urban designs typically involve complex trade-offs between multiple objectives, including limits on urban build-up and/or consideration of urban heat island effect. Hence, it can be beneficial to urban planners to have a fast surrogate model to predict urban characteristics of a hypothetical layout, e.g. pedestrian-level wind velocity, without having to run computationally expensive and time-consuming high-fidelity numerical simulations. This fast surrogate can then be potentially integrated into other design optimization frameworks, including generative models or other gradient-based methods. Here we present the use of CNNs for urban layout characterization that is typically done via high-fidelity numerical simulation. We further apply this model towards a first demonstration of its utility for data-driven pedestrian-level wind velocity prediction. The data set in this work comprises results from high-fidelity numerical simulations of wind velocities for a diverse set of realistic urban layouts, based on randomized samples from a real-world, highly built-up urban city. We then provide prediction results obtained from the trained CNN, demonstrating test errors of under 0.1 m/s for previously unseen urban layouts. We further illustrate how this can be useful for purposes such as rapid evaluation of pedestrian wind velocity for a potential new layout. It is hoped that this data set will further accelerate research in data-driven urban AI, even as our baseline model facilitates quantitative comparison to future methods.