Abstract: Vision-language models (VLMs) show great promise for 3D scene understanding, but they are mainly applied to indoor spaces or autonomous driving, where they focus on low-level tasks such as segmentation. This work extends their use to urban-scale environments by leveraging 3D reconstructions from multi-view aerial imagery. We propose OpenCity3D, an approach that addresses high-level tasks such as population density estimation, building age classification, property price prediction, crime rate assessment, and noise pollution evaluation. Our findings highlight OpenCity3D's impressive zero-shot and few-shot capabilities and its adaptability to new contexts. This research establishes a new paradigm for language-driven urban analytics, enabling applications in planning, policy, and environmental monitoring. See our project page: opencity3d.github.io
Abstract: The ability to extract general laws from a few known examples depends on the complexity of the problem and on the amount of training data. In the quantum setting, the learner's generalization performance is further challenged by the destructive nature of quantum measurements, which, together with the no-cloning theorem, limits the amount of information that can be extracted from each training sample. In this paper, we focus on hybrid quantum learning techniques, in which classical machine-learning methods are paired with quantum algorithms, and show that, in some settings, the uncertainty arising from a small number of measurement shots can be the dominant source of error. We identify an instance of this possibly general issue in the classification of maximally entangled vs. separable states, showing that this toy problem becomes challenging for learners unaware of entanglement theory. Finally, we introduce an estimator based on classical shadows that performs better in the big-data, few-copy regime. Our results show that the naive application of classical machine-learning methods to the quantum setting is problematic, and that a better theoretical foundation of quantum learning is required.
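To make the few-shot measurement uncertainty concrete, the following minimal sketch simulates estimating the expectation value of an observable from a finite number of shots. It is an illustration under stated assumptions, not the estimator or the classical-shadow protocol of the paper: the observable is assumed to take outcomes +1 or -1, shots are assumed independent, and the function names `shot_standard_error` and `estimate_expectation` are hypothetical. For such an observable with true expectation value e, the standard error of the sample mean over N shots is sqrt((1 - e^2) / N), so it shrinks only as 1/sqrt(N).

```python
import math
import random


def shot_standard_error(expectation: float, shots: int) -> float:
    """Standard error of the mean of `shots` independent outcomes in {+1, -1}.

    A single outcome has variance 1 - expectation**2, so the mean of
    `shots` outcomes has standard deviation sqrt((1 - expectation**2) / shots).
    """
    return math.sqrt((1.0 - expectation ** 2) / shots)


def estimate_expectation(expectation: float, shots: int, rng: random.Random) -> float:
    """Simulate `shots` measurements and return the sample mean.

    Assumption: each shot yields +1 with probability (1 + expectation) / 2
    and -1 otherwise, independently of the other shots.
    """
    p_plus = (1.0 + expectation) / 2.0
    total = sum(1 if rng.random() < p_plus else -1 for _ in range(shots))
    return total / shots


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is reproducible
    true_value = 0.3        # hypothetical expectation value, for illustration only
    for shots in (1, 10, 100, 10_000):
        estimate = estimate_expectation(true_value, shots, rng)
        error = shot_standard_error(true_value, shots)
        print(f"shots={shots:>6}  estimate={estimate:+.3f}  std. error={error:.3f}")
```

With a single shot the estimate can only be +1 or -1, whatever the true value, which is why features computed from very few shots can carry more noise than signal when passed to a classical learner.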