Abstract:We introduce $\textit{InteractiveVideo}$, a user-centric framework for video generation. Different from traditional generative approaches that operate based on user-provided images or text, our framework is designed for dynamic interaction, allowing users to instruct the generative model through various intuitive mechanisms during the whole generation process, e.g. text and image prompts, painting, drag-and-drop, etc. We propose a Synergistic Multimodal Instruction mechanism, designed to seamlessly integrate users' multimodal instructions into generative models, thus facilitating a cooperative and responsive interaction between user inputs and the generative process. This approach enables iterative and fine-grained refinement of the generation result through precise and effective user instructions. With $\textit{InteractiveVideo}$, users are given the flexibility to meticulously tailor key aspects of a video. They can paint the reference image, edit semantics, and adjust video motions until their requirements are fully met. Code, models, and demo are available at https://github.com/invictus717/InteractiveVideo
Abstract:The past decade has witnessed the rapid development of geospatial artificial intelligence (GeoAI) primarily due to the ground-breaking achievements in deep learning and machine learning. A growing number of scholars from cartography have demonstrated successfully that GeoAI can accelerate previously complex cartographic design tasks and even enable cartographic creativity in new ways. Despite the promise of GeoAI, researchers and practitioners have growing concerns about the ethical issues of GeoAI for cartography. In this paper, we conducted a systematic content analysis and narrative synthesis of research studies integrating GeoAI and cartography to summarize current research and development trends regarding the usage of GeoAI for cartographic design. Based on this review and synthesis, we first identify dimensions of GeoAI methods for cartography such as data sources, data formats, map evaluations, and six contemporary GeoAI models, each of which serves a variety of cartographic tasks. These models include decision trees, knowledge graph and semantic web technologies, deep convolutional neural networks, generative adversarial networks, graph neural networks, and reinforcement learning. Further, we summarize seven cartographic design applications where GeoAI have been effectively employed: generalization, symbolization, typography, map reading, map interpretation, map analysis, and map production. We also raise five potential ethical challenges that need to be addressed in the integration of GeoAI for cartography: commodification, responsibility, privacy, bias, and (together) transparency, explainability, and provenance. We conclude by identifying four potential research directions for future cartographic research with GeoAI: GeoAI-enabled active cartographic symbolism, human-in-the-loop GeoAI for cartography, GeoAI-based mapping-as-a-service, and generative GeoAI for cartography.
Abstract:Researchers are constantly leveraging new forms of data with the goal of understanding how people perceive the built environment and build the collective place identity of cities. Latest advancements in generative artificial intelligence (AI) models have enabled the production of realistic representations learned from vast amounts of data. In this study, we aim to test the potential of generative AI as the source of textual and visual information in capturing the place identity of cities assessed by filtered descriptions and images. We asked questions on the place identity of a set of 31 global cities to two generative AI models, ChatGPT and DALL-E2. Since generative AI has raised ethical concerns regarding its trustworthiness, we performed cross-validation to examine whether the results show similar patterns to real urban settings. In particular, we compared the outputs with Wikipedia data for text and images searched from Google for image. Our results indicate that generative AI models have the potential to capture the collective image of cities that can make them distinguishable. This study is among the first attempts to explore the capabilities of generative AI in understanding human perceptions of the built environment. It contributes to urban design literature by discussing future research opportunities and potential limitations.
Abstract:Spatial clustering has been widely used for spatial data mining and knowledge discovery. An ideal multivariate spatial clustering should consider both spatial contiguity and aspatial attributes. Existing spatial clustering approaches may face challenges for discovering repeated geographic patterns with spatial contiguity maintained. In this paper, we propose a Spatial Toeplitz Inverse Covariance-Based Clustering (STICC) method that considers both attributes and spatial relationships of geographic objects for multivariate spatial clustering. A subregion is created for each geographic object serving as the basic unit when performing clustering. A Markov random field is then constructed to characterize the attribute dependencies of subregions. Using a spatial consistency strategy, nearby objects are encouraged to belong to the same cluster. To test the performance of the proposed STICC algorithm, we apply it in two use cases. The comparison results with several baseline methods show that the STICC outperforms others significantly in terms of adjusted rand index and macro-F1 score. Join count statistics is also calculated and shows that the spatial contiguity is well preserved by STICC. Such a spatial clustering method may benefit various applications in the fields of geography, remote sensing, transportation, and urban planning, etc.
Abstract:The prevalence of location-based services contributes to the explosive growth of individual-level trajectory data and raises public concerns about privacy issues. In this research, we propose a novel LSTM-TrajGAN approach, which is an end-to-end deep learning model to generate privacy-preserving synthetic trajectory data for data sharing and publication. We design a loss metric function TrajLoss to measure the trajectory similarity losses for model training and optimization. The model is evaluated on the trajectory-user-linking task on a real-world semantic trajectory dataset. Compared with other common geomasking methods, our model can better prevent users from being re-identified, and it also preserves essential spatial, temporal, and thematic characteristics of the real trajectory data. The model better balances the effectiveness of trajectory privacy protection and the utility for spatial and temporal analyses, which offers new insights into the GeoAI-powered privacy protection.
Abstract:The advancement of the Artificial Intelligence (AI) technologies makes it possible to learn stylistic design criteria from existing maps or other visual art and transfer these styles to make new digital maps. In this paper, we propose a novel framework using AI for map style transfer applicable across multiple map scales. Specifically, we identify and transfer the stylistic elements from a target group of visual examples, including Google Maps, OpenStreetMap, and artistic paintings, to unstylized GIS vector data through two generative adversarial network (GAN) models. We then train a binary classifier based on a deep convolutional neural network to evaluate whether the transfer styled map images preserve the original map design characteristics. Our experiment results show that GANs have a great potential for multiscale map style transferring, but many challenges remain requiring future research.
Abstract:The emergence of big data enables us to evaluate the various human emotions at places from a statistic perspective by applying affective computing. In this study, a novel framework for extracting human emotions from large-scale georeferenced photos at different places is proposed. After the construction of places based on spatial clustering of user generated footprints collected in social media websites, online cognitive services are utilized to extract human emotions from facial expressions using the state-of-the-art computer vision techniques. And two happiness metrics are defined for measuring the human emotions at different places. To validate the feasibility of the framework, we take 80 tourist attractions around the world as an example and a happiness ranking list of places is generated based on human emotions calculated over 2 million faces detected out from over 6 million photos. Different kinds of geographical contexts are taken into consideration to find out the relationship between human emotions and environmental factors. Results show that much of the emotional variation at different places can be explained by a few factors such as openness. The research may offer insights on integrating human emotions to enrich the understanding of sense of place in geography and in place-based GIS.