Title: GeoDecoder: Empowering Multimodal Map Understanding

Jan 26, 2024

Feng Qi, Mian Dai, Zixian Zheng, Chao Wang

Abstract: This paper presents GeoDecoder, a dedicated multimodal model for processing geospatial information in maps. Built on the BeitGPT architecture, GeoDecoder incorporates specialized expert modules for image and text processing. On the image side, GeoDecoder uses GaoDe Amap as the underlying base map, which inherently carries essential details about road and building shapes, relative positions, and other attributes. Through rendering, the model integrates external data and features such as symbol markers, driving trajectories, heatmaps, and user-defined markers, eliminating the need for extra feature engineering. The text module accepts various context texts and question prompts and generates GPT-style text outputs. Because the model is GPT-based, multiple tasks can be trained and executed within the same model in an end-to-end manner. To enhance map cognition and enable GeoDecoder to learn the distribution of geographic entities in Beijing, we devised eight fundamental geospatial tasks and pretrained the model on large-scale text-image samples. Rapid fine-tuning on three downstream tasks then yielded significant performance improvements. GeoDecoder demonstrates a comprehensive understanding of map elements and their associated operations, enabling efficient, high-quality handling of diverse geospatial tasks across different business scenarios.
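The rendering idea in the abstract (external data drawn onto the base map instead of being engineered into separate features) can be illustrated with a toy sketch. This is not the authors' implementation: the pixel grid stands in for an Amap tile, and every name here (`blank_base_map`, `render_trajectory`, `render_marker`, `build_sample`) is hypothetical.

```python
from typing import Dict, List, Tuple

Pixel = Tuple[int, int, int]
Image = List[List[Pixel]]   # image[y][x], a stand-in for a rendered map tile
Point = Tuple[int, int]     # (x, y) in pixel coordinates


def blank_base_map(width: int, height: int,
                   color: Pixel = (240, 240, 240)) -> Image:
    """Create a uniform placeholder for the base map (assumption: real tiles
    would already contain roads and buildings)."""
    return [[color for _ in range(width)] for _ in range(height)]


def set_pixel(img: Image, x: int, y: int, color: Pixel) -> None:
    """Write one pixel, silently ignoring out-of-bounds coordinates."""
    if 0 <= y < len(img) and 0 <= x < len(img[0]):
        img[y][x] = color


def draw_line(img: Image, p0: Point, p1: Point, color: Pixel) -> None:
    """Draw a straight segment with Bresenham's line algorithm."""
    x0, y0 = p0
    x1, y1 = p1
    dx, dy = abs(x1 - x0), -abs(y1 - y0)
    sx = 1 if x0 < x1 else -1
    sy = 1 if y0 < y1 else -1
    err = dx + dy
    while True:
        set_pixel(img, x0, y0, color)
        if x0 == x1 and y0 == y1:
            break
        e2 = 2 * err
        if e2 >= dy:
            err += dy
            x0 += sx
        if e2 <= dx:
            err += dx
            y0 += sy


def render_trajectory(img: Image, points: List[Point], color: Pixel) -> None:
    """Overlay a driving trajectory as a polyline through the given points."""
    for start, end in zip(points, points[1:]):
        draw_line(img, start, end, color)


def render_marker(img: Image, center: Point, radius: int, color: Pixel) -> None:
    """Overlay a square marker centered on a point of interest."""
    cx, cy = center
    for y in range(cy - radius, cy + radius + 1):
        for x in range(cx - radius, cx + radius + 1):
            set_pixel(img, x, y, color)


def build_sample(img: Image, context: str, question: str) -> Dict[str, object]:
    """Pair the rendered image with a text prompt, mimicking an image-plus-text
    input; the prompt layout is an assumption, not the paper's format."""
    return {"image": img, "prompt": f"{context}\nQuestion: {question}"}
```

A sample would then be assembled by rendering overlays onto one tile and attaching the question, for example `build_sample(img, "Trajectory shown in red.", "Does the route pass the marked location?")`. The point of the sketch is only that the overlay lives in the pixels, so the model input stays a single image plus text.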
