Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Basel Shbita

An Automatic Approach for Generating Rich, Linked Geo-Metadata from Historical Map Images

Dec 03, 2021

Zekun Li, Yao-Yi Chiang, Sasan Tavakkol, Basel Shbita, Johannes H. Uhl, Stefan Leyk, Craig A. Knoblock

Figure 1 for An Automatic Approach for Generating Rich, Linked Geo-Metadata from Historical Map Images

Figure 2 for An Automatic Approach for Generating Rich, Linked Geo-Metadata from Historical Map Images

Figure 3 for An Automatic Approach for Generating Rich, Linked Geo-Metadata from Historical Map Images

Figure 4 for An Automatic Approach for Generating Rich, Linked Geo-Metadata from Historical Map Images

Abstract:Historical maps contain detailed geographic information difficult to find elsewhere covering long-periods of time (e.g., 125 years for the historical topographic maps in the US). However, these maps typically exist as scanned images without searchable metadata. Existing approaches making historical maps searchable rely on tedious manual work (including crowd-sourcing) to generate the metadata (e.g., geolocations and keywords). Optical character recognition (OCR) software could alleviate the required manual work, but the recognition results are individual words instead of location phrases (e.g., "Black" and "Mountain" vs. "Black Mountain"). This paper presents an end-to-end approach to address the real-world problem of finding and indexing historical map images. This approach automatically processes historical map images to extract their text content and generates a set of metadata that is linked to large external geospatial knowledge bases. The linked metadata in the RDF (Resource Description Framework) format support complex queries for finding and indexing historical maps, such as retrieving all historical maps covering mountain peaks higher than 1,000 meters in California. We have implemented the approach in a system called mapKurator. We have evaluated mapKurator using historical maps from several sources with various map styles, scales, and coverage. Our results show significant improvement over the state-of-the-art methods. The code has been made publicly available as modules of the Kartta Labs project at https://github.com/kartta-labs/Project.

* 10.1145/3394486.3403381

Via

Access Paper or Ask Questions

Viola: A Topic Agnostic Generate-and-Rank Dialogue System

Aug 25, 2021

Hyundong Cho, Basel Shbita, Kartik Shenoy, Shuai Liu, Nikhil Patel, Hitesh Pindikanti, Jennifer Lee, Jonathan May

Figure 1 for Viola: A Topic Agnostic Generate-and-Rank Dialogue System

Figure 2 for Viola: A Topic Agnostic Generate-and-Rank Dialogue System

Figure 3 for Viola: A Topic Agnostic Generate-and-Rank Dialogue System

Figure 4 for Viola: A Topic Agnostic Generate-and-Rank Dialogue System

Abstract:We present Viola, an open-domain dialogue system for spoken conversation that uses a topic-agnostic dialogue manager based on a simple generate-and-rank approach. Leveraging recent advances of generative dialogue systems powered by large language models, Viola fetches a batch of response candidates from various neural dialogue models trained with different datasets and knowledge-grounding inputs. Additional responses originating from template-based generators are also considered, depending on the user's input and detected entities. The hand-crafted generators build on a dynamic knowledge graph injected with rich content that is crawled from the web and automatically processed on a daily basis. Viola's response ranker is a fine-tuned polyencoder that chooses the best response given the dialogue history. While dedicated annotations for the polyencoder alone can indirectly steer it away from choosing problematic responses, we add rule-based safety nets to detect neural degeneration and a dedicated classifier to filter out offensive content. We analyze conversations that Viola took part in for the Alexa Prize Socialbot Grand Challenge 4 and discuss the strengths and weaknesses of our approach. Lastly, we suggest future work with a focus on curating conversation data specifcially for socialbots that will contribute towards a more robust data-driven socialbot.

* Alexa Prize Socialbot Grand Challenge 4 Proceedings, 23 pages

Via

Access Paper or Ask Questions