Title: CityGuessr: City-Level Video Geo-Localization on a Global Scale

Nov 10, 2024

Parth Parag Kulkarni, Gaurav Kumar Nayak, Mubarak Shah

Abstract: Video geolocalization is a crucial problem in current times. Given just a video, ascertaining where it was captured can have a plethora of advantages. The problem of worldwide geolocalization has been tackled before, but only using the image modality. Its video counterpart remains relatively unexplored. Meanwhile, video geolocalization has also garnered some attention in the recent past, but the existing methods are all restricted to specific regions. This motivates us to explore the problem of video geolocalization at a global scale. Hence, we propose a novel problem of worldwide video geolocalization with the objective of hierarchically predicting the correct city, state/province, country, and continent, given a video. However, no large-scale video datasets with extensive worldwide coverage exist for training models to solve this problem. To this end, we introduce a new dataset, CityGuessr68k, comprising 68,269 videos from 166 cities all over the world. We also propose a novel baseline approach to this problem by designing a transformer-based architecture comprising an elegant Self-Cross Attention module for incorporating scenes as well as a TextLabel Alignment strategy for distilling knowledge from text labels in feature space. To further enhance our location prediction, we also utilize soft-scene labels. Finally, we demonstrate the performance of our method on our new dataset as well as on Mapillary (MSLS). Our code and datasets are available at: https://github.com/ParthPK/CityGuessr
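
To make the hierarchical objective concrete, here is a minimal sketch of one generic way to turn city-level scores into state/province, country, and continent predictions: summing city probabilities over each parent region. It is an illustration under stated assumptions, not the architecture from the paper. The `HIERARCHY` table, the logit values, and the function names `softmax`, `roll_up`, and `predict` are all hypothetical.

```python
import math

# Hypothetical toy hierarchy: city -> (state/province, country, continent).
# These entries are illustrative and are not taken from CityGuessr68k.
HIERARCHY = {
    "Mumbai": ("Maharashtra", "India", "Asia"),
    "Pune": ("Maharashtra", "India", "Asia"),
    "Orlando": ("Florida", "United States", "North America"),
    "London": ("England", "United Kingdom", "Europe"),
}
LEVELS = ("state", "country", "continent")


def softmax(logits):
    """Convert a {label: logit} dict into a {label: probability} dict."""
    peak = max(logits.values())  # subtract the max for numerical stability
    exps = {label: math.exp(value - peak) for label, value in logits.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}


def roll_up(city_probs):
    """Sum city probabilities into each coarser level of the hierarchy."""
    out = {"city": dict(city_probs)}
    for index, level in enumerate(LEVELS):
        aggregated = {}
        for city, prob in city_probs.items():
            parent = HIERARCHY[city][index]
            aggregated[parent] = aggregated.get(parent, 0.0) + prob
        out[level] = aggregated
    return out


def predict(city_logits):
    """Return the most probable label at every level of the hierarchy."""
    distributions = roll_up(softmax(city_logits))
    return {level: max(dist, key=dist.get) for level, dist in distributions.items()}


# Made-up city logits standing in for a video model's output.
city_logits = {"Mumbai": 1.2, "Pune": 1.0, "Orlando": 1.5, "London": 0.1}
prediction = predict(city_logits)
print(prediction)
```

With these made-up scores, the top city is Orlando while the top country is India, because the two Indian cities together outweigh Orlando. Taking an independent argmax per level can therefore give inconsistent answers, which is one reason a hierarchical formulation needs an explicit rule for combining levels.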

* Accepted to the ECVA European Conference on Computer Vision (ECCV) 2024
