Indoor localization has many applications, such as commercial Location-Based Services (LBS), robotic navigation, and assistive navigation for the blind. This paper formulates indoor localization as a multimedia retrieval problem: visual landmarks are modeled with a panoramic image feature, and a user's location is calculated with a GPU-accelerated parallel retrieval algorithm. To address the scene similarity problem, we apply a multi-image retrieval strategy and a 2D aggregation method to estimate the final retrieved location. Experiments on real data from a campus building demonstrate real-time response (14 fps) and robust localization.
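
The idea of multi-image retrieval followed by 2D aggregation can be illustrated with a minimal CPU sketch. It is not the implementation evaluated in this paper: the function names, the Euclidean nearest-neighbor matching, the radius-based voting rule, and the toy data are all assumptions made for illustration, and GPU acceleration is omitted. Each query image retrieves its closest database features, and the 2D locations attached to those matches are then aggregated so that an isolated match to a visually similar but distant scene is outvoted.

```python
import numpy as np


def retrieve_candidates(query_features, db_features, db_locations, k=1):
    """Return the 2D locations of the k nearest database features per query.

    Hypothetical helper: matching is plain Euclidean nearest neighbor.
    """
    # Pairwise distances between queries and database entries,
    # shape (num_queries, num_db).
    diffs = query_features[:, None, :] - db_features[None, :, :]
    dists = np.linalg.norm(diffs, axis=2)
    # Indices of the k closest database entries for each query.
    nearest = np.argsort(dists, axis=1)[:, :k]
    # Flatten to a list of candidate 2D locations, shape (num_queries * k, 2).
    return db_locations[nearest].reshape(-1, 2)


def aggregate_2d(candidates, radius=1.0):
    """Estimate a location from candidate 2D points by neighborhood voting.

    Assumed rule: pick the candidate with the most candidates within
    `radius` (itself included) and return the mean of that neighborhood.
    """
    diffs = candidates[:, None, :] - candidates[None, :, :]
    pairwise = np.linalg.norm(diffs, axis=2)
    support = pairwise <= radius
    best = np.argmax(support.sum(axis=1))
    return candidates[support[best]].mean(axis=0)


# Toy database: four features with known floor-plan coordinates (assumed data).
db_features = np.eye(4)
db_locations = np.array([[0.0, 0.0], [0.5, 0.0], [10.0, 10.0], [0.0, 0.5]])

# Three query images; the third one matches a distant, similar-looking scene.
queries = np.array([
    [0.9, 0.1, 0.0, 0.0],
    [0.1, 0.9, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
])

candidates = retrieve_candidates(queries, db_features, db_locations, k=1)
estimate = aggregate_2d(candidates, radius=1.0)
print(estimate)
```

In this toy case two of the three queries agree on nearby locations, so the aggregated estimate falls between them and the distant third match is ignored. Using several query images is what makes this vote possible; a single image would have no way to reject the false match.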