What is it about?

This paper introduces VICI, a two-stage AI system that geolocates street photos with a limited field of view (FoV) by matching them to satellite images. The first stage retrieves likely satellite matches using visual features. The second stage uses a vision-language model (VLM) such as Gemini to rerank those candidates and justify the best match. This improves accuracy and interpretability in real-world settings where panoramic views aren't available.
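
The retrieve-then-rerank idea can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`retrieve_top_k`, `rerank`), the use of cosine similarity, and the `vlm_score` callback are all assumptions made for illustration, and the callback merely stands in for a call to a vision-language model that returns a score and a short justification.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_top_k(
    query_vec: Sequence[float],
    gallery: Dict[str, Sequence[float]],
    k: int,
) -> List[str]:
    """Stage 1 (assumed form): shortlist the k satellite tiles whose
    features are most similar to the street photo's features."""
    ranked = sorted(
        gallery,
        key=lambda tile_id: cosine(query_vec, gallery[tile_id]),
        reverse=True,
    )
    return ranked[:k]


def rerank(
    candidates: Sequence[str],
    vlm_score: Callable[[str], Tuple[float, str]],
) -> List[Tuple[str, float, str]]:
    """Stage 2 (assumed form): reorder the shortlist by a model-provided
    score, keeping the justification text next to each candidate."""
    scored = [(tile_id, *vlm_score(tile_id)) for tile_id in candidates]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy data: hypothetical 2-D features for three satellite tiles.
gallery = {"sat_a": [1.0, 0.0], "sat_b": [0.8, 0.6], "sat_c": [0.0, 1.0]}
query = [0.9, 0.1]


def fake_vlm(tile_id: str) -> Tuple[float, str]:
    """Hypothetical stand-in for a VLM call; scores are hard-coded."""
    scores = {"sat_a": 0.4, "sat_b": 0.9}
    return scores[tile_id], f"toy justification for {tile_id}"


shortlist = retrieve_top_k(query, gallery, k=2)
reranked = rerank(shortlist, fake_vlm)
```

In this toy run the first stage shortlists `sat_a` and `sat_b` on feature similarity, and the second stage promotes `sat_b` because the stand-in scorer prefers it. The point of the split is that a cheap similarity search narrows the gallery before a more expensive model examines only a handful of candidates.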

Why is it important?

This work improves real-world visual geolocation by matching everyday narrow-angle street photos to satellite images without relying on GPS. That matters for robotics, navigation, and rescue operations where GPS is unavailable or unreliable. It also raises accuracy and trust, because the AI explains why it chose a given match.

Read the Original

This page is a summary of: VICI: VLM-Instructed Cross-view Image-localisation, October 2025, ACM (Association for Computing Machinery),
DOI: 10.1145/3728482.3757386.
