Application of Visual Place Recognition for Vehicle Positioning in a Rural Environment

Description

This thesis evaluates visual place recognition (VPR) for autonomous navigation in a rural-road scenario representative of GNSS-denied conditions. A fixed image database was constructed from video recorded by a car-mounted camera along the Nordmela–Stave route in Norway and organized according to the Pitts30k test structure within the Deep Visual Geo-Localization Benchmark framework. Ground truth was defined in WGS84 / UTM zone 33N (EPSG:32633). Two query sources were examined: in-domain car-camera frames from the same route and cross-domain Google Street View (GSV) images sampled along the same corridor. A pretrained ResNet-18 network with Generalized Mean (GeM) pooling (ResNet18-GeM) from the benchmark model zoo was used as the feature extractor, and standard retrieval metrics (Recall@k) were reported at multiple distance radii using the benchmark’s evaluation tooling. In the in-domain setting, Recall@1, Recall@5, Recall@10, and Recall@20 all reached 100% at a 100 m radius. This outcome is consistent with near-duplicate views captured within short temporal offsets along the drive and confirms that data formatting and evaluation were correctly implemented. In the cross-domain setting, recall remained low at strict radii and increased only modestly as the success radius was relaxed. At 100 m, the results were R@1/R@5/R@10/R@20 = 1/1/3/4%, and at 800 m they were 7/17/20/23%. Qualitative inspection showed that retrieved images were often visually plausible yet geographically incorrect. This reveals a substantial domain gap between GSV images and car-mounted imagery in terms of viewpoint, focal length, camera geometry, and appearance. The contribution of this work is a benchmark-compatible evaluation of VPR in a rural-road, GNSS-denied use case, combining a real-world car-video database with GSV queries and reporting results under a transparent and reproducible protocol.
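To make the reported metric concrete, the following is a minimal sketch of Recall@k at a distance radius, assuming query and database positions are already projected to metric UTM coordinates (as with EPSG:32633) and that a ranked list of database indices is available for each query. It is not the benchmark's own evaluation tooling; the function name `recall_at_k`, its parameters, and the toy coordinates are hypothetical and chosen only for illustration.

```python
import numpy as np


def recall_at_k(query_xy, db_xy, ranked_db_idx, ks=(1, 5, 10, 20), radius_m=100.0):
    """Percentage of queries for which at least one of the top-k retrieved
    database images lies within radius_m metres of the query position.

    query_xy:      (Q, 2) easting/northing of each query, in metres
    db_xy:         (D, 2) easting/northing of each database image, in metres
    ranked_db_idx: (Q, K) database indices per query, best match first
    """
    query_xy = np.asarray(query_xy, dtype=float)
    db_xy = np.asarray(db_xy, dtype=float)
    ranked_db_idx = np.asarray(ranked_db_idx)

    # Positions of the retrieved images, shape (Q, K, 2).
    retrieved_xy = db_xy[ranked_db_idx]
    # Euclidean distance from each query to each of its retrieved images.
    dists = np.linalg.norm(retrieved_xy - query_xy[:, None, :], axis=2)
    hits = dists <= radius_m

    # A query counts as a success at k if any of its first k retrievals is a hit.
    return {k: float(100.0 * hits[:, :k].any(axis=1).mean()) for k in ks}


# Toy data (hypothetical): two queries, four database images on a straight road.
toy_queries = [[0.0, 0.0], [1000.0, 0.0]]
toy_database = [[10.0, 0.0], [500.0, 0.0], [1050.0, 0.0], [5000.0, 0.0]]
toy_ranking = [[1, 0, 3], [3, 1, 2]]
toy_recalls = recall_at_k(toy_queries, toy_database, toy_ranking, ks=(1, 2, 3), radius_m=100.0)
```

In the toy data, the first query finds a match within 100 m at rank 2 and the second at rank 3, so recall rises from 0% at k=1 to 100% at k=3. Increasing `radius_m` relaxes the success criterion in the same way as the 100 m and 800 m radii reported above.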
The study indicates that, without adaptation, a Pitts30k-pretrained ResNet18-GeM generalizes poorly from car-camera imagery to GSV queries in this setting. A practical direction to improve cross-domain performance is focal-length and field-of-view correction of GSV queries so that their effective intrinsics match those of the car-camera database images. If resources permit, light fine-tuning using focal-length-corrected GSV crops could further adapt the descriptor. The results emphasize the importance of dataset compatibility, explicit coordinate handling, and geometric alignment when assessing VPR for autonomous navigation in GNSS-denied rural environments.
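As an illustration of the field-of-view correction suggested above, the sketch below computes a centered crop that narrows a wider-angle image to a target horizontal field of view under an ideal pinhole model. It assumes the GSV image has the wider field of view, that both cameras share the same aspect ratio, and that lens distortion is negligible. The function name `center_crop_for_fov` and the example angles are hypothetical and are not values from the thesis.

```python
import math


def center_crop_for_fov(width, height, src_hfov_deg, dst_hfov_deg):
    """Return the (left, top, right, bottom) pixel box of a centered crop that
    reduces the horizontal field of view from src_hfov_deg to dst_hfov_deg.

    Pinhole model: image width = 2 * f * tan(hfov / 2), so at a fixed focal
    length in pixels the crop width scales with tan(hfov / 2).
    """
    if not 0.0 < dst_hfov_deg <= src_hfov_deg < 180.0:
        raise ValueError("expected 0 < dst_hfov_deg <= src_hfov_deg < 180")

    scale = math.tan(math.radians(dst_hfov_deg) / 2.0) / math.tan(
        math.radians(src_hfov_deg) / 2.0
    )
    # Scale both sides equally so the aspect ratio is preserved.
    crop_w = round(width * scale)
    crop_h = round(height * scale)
    left = (width - crop_w) // 2
    top = (height - crop_h) // 2
    return left, top, left + crop_w, top + crop_h


# Hypothetical example: a 640x640 image with a 90 degree field of view,
# narrowed to the field of view whose half-angle tangent is 0.5.
narrow_hfov_deg = math.degrees(2.0 * math.atan(0.5))
crop_box = center_crop_for_fov(640, 640, 90.0, narrow_hfov_deg)
```

The cropped region would then be resized to the database image resolution, so that a given scene feature spans a similar number of pixels in the query and database images.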
