Main contributions:
(1) We present a new large-scale dataset that enables aerial-ground cross-modal localization by combining ground-level imagery from mobile mapping systems with ALS point clouds. The data span three representative urban areas—Wuhan, Hong Kong, and San Francisco—and will be made publicly accessible to the research community.
(2) We propose an indirect yet scalable approach for generating accurate 6-DoF ground-truth image poses. This is achieved by registering mobile LiDAR submaps to ALS data using ground segmentation and façade reconstruction, followed by multi-sensor pose graph optimization.
(3) We establish a unified benchmarking suite for both global and fine-grained image-to-point-cloud (I2P) localization, and evaluate state-of-the-art methods under challenging cross-view and cross-modality conditions. Based on the evaluation results, we summarize open challenges and promising directions for future research.