We utilized building footprint datasets from various sources (see Table 2), including OSM (OpenStreetMap contributors, 2025), Google Open buildings (Google Research, 2023), Microsoft Building Footprints (Microsoft, 2024), and CLSM (Shi et al., 2024). Since none of the above-mentioned footprints is complete, we also generated our own global building polygons from an updated version of GlobalBuildingMap (Zhu et al., 2024),
As a frequent OSM contributor, I'm familiar with those other datasets, and they are shit. Just zoom in on any river or lake near a city and you will find houses in the water, buildings that clearly belong on a grid dancing around, etc. The demoed areas are from OSM and were mostly drawn manually; you can see them in a lot of other sources without the low-quality AI-generated contours, e.g.:
Edit: I read it through, and the actual research is that they acquired building height data and added it to the already existing datasets:
The aforementioned datasets are mostly 2D. OSM has some building height data.
And their point is not to have a detailed, end-user-friendly dataset, just a global one,
