Pix2Poly: A Sequence Prediction Method for End-to-end Polygonal Building Footprint Extraction from Remote Sensing Imagery

Qualitative examples of extruded building polygons from the INRIA (150) dataset’s official test split. Pix2Poly can predict high-quality building footprints that are immediately usable for 3D reconstruction.

The paper 'Pix2Poly: A Sequence Prediction Method for End-to-end Polygonal Building Footprint Extraction from Remote Sensing Imagery' by Yeshwanth Kumar Adimoolam, Charalambos Poullis, and Melinos Averkiou has been accepted for publication at the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2025.

TL;DR: This work introduces Pix2Poly, an attention-based, end-to-end trainable, and differentiable deep learning model that generates explicit, high-quality building footprint polygons in ring graph format from aerial images. The method outperforms existing state-of-the-art approaches in accuracy and efficiency without requiring raster losses or complex pipelines, as demonstrated on multiple challenging datasets.
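
To make the "ring graph format" above concrete, here is a minimal sketch of how a closed polygon can be stored as a set of vertices plus a successor matrix, in which each vertex points to exactly one next vertex. This is an illustration only, not the Pix2Poly implementation: the function names (`polygon_from_ring`, `shoelace_area`), the toy coordinates, and the choice of a 0/1 successor matrix are all assumptions made for this example.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def polygon_from_ring(vertices: List[Point],
                      successor: List[List[int]],
                      start: int = 0) -> List[Point]:
    """Walk the ring from `start`, following the single 1 in each row.

    `successor[i][j] == 1` means vertex j comes right after vertex i.
    (Hypothetical representation chosen for illustration.)
    """
    order = [start]
    current = successor[start].index(1)
    while current != start:
        if len(order) >= len(vertices):
            raise ValueError("successor matrix does not describe a single ring")
        order.append(current)
        current = successor[current].index(1)
    return [vertices[i] for i in order]


def shoelace_area(polygon: List[Point]) -> float:
    """Unsigned area of a simple polygon via the shoelace formula."""
    total = 0.0
    n = len(polygon)
    for i in range(n):
        x0, y0 = polygon[i]
        x1, y1 = polygon[(i + 1) % n]
        total += x0 * y1 - x1 * y0
    return abs(total) / 2.0


# Toy footprint: the four corners of a 4 x 3 rectangle, listed out of order.
vertices = [(0.0, 0.0), (4.0, 3.0), (4.0, 0.0), (0.0, 3.0)]

# Ring connectivity: 0 -> 2 -> 1 -> 3 -> 0 (exactly one 1 per row and column).
successor = [
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [0, 1, 0, 0],
    [1, 0, 0, 0],
]

polygon = polygon_from_ring(vertices, successor)
print(polygon)
print(shoelace_area(polygon))
```

Because every row and column of the successor matrix contains a single 1, the polygon is recovered directly by following links, with no rasterisation or contour-tracing step in between.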

Research paper & Supplementary material: https://arxiv.org/abs/2412.07899

Source code: https://github.com/yeshwanth95/Pix2Poly