Data Leakage and Duplication in Benchmark Dataset
In a recent analysis, we uncovered substantial data leakage, duplication, and annotation discrepancies in the popular CrowdAI Mapping Challenge benchmark dataset, which has been widely used to develop algorithms for semantic segmentation and building footprint extraction from satellite imagery. This revelation may call into question the validity of