Magnify Your Population: Statistical Downscaling to Augment the Spatial Resolution of Socioeconomic Census Data
Fine-resolution estimates of demographic and socioeconomic attributes are crucial for planning and policy development. While several efforts have been made to produce fine-scale gridded population estimates, socioeconomic features are typically not available at scales finer than Census units, which may hide local heterogeneity and disparity. In this paper, we present a new statistical downscaling approach to derive fine-scale estimates of key socioeconomic attributes. The method leverages demographic and geographical extensive covariates available at multiple scales, together with additional Census covariates available only at coarse resolution, which are included in the model hierarchically within a "forward learning" approach. For each selected socioeconomic variable, a Random Forest model is trained on the source Census units and then used to generate fine-scale gridded predictions, which are adjusted to ensure the best possible consistency with the coarser Census data. As a case study, we apply this method to Census data in the United States, downscaling the selected socioeconomic variables available at the block group level to a grid of 300 spatial resolution. The accuracy of the method is assessed at both spatial scales: first by computing a pseudo cross-validation coefficient of determination for the predictions at the block group level and then, for extensive variables only, by computing the same score for the (unadjusted) predicted counts summed by block group. Based on these scores and on inspection of the downscaled maps, we conclude that our method provides accurate, smoother, and more detailed socioeconomic estimates than the available Census data.
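The train, predict, and adjust sequence described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the data are synthetic, the function name `downscale` is hypothetical, the single covariate stands in for the multi-scale covariates, and the "forward learning" hierarchy is omitted. Two further assumptions are made for illustration only: the model is trained on per-cell averages so that source and target features share a scale, and the adjustment is a simple proportional rescaling so that cell values sum to the block group total.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor


def downscale(group_cov, group_target, cell_cov, group_of_cell):
    """Hypothetical helper: downscale an extensive block group variable to grid cells."""
    n_groups = len(group_target)
    cells_in_group = np.bincount(group_of_cell, minlength=n_groups)

    # Assumption: train on per-cell averages so coarse and fine features are comparable.
    X_train = (group_cov / cells_in_group).reshape(-1, 1)
    y_train = group_target / cells_in_group

    model = RandomForestRegressor(n_estimators=100, random_state=0)
    model.fit(X_train, y_train)

    # Unadjusted fine-scale predictions, one value per grid cell.
    cell_raw = model.predict(cell_cov.reshape(-1, 1))

    # Assumption: proportional adjustment so cells sum to their block group total.
    raw_sum = np.bincount(group_of_cell, weights=cell_raw, minlength=n_groups)
    cell_adj = cell_raw * (group_target / raw_sum)[group_of_cell]
    return cell_raw, cell_adj


# Synthetic stand-in data: 50 block groups, each covering 4 grid cells.
rng = np.random.default_rng(0)
n_groups, cells_per_group = 50, 4
group_of_cell = np.repeat(np.arange(n_groups), cells_per_group)

# Extensive covariate available at both scales (for example, gridded population).
cell_cov = rng.uniform(10, 200, size=n_groups * cells_per_group)
group_cov = np.bincount(group_of_cell, weights=cell_cov, minlength=n_groups)

# Extensive target observed only at the block group level, kept strictly positive.
group_target = np.clip(0.3 * group_cov + rng.normal(0, 5, size=n_groups), 1.0, None)

cell_raw, cell_adj = downscale(group_cov, group_target, cell_cov, group_of_cell)
adj_sum = np.bincount(group_of_cell, weights=cell_adj, minlength=n_groups)
```

After the adjustment, `adj_sum` reproduces `group_target` by construction, whereas the block group sums of `cell_raw` generally do not; comparing those unadjusted sums against the Census totals is the kind of check the abstract describes for extensive variables.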