Estimating sub-national behaviour in the Danish microsimulation model SMILE

30-03-2016

This paper suggests combining Principal Component Analysis (PCA) with classification by Conditional Inference Trees (CTREEs) when estimating transition probabilities that depend on a large number of high-dimensional covariates, thereby overcoming the curse of dimensionality.

Abstract

SMILE is a Danish dynamic microsimulation model that forecasts demography, household formation, housing demand, socioeconomic and educational attainment, income, taxation, and labour market pensions until 2040. In the most recent version of the model, SMILE 3.0, selected behavioural patterns are allowed to vary across the 98 municipalities of Denmark. In particular, this gives the model a detailed description of sub-national moving behaviour, which is essential when seeking to identify geographic areas likely to experience positive or negative population growth in the future.

Modelling behavioural patterns with a large number of potentially high-dimensional covariates allows for a rich description of individual behaviour, but it simultaneously reduces the number of observations with identical characteristics. Hence, the data sparsity that follows from introducing detailed sub-national behaviour, an instance of the curse of dimensionality, significantly challenges the estimation of municipality-dependent transition probabilities. This paper suggests combining Principal Component Analysis (PCA) with classification by Conditional Inference Trees (CTREEs) when estimating transition probabilities that depend on a large number of high-dimensional covariates, thereby overcoming the curse of dimensionality. The method is described and results are given.
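The data sparsity argument can be made concrete with a small piece of arithmetic. The sketch below is only an illustration: apart from the 98 municipalities, every covariate name, every number of levels, and the sample size are assumptions chosen for the example and are not taken from SMILE. It counts the cells of a fully crossed covariate table and the average number of observations per cell, before and after a destination municipality is added as a further dimension.

```python
from math import prod

# Hypothetical covariate levels. Only the 98 municipalities come from the text;
# the remaining names and counts are assumptions made for illustration.
levels = {
    "municipality": 98,
    "age": 100,
    "gender": 2,
    "education": 10,
    "household_type": 5,
}


def cell_count(levels):
    """Number of distinct covariate combinations in a fully crossed table."""
    return prod(levels.values())


def mean_obs_per_cell(n_obs, levels):
    """Average number of observations available per covariate combination."""
    return n_obs / cell_count(levels)


# Illustrative sample size, not a figure from the paper.
n_obs = 5_000_000

print(cell_count(levels))                # 980000 cells
print(mean_obs_per_cell(n_obs, levels))  # roughly 5 observations per cell

# Moving behaviour also needs a destination, so add the 98 possible
# destination municipalities as one more dimension.
with_destination = {**levels, "destination_municipality": 98}
print(cell_count(with_destination))                # 96040000 cells
print(mean_obs_per_cell(n_obs, with_destination))  # well below 1 per cell
```

Under these assumed numbers, raw cell frequencies would leave most origin-destination cells empty, which is the situation that motivates reducing the covariate space before estimating transition probabilities.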