Abstract
While spatial data analysis has received increasing attention in demographic studies, it remains a difficult subject to learn for practitioners due to its complexity and various unresolved issues. Here we give a practical guide to spatial demographic analysis, with a focus on the use of spatial regression models. We first summarize spatially explicit and implicit theories of population dynamics. We then describe basic concepts in exploratory spatial data analysis and spatial regression modeling through an illustration of population change in the 1990s at the minor civil division level in the state of Wisconsin. We also review spatial regression models including spatial lag models, spatial error models, and spatial autoregressive moving average models and use these models for analyzing the data example. We finally suggest opportunities and directions for future research on spatial demographic theories and practice.
Notes
However, some human ecologists (e.g., Poston and Frisbie 2005) see these definitions as misunderstanding human ecology.
Exploratory data analysis summarizes and displays data without formal statistical inference. For the purpose of regression, it is common practice to examine the distributions of the response variable and the explanatory variables as well as the correlations among all the variables. Desirable features include approximately normal distributions of the variables, a linear relation between the response variable and each explanatory variable, and reasonably low correlations among the explanatory variables. If the data do not appear to follow normal distributions or the relations among the variables are not linear, we could consider transforming the variables. However, a transformation may not reduce spatial dependence if it exists (Bailey and Gatrell 1995). Alternatively, additional variables such as higher-order terms and interaction terms can be incorporated (Fox 1997). In addition, a high correlation among the explanatory variables may make estimation and statistical inference unreliable, which is known as the problem of multicollinearity (Baller et al. 2001). Principal component or factor analysis may be used to create new explanatory variables from the highly correlated explanatory variables.
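As a rough illustration of the correlation screening described in this note, the sketch below computes pairwise Pearson correlations among explanatory variables and flags highly correlated pairs. The variable names, the toy values, and the 0.8 cutoff are assumptions made for the example; they are not taken from the study.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def highly_correlated_pairs(columns, threshold=0.8):
    """List (name_a, name_b, r) for pairs with |r| at or above the threshold.

    The threshold is an illustrative choice, not a rule from the source.
    """
    flagged = []
    for a, b in combinations(sorted(columns), 2):
        r = pearson(columns[a], columns[b])
        if abs(r) >= threshold:
            flagged.append((a, b, r))
    return flagged


# Hypothetical explanatory variables for four areal units.
columns = {
    "density": [1.0, 2.0, 3.0, 4.0],
    "housing": [2.1, 3.9, 6.2, 7.8],
    "slope": [1.0, -1.0, 1.0, -1.0],
}
flagged = highly_correlated_pairs(columns)
```

Pairs flagged this way are candidates for combining through principal component or factor analysis, as the note suggests.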
Scholars from different fields apparently understand these terms differently. For example, some demographers distinguish spatial autocorrelation from spatial dependence, and argue that the former is simply one indicator of the latter and, possibly, of spatial heterogeneity. Geographers view spatial autocorrelation as being composed of large-scale spatial irregularities and local-scale spatial interaction effects. Here we use the terms spatial autocorrelation and spatial dependence synonymously, explain the conceptual difference between spatial autocorrelation and spatial heterogeneity, and focus on spatial autocorrelation in the data analysis.
The first-order queen contiguity spatial weight matrix defines all observations that share common boundaries or vertices as neighbors. The first-order rook contiguity spatial weight matrix defines the observations that share common boundaries as neighbors. The second-order queen and rook contiguity weight matrices see both the first-order neighbors and their neighbors as neighbors. The k-nearest neighbor weights are constructed to contain the k nearest neighbors for each observation. In the distance weight matrices, all observations that have centroids within the defined distance band from each other are categorized as neighbors. The general weight matrices see all neighbors as equally weighted, and the inverse distance weight matrices assume continuous change of interaction between two observations with distance (e.g., a squared inverse distance spatial weight matrix can be constructed for the gravity model of spatial interaction).
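The neighbor definitions in this note can be sketched on a regular grid, where "sharing a boundary" means sharing a cell edge and "sharing a vertex" means touching at a corner. This is only a toy construction under that assumption; real MCD polygons would need a GIS library to detect shared boundaries, and the function names here are hypothetical.

```python
from itertools import product


def grid_contiguity(n_rows, n_cols, rule="queen"):
    """Return {cell: set of neighbor cells} for a regular grid.

    rook  -> neighbors share an edge
    queen -> neighbors share an edge or a corner (vertex)
    """
    if rule not in ("rook", "queen"):
        raise ValueError("rule must be 'rook' or 'queen'")
    neighbors = {}
    for r, c in product(range(n_rows), range(n_cols)):
        found = set()
        for dr, dc in product((-1, 0, 1), repeat=2):
            if dr == 0 and dc == 0:
                continue  # a cell is not its own neighbor
            if rule == "rook" and dr != 0 and dc != 0:
                continue  # skip diagonal (vertex-only) contacts
            rr, cc = r + dr, c + dc
            if 0 <= rr < n_rows and 0 <= cc < n_cols:
                found.add((rr, cc))
        neighbors[(r, c)] = found
    return neighbors


def second_order(neighbors):
    """First-order neighbors plus the neighbors of those neighbors."""
    out = {}
    for cell, first in neighbors.items():
        reach = set(first)
        for nb in first:
            reach |= neighbors[nb]
        reach.discard(cell)
        out[cell] = reach
    return out


def inverse_distance_weights(points, power=2.0):
    """Dense matrix with w_ij = 1 / d_ij**power and zeros on the diagonal.

    Assumes no two points coincide (a zero distance would divide by zero).
    """
    n = len(points)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                dx = points[i][0] - points[j][0]
                dy = points[i][1] - points[j][1]
                w[i][j] = 1.0 / (dx * dx + dy * dy) ** (power / 2.0)
    return w


rook = grid_contiguity(3, 3, "rook")
queen = grid_contiguity(3, 3, "queen")
rook2 = second_order(rook)
w_inv = inverse_distance_weights([(0.0, 0.0), (2.0, 0.0)])
```

On the 3 by 3 grid, the center cell has four rook neighbors and eight queen neighbors, which shows how the choice of contiguity rule changes the neighbor structure that enters a spatial regression.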
The boundaries, and even the names, of MCDs in Wisconsin are not fixed over time. Boundaries change, new MCDs emerge, old MCDs disappear, names change, and status in the geographic hierarchy shifts, e.g., towns become villages, villages become cities. In order to adjust the data for these changes, we have set up three rules: new MCDs must be merged into the original MCDs from which they emerge; disappearing MCD problems can be solved by dissolving the original MCDs into their current “home” MCDs; and occasionally, several distinct MCDs must be dissolved into one super-MCD in order to establish a consistent data set over time. In the end, 1,837 MCD-like units (cities, villages, and towns) constitute this analytical dataset.
The Moran scatter plot of the errors can also reveal outliers. Outliers are not necessarily “bad,” and further exploration of them might provide interesting findings. In practice, we can represent the outliers with an indicator variable, coded 1 for outliers and 0 for all other observations, and include it as an explanatory variable. If these are “real” outliers, the coefficient should be statistically significant. In spatial data analysis, outliers detected by the Moran scatter plot may indicate possible problems with the specification of the spatial weights matrix or with the spatial scale at which the observations are recorded (Anselin 1996). Outliers should be studied carefully before being discarded.
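A minimal sketch of the two ingredients in this note follows: the global Moran's I statistic for a set of residuals, and a 0/1 indicator for outliers. The flagging rule used here (more than two standard deviations from the mean) is an assumption chosen for illustration and is not the authors' stated criterion; the weights and residuals are toy values.

```python
def morans_i(values, w):
    """Global Moran's I for values under an n-by-n spatial weight matrix w."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    s0 = sum(sum(row) for row in w)  # sum of all weights
    num = sum(w[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    den = sum(v * v for v in z)
    return (n / s0) * num / den


def outlier_dummy(values, threshold=2.0):
    """Return 1 for values more than `threshold` standard deviations
    from the mean and 0 otherwise (illustrative rule, an assumption)."""
    n = len(values)
    mean = sum(values) / n
    sd = (sum((v - mean) ** 2 for v in values) / n) ** 0.5
    return [1 if abs(v - mean) > threshold * sd else 0 for v in values]


# Four units along a line; each unit neighbors the adjacent ones.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
i_trend = morans_i([1.0, 2.0, 3.0, 4.0], chain)          # similar neighbors
i_alternating = morans_i([1.0, -1.0, 1.0, -1.0], chain)  # dissimilar neighbors

# Hypothetical residuals with one extreme value.
residuals = [0.1, -0.2, 0.0, 0.1, 5.0, -0.1, 0.2, -0.1]
dummy = outlier_dummy(residuals)
```

The resulting `dummy` list could be added as an explanatory variable in the regression, so that the significance of its coefficient can be examined as the note describes.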
An extensive review of the relevant literature identifies more than 37 variables that significantly affect population change theoretically or empirically (Chi 2006). The 37 variables used in this research are chosen on the basis of a combination of judgment, established theoretical or empirical relationships, and the availability of data. The variables that have been used to generate the demographic index are population density, age structure, race, college population, educational attainment, stayers, female-headed households, and seasonal housing. Social and economic conditions include crime rate, school performance, employment, income, public transportation, public water, new housing, buses, county seat status, and real estate value. Transportation accessibility is made up of residential preference, accessibility to airports and highways, highway infrastructure, and journey to work. Natural amenities include forest, water, the lengths of lakeshore, riverbank, and coastline, golf courses, and slope. Land development and conversion include water, wetlands, slope, tax-exempt lands, and built-up lands.
References
Alba, R. D., & Logan, J. R. (1993). Minority proximity to whites in suburbs: An individual-level analysis of segregation. American Journal of Sociology, 98(6), 1388–1427.
Anselin, L. (1988). Spatial econometrics: Methods and models. Dordrecht, Netherlands: Kluwer Academic Publishers.
Anselin, L. (1990). Spatial dependence and spatial structural instability in applied regression analysis. Journal of Regional Science, 30, 185–207.
Anselin, L. (1992). SpaceStat tutorial: A workbook for using SpaceStat in the analysis of spatial data. National Center for Geographic Information and Analysis, University of California, Santa Barbara CA.
Anselin, L. (1995). Local indicators of spatial autocorrelation—LISA. Geographical Analysis, 27, 93–115.
Anselin, L. (1996). The Moran scatterplot as an ESDA tool to assess local instability in spatial association. In M. Fischer, H. J. Scholten, & D. Unwin (Eds.), Spatial analytical perspectives on GIS (pp. 111–125). London, England: Taylor & Francis.
Anselin, L. (2001). Spatial econometrics. In B. Baltagi (Ed.), A companion to theoretical econometrics (pp. 310–330). Oxford, England: Blackwell.
Anselin, L. (2002). Under the hood: Issues in the specification and interpretation of spatial regression models. Agricultural Economics, 27, 247–267.
Anselin, L. (2003). Spatial externalities, spatial multipliers, and spatial econometrics. International Regional Science Review, 26, 153–166.
Anselin, L., & Griffith, D. A. (1988). Do spatial effects really matter in regression analysis? Papers in Regional Science, 65, 11–34.
Anselin, L., & Bera, A. (1998). Spatial dependence in linear regression models with an introduction to spatial econometrics. In A. Ullah & D. Giles (Eds.), Handbook of applied economic statistics (pp. 237–289). New York: Marcel Dekker.
Bailey, T. C., & Gatrell, A. C. (1995). Interactive spatial data analysis. Harlow, England: Longman Scientific & Technical.
Baller, R. D., Anselin, L., Messner, S. F., Deane, G., & Hawkins, D. F. (2001). Structural covariates of U.S. county homicide rates: Incorporating spatial effects. Criminology, 39, 561–590.
Banerjee, S., Carlin, B. P., & Gelfand, A. E. (2003). Hierarchical modeling and analysis for spatial data. Boca Raton, FL: Chapman & Hall/CRC.
Beaujeu-Garnier, J. (1966). Geography of population. London, England: Longman.
Berry, B. J. L., & Kasarda, J. D. (1977). Contemporary urban ecology. New York: Macmillan.
Besag, J. (1974). Spatial interaction and the statistical analysis of lattice systems. Journal of the Royal Statistical Society, Series B, 36, 192–236.
Boarnet, M. G. (1997). Highways and economic productivity: Interpreting recent evidence. Journal of Planning Literature, 11(4), 476–486.
Boarnet, M. G. (1998). Spillovers and the locational effects of public infrastructure. Journal of Regional Science, 38(3), 381–400.
Brown, D. L., Fuguitt, G. V., Heaton, T. B., & Waseem, S. (1997). Continuities in size of place preferences in the United States, 1972–1992. Rural Sociology, 62(4), 408–428.
Cardille, J. A., Ventura, S. J., & Turner, M. G. (2001). Environmental and social factors influencing wildfires in the Upper Midwest, USA. Ecological Applications, 11, 111–127.
Case, A., Rosen, H. S., & Hines, J. R. (1993). Budget spillovers and fiscal policy interdependence: Evidence from the states. Journal of Public Economics, 52, 285–307.
Cervero, R. (2002). Induced travel demand: Research design, empirical evidence, and normative policies. Journal of Planning Literature, 17(1), 3–20.
Cervero, R. (2003). Road expansion, urban growth, and induced travel: A path analysis. Journal of the American Planning Association, 69(2), 145–163.
Cervero, R., & Hansen, M. (2002). Induced travel demand and induced road investment: A simultaneous-equation analysis. Journal of Transport Economics and Policy, 36(3), 469–490.
Charles, C. Z. (2003). The dynamics of racial residential segregation. Annual Review of Sociology, 29, 167–207.
Chi, G. (2006). Environmental demography, small-area population forecasting, and spatio-temporal econometric modeling: Demographics, accessibility, developability, desirability, and livability. Dissertation, Department of Urban and Regional Planning, University of Wisconsin-Madison, Madison WI.
Chi, G., & Voss, P. R. (2005). Migration decision-making: A hierarchical regression approach. Journal of Regional Analysis and Policy, 35(2), 11–22.
Chi, G., Voss, P. R., & Deller, S. C. (2006). Rethinking highway effects on population change. Public Works Management and Policy, 11, 18–32.
Christaller, W. (1966). Central places in southern Germany (Die zentralen Orte in Süddeutschland, Baskin CW, 1933, Trans). Englewood Cliffs, NJ: Prentice-Hall.
Clark, W. (1996). Understanding residential segregation in American cities: Interpreting the evidence. Population Research and Policy Review, 5, 95–127.
Cliff, A., & Ord, J. K. (1973). Spatial autocorrelation. London, England: Pion Limited.
Cliff, A., & Ord, J. K. (1981). Spatial processes, models and applications. London, England: Pion Limited.
Cowen, D. J., & Jensen, J. R. (1998). Extraction and modeling of urban attributes using remote sensing technology. In D. Liverman, E. F. Moran, R. R. Rindfuss, & P. C. Stern (Eds.), People and pixels: Linking remote sensing and social science (pp. 164–188). Washington, DC: National Academy Press.
Cressie, N. (1993). Statistics for spatial data. New York: Wiley.
Dalenberg, D. R., & Partridge, M. D. (1997). Public infrastructure and wages: Public capital’s role as a productive input and household amenity. Land Economics, 73, 268–284.
Diggle, P., Heagerty, P., Liang, K. Y., & Zeger, S. (2002). Analysis of longitudinal data. Oxford, England: Oxford University Press.
Doreian, P. (1980). Linear models with spatial distributed data: Spatial disturbances or spatial effects? Sociological Methods and Research, 9, 29–60.
Draper, N. R., & Smith, H. (1998). Applied regression analysis. New York: John Wiley & Sons.
Elhorst, P. J. (2001). Dynamic models in space and time. Geographical Analysis, 33, 119–140.
Fleming, M. M. (2004). Techniques for estimating spatially dependent discrete choice models. In L. Anselin, R. J. G. M. Florax, & S. J. Rey (Eds.), Advances in spatial econometrics (pp. 145–168). Berlin, Germany: Springer.
Florax, R. J. G. M., & Van der Vlist, A. J. (2003). Spatial econometric data analysis: Moving beyond traditional models. International Regional Science Review, 26(3), 223–243.
Fossett, M. (2005). Urban and spatial demography. In D. L. Poston & M. Micklin (Eds.), Handbook of population (pp. 479–524). New York: Springer.
Fotheringham, A. S., & Wong, D. W. S. (1991). The modifiable areal unit problem in multivariate statistical analysis. Environment and Planning A, 23, 1025–1034.
Fotheringham, A. S., Brunsdon, M., & Charlton, M. (1998). Geographically weighted regression: A natural evolution of the expansion method for spatial data analysis. Environment and Planning A, 30, 1905–1927.
Fox, J. (1997). Applied regression analysis, linear models, and related methods. Thousand Oaks, CA: Sage Publications.
Frisbie, W. P., & Kasarda, J. D. (1988). Spatial processes. In N. J. Smelser (Ed.), Handbook of sociology (pp. 629–666). Newbury Park, CA: Sage Publications.
Fuguitt, G. V., & Brown, D. (1990). Residential preferences and population redistribution. Demography, 27, 589–600.
Fuguitt, G. V., & Zuiches, J. J. (1975). Residential preferences and population distribution. Demography, 12(3), 491–504.
Fuguitt, G. V., Brown, D. L., & Beale, C. L. (1989). Rural and small town America: The population of the United States in the 1980s. New York: Russell Sage Foundation.
Galster, G. C. (1988). Residential segregation in American cities: A contrary review. Population Research and Policy Review, 7, 93–112.
Gelman, A., Carlin, J. B., Stern, H. S., & Rubin, D. B. (2003). Bayesian data analysis. Boca Raton, FL: Chapman & Hall/CRC.
Getis, A. (1984). Interaction modeling using second-order analysis. Environment and Planning A, 16(2), 173–183.
Getis, A. (1995). Spatial filtering in a regression framework: Examples using data on urban crime, regional inequality, government expenditures. In L. Anselin & R. J. G. M. Florax (Eds.), New directions in spatial econometrics (pp. 172–185). Berlin, Germany: Springer Verlag.
Goodchild, M. F. (1992). Geographical data modeling. Computer and Geoscience, 18, 401–408.
Gordon, D. (1978). Capitalist development and the history of American cities. In W. Tabb & L. Sawers (Eds.), Marxism and the metropolis (pp. 25–63). Oxford, England: Oxford University Press.
Graaff, T., Florax, R. J. G. M., Nijkamp, P., & Reggiani, A. (2001). A general misspecification test for spatial regression models: Dependence, heterogeneity, and nonlinearity. Journal of Regional Science, 41, 255–276.
Green, M., & Flowerdew, R. (1996). New evidence on the modifiable areal unit problem. In P. Longley & M. Batty (Eds.), Spatial analysis: Modelling in a GIS environment (pp. 41–54). Cambridge, MA: GeoInformation International.
Greene, W. H. (2000). Econometric analysis. Upper Saddle River, NJ: Prentice-Hall, Inc.
Griffith, D. A. (2003). Spatial autocorrelation and spatial filtering: Gaining understanding through theory and scientific visualization. New York: Springer Verlag.
Hall, P. (1988). The city of theory. In R. LeGates & F. Stout (Eds.), The city reader (pp. 391–393). New York: Routledge.
Hawley, A. H. (1950). Human ecology: A theory of community structure. New York: Ronald Press.
Hill, R. C. (1977). Capital accumulation and urbanization in the United States. Comparative Urban Research, 4, 39–60.
Hudson, J. C. (1972). Geographical diffusion theory. Evanston, IL: Northwestern University.
Humphrey, C. R. (1980). The promotion of growth in small urban places and its impact on population change. Social Science Quarterly, 61, 581–594.
James, P. (1954). The geographic study of population. In P. James & C. Jones (Eds.), American geography: Inventory and prospect (pp. 106–122). Syracuse, NY: Syracuse University Press.
Jaret, C. (1983). Recent neo-Marxist urban analysis. Annual Review of Sociology, 9, 499–525.
Jensen, J. R., Cowen, D. J., Halls, J., Narumalani, S., Schmidt, N. J., Davis, B. A., & Burgess, B. (1994). Improved urban infrastructure mapping and forecasting for BellSouth using remote sensing and GIS technology. Photogrammetric Engineering and Remote Sensing, 60, 339–346.
Johnson, K. M., & Beale, C. L. (1994). The recent revival of widespread population growth in nonmetropolitan areas of the United States. Rural Sociology, 59(4), 655–667.
Johnson, K. M., & Purdy, R. L. (1980). Recent nonmetropolitan population change in fifty-year perspective. Demography, 17(1), 57–70.
Johnson, K. M. (1982). Organization adjustment to population change in nonmetropolitan America: A longitudinal analysis of retail trade. Social Forces, 60(4), 1123–1139.
Johnson, K. M. (1989). Recent population redistribution trends in nonmetropolitan America. Rural Sociology, 54(3), 301–326.
Jones, H. R. (1990). Population geography. New York: The Guilford Press.
Krugman, P. (1991). Geography and trade. Cambridge, MA: MIT Press.
Land Information and Computer Graphics Facility. (2000). Mapping growth management factors: A practical guide for land use planning. Madison, WI: University of Wisconsin-Madison.
Land Information and Computer Graphics Facility. (2002). Population and land allocation: Evolution of geospatial tools helps citizens engage in land-planning process. Madison, WI: University of Wisconsin-Madison.
Langford, M., & Unwin, D. J. (1994). Generating and mapping population density surfaces within a geographical information system. The Cartographic Journal, 31, 21–26.
Langford, M., Maguire, D. J., & Unwin, D. J. (1991). The areal interpolation problem: Estimating population using remote sensing in a GIS framework. In I. Masser & M. Blakemore (Eds.), Handling geographical information: Methodology and potential applications (pp. 55–77). London, England: Longman Scientific & Technical.
LeSage, J. P. (1999). A spatial econometric examination of China’s economic growth. Geographic Information Sciences, 5, 143–153.
Lewis, P. H. (1996). Tomorrow by design: A regional design process for sustainability. New York: John Wiley & Sons.
Littell, R., Milliken, G., Stroup, W., Wolfinger, R., & Schabenberger, O. (2006). SAS for mixed models. Cary, NC: SAS Institute Inc.
Loftin, C., & Ward, S. K. (1983). A spatial autocorrelation model of the effects of population density on fertility. American Sociological Review, 48, 121–128.
Logan, J. R., & Molotch, H. L. (1987). Urban fortunes: The political economy of place. Berkeley, CA: University of California Press.
Massey, D. S., & Denton, N. A. (1993). American apartheid: Segregation and the making of the underclass. Cambridge, MA: Harvard University Press.
McKenzie, R. D. (1924). The ecological approach to the study of the human community. American Journal of Sociology, 30, 287–301.
Mennis, J. (2003). Generating surface models of population using dasymetric mapping. The Professional Geographer, 55, 31–42.
Moran, P. (1948). The interpolation of statistical maps. Journal of the Royal Statistical Society B, 10, 243–251.
Mollenkopf, J. (1978). The postwar politics of urban development. In W. Tabb & L. Sawers (Eds.), Marxism and the metropolis (pp. 117–152). Oxford, England: Oxford University Press.
Mollenkopf, J. (1981). Neighborhood political development and the politics of urban growth: Boston and San Francisco 1958–1978. International Journal of Urban and Regional Research, 5, 15–39.
Openshaw, S. (1984). The modifiable areal unit problem. In Concepts and techniques in modern geography (Vol. 38). London, England: Geobooks.
Openshaw, S., & Taylor, P. J. (1981). The modifiable areal unit problem. In N. Wrigley, & R. J. Bennett (Eds.), Quantitative geography: A British view (pp. 60–70). London, England: Routledge.
Ord, J. K., & Getis, A. (1995). Local spatial autocorrelation statistics: Distributional issues and an application. Geographical Analysis, 27, 286–306.
Pacheco, A. I., & Tyrrell, T. J. (2002). Testing spatial patterns and growth spillover effects in clusters of cities. Journal of Geographical Systems, 4, 275–285.
Paelinck, J. H. P. (2000). On aggregation in spatial econometric modeling. Journal of Geographical Systems, 2, 157–165.
Parent, O., & Riou, S. (2005). Bayesian analysis of knowledge spillovers in European regions. Journal of Regional Science, 45(4), 747–775.
Parisi, D., Lichter, D. T., Grice, S. M., & Taquino, M. (2007). Disaggregating trends in racial residential segregation: Metropolitan, micropolitan, and non-core counties compared. Presented at the annual meeting of the Population Association of America, March 29, 2007, New York NY.
Perroux, F. (1955). Note sur la Notion de pole de croissance. Economie Appliquée, 8, 307–320.
Poston, D. L., & Frisbie, W. P. (2005). Ecological demography. In D. L. Poston & M. Micklin (Eds.), Handbook of population (pp. 601–623). New York: Springer.
Robinson, W. S. (1950). Ecological correlations and the behavior of individuals. American Sociological Review, 15, 351–357.
Sampson, R. J., Morenoff, J. D., & Earls, F. (1999). Beyond social capital: Spatial dynamics of collective efficacy for children. American Sociological Review, 64(5), 633–660.
Schabenberger, O., & Gotway, C. A. (2005). Statistical methods for spatial data analysis. Boca Raton, FL: Chapman & Hall/CRC Press.
Tiefelsdorf, M. (2000). Modelling spatial processes—The identification and analysis of spatial relationships in regression residuals by means of Moran’s I. Berlin, Germany: Springer Verlag.
Tobler, W. (1970). A computer movie simulating urban growth in the Detroit region. Economic Geography, 46, 234–240.
Tolnay, S. E., Deane, G., & Beck, E. M. (1996). Vicarious violence: Spatial effects on southern lynchings, 1890–1919. American Journal of Sociology, 102, 788–815.
Trewartha, G. (1953). A case for population geography. Annals of the Association of American Geographers, 43, 71–97.
Voss, P. R., & Chi, G. (2006). Highways and population change. Rural Sociology, 71(1), 33–58.
Voss, P. R., White, K. J. C., & Hammer, R. B. (2006). Explorations in spatial demography. In W. A. Kandel & D. L. Brown (Eds.), Population change and rural society (pp. 407–429). Dordrecht: Springer.
Wrigley, N., Holt, T., Steel, D., & Tranmer, M. (1996). Analysing, modelling, and resolving the ecological fallacy. In P. Longley & M. Batty (Eds.), Spatial analysis: Modelling in a GIS environment (pp. 23–40). Cambridge, MA: GeoInformation International.
Zelinsky, W. (1966). A prologue to population geography. Englewood Cliffs, NJ: Prentice-Hall.
Zhu, J., Huang, H.-C., & Wu, J. (2005). Modeling spatial-temporal binary data using Markov random field models. Journal of Agricultural, Biological, and Environmental Statistics, 10, 212–225.
Zuiches, J. J., & Rieger, J. H. (1978). Size of place preferences and life cycle migration: A cohort comparison. Rural Sociology, 43(4), 618–633.
Acknowledgments
We are indebted to Paul R. Voss for his guidance with this research and for providing us with insightful suggestions on earlier drafts. Appreciation is extended to three anonymous reviewers for their many helpful comments. We also acknowledge support from the Social Science Research Center at Mississippi State University and Department of Statistics and Department of Soil Science at University of Wisconsin-Madison. Funding has been provided for this research by the USDA Cooperative State Research, Education and Extension Service (CSREES) Hatch project WIS04536 and the Wisconsin Alumni Research Foundation.
Cite this article
Chi, G., Zhu, J. Spatial Regression Models for Demographic Analysis. Popul Res Policy Rev 27, 17–42 (2008). https://doi.org/10.1007/s11113-007-9051-8
DOI: https://doi.org/10.1007/s11113-007-9051-8