Introduction to Spatial Econometrics and Its Application in Foresight Initiative
Chun Song, Alliance of Bioversity and CIAT (CGIAR)
June 13, 2024
DELoS, University of Florence

Why we are doing this
• Increasing use of spatial data and methods in DELoS research
• Importance of accounting for spatial effects in development economics, applied econometrics, and Foresight
• “Behind the Curtain”

Expectations
• Basic idea about spatial econometrics
• What you find useful, not so useful, or would like to learn more about
• Whether you see links with your study or research

Outline
9:00 – 10:00 am
• Behavioral & economic motivation
• Data & inferential motivation
• Type of spatial data
• Spatial dependence & heterogeneity
• Define neighbors in spatial & social contexts
• Q&A
10:06 – 11:00 am
• Testing spatial autocorrelation
• Local and global tests
• Spatial regression
• Applications of spatial econometrics in CGIAR Foresight
11:00 – 11:30 am
• Discussion

Why we care about space in economics: Economic & behavioral motivation
Many economic models are a-spatial (topologically invariant)
• Supply and demand
• Optimization
• Regression

Why we care about space in economics: Economic & behavioral motivation
Increasing incorporation of geography
• Trade, pricing: transportation cost & arbitrage
• Interaction among agents that are close in space
• Industry clustering/agglomeration: spatial equilibrium, returns to scale; spillover/multiplier
• Spatial externality
• Direct and indirect spatial effects
• Poverty trap: endowment; labor mobility
• Social & spatial interaction
• Policy context

Why we care about space in economics: Data motivation
Increasing quality, spectrum, and frequency of GIS data
• Lower cost + transparency + accuracy
• GPS, GIS, computation power
• Social network data
Example: one component of the Agropastoral Value Chains Project (2015–2023) was the rehabilitation of roads in the project area. GIS helps measure spatial indicators. The map shows 87 km of road rehabilitated by the project.
Why we care about space in economics: Data motivation
• Data shows spatial structure
• Pattern recognition
• Locational effect
• Spatial mismatch
• Endogenous unit

Why we care about space in economics: Data motivation
• Pattern recognition
1854 London cholera map, John Snow
Distribution of African Americans in each census tract

Why we care about space in economics: Econometric motivation
• Inferential issue
• Bias

What is spatial econometrics
• Spatial models account for the role that space plays in determining many of the variables that economists and other social scientists are interested in.
• Explicit treatment of spatial aspects

Timeline of spatial econometrics development
Birth of spatial econometrics
• Paelinck and Klaassen (1979), “Spatial Econometrics”
• Explicitly model spatial effects in economic models
Modern developments, 1980–90s
• Rapid growth of spatial econometrics since Anselin (1988)
Growth and acceptance since mid-1990s
• Proliferation to mainstream journals (Journal of Transport Geography)
• Sophisticated interpretation, new estimators, new tests
• Increasing focus on the interface of statistics and econometrics

Type of spatial data
Areal
• Polygon, lattice, irregular area (admin unit, village)
Per capita income (2020)

Type of spatial data
Point
• The observed spatial locations are the locations of discrete events
Water Dispute in Africa: Understanding the impact of climatic and institutional drivers. With Thanasis Petsakos, Cesar Saavedra, Yohannes Gebretsadik and Elisabetta Gotor.
Type of spatial data
Geostatistical
• Sampled data from a continuous surface
• Interpolation & smoothing
• Doesn’t always make sense to interpolate
Rainfall precipitation

Spatial patterns are characterized by dependence and heterogeneity
• Observational equivalence
• “True” interpretation cannot be inferred from a map
• Two competing hypotheses

Spatial dependence
At first sight, spatial dependence may seem similar to time-wise dependence. However, standard time series econometrics cannot be directly adopted for spatial dependence in cross-sectional data.

Spatial dependence vs time dependence
Time is simple
• Unidirectional
• One dimension
• No reciprocity
Space is more complex
• Multiple directions
• Two dimensions
• Reciprocity

Spatial heterogeneity
• Lack of stability over space of the behavioral or other relationships
• Heterogeneity directly related to location in space
• Functional forms and parameters vary with location
• Examples: rural-urban, north-south
• Easier to deal with than spatial dependence
• Mainly resolved by standard econometric approaches

Defining neighbors
• Defining each unit’s neighbors is an essential step in modeling spatial dependence
• A spatial weights matrix is an N × N matrix W with elements w_ij, where w_ij is nonzero (1 in the binary case) if units i and j are neighbors and 0 otherwise; by convention w_ii = 0
• Specification of W depends on the type of data: lattice or areal data (regular or irregular), or point data

Spatial neighbors: contiguity
Rook
• Neighbors are the areal units that share a common edge
• It corresponds to the horizontal and vertical moves of a rook in a chess game
Queen
• Any unit sharing either a common edge or a common vertex with i is defined as a neighbor of i

Row standardization
• Each weight is divided by the sum of its row: w*_ij = w_ij / Σ_j w_ij
• As a result, each row sum of the row-standardized weights equals 1
• The sum of all weights equals N
• The row standardization of a symmetric contiguity-based matrix typically produces a nonsymmetric matrix

k-Nearest Neighbors
• The k units nearest to unit i are treated as neighbors of i
• The remaining units are treated as non-neighbors
How can we theoretically support the choice of k?
• Intuition (e.g., a typical person is influenced primarily by k best friends)
• Past literature can be used to support such a hypothesis
• The row sum of the W matrix will always be equal to k, before standardization
• However, this does not imply that W is symmetric: j can be among the k nearest neighbors of i while i is not among the k nearest neighbors of j

Distance-Based Neighbors
• Units within a critical Euclidean distance of unit i exhibit spatial autocorrelation with unit i
• Units beyond this critical distance are treated as spatially independent from unit i
• All neighbors have the same influence on unit i

Distance decay
• Spatial autocorrelation is assumed to be stronger among more spatially proximate units and to decline as distance increases (for example, through inverse-distance weights)

How?
• Correct functional form must be known by the researcher based on theoretical reasoning
• Validity of these assumptions cannot be easily tested
• No simple or unproblematic empirical test exists that would allow us to determine the “correct” functional form of W
• Better specification of the underlying theory: does theory suggest that spatial dependence diminishes very rapidly or slowly as distance increases?
• Robustness tests
• Divide the continuum of distance into discrete bands and attach different weights
• Endogenous W

Identifying endogenous W
• Postulate/Elicit
• Monte Carlo
• Recover strength of social interactions between nodes, and the centrality of nodes where social ties are not collected
• Requires sparsity

Non-spatial extensions
Dependence is not necessarily geographical. For example, connectivity can be defined based on
• Volume of trade flows: a country is considered connected to all other countries with which it trades
• Transportation cost
• Group membership, political party, friendship

Non-spatial extensions: network interaction

Modelling spatial pattern
After learning how to define neighbors, we turn to testing for spatial patterns.
Two principal sources of spatial patterns
• Behavioral diffusion: spatially proximate units are influenced directly by the behavior of their neighbors and vice versa
• Attributional similarity: neighboring units share similar behaviors as a result of geographic clustering of the sources of these behaviors

Spatial autocorrelation
• Spatial autocorrelation: non-zero covariance between the values of a random variable at neighboring locations, Cov(y_i, y_j) ≠ 0 for i ≠ j
• The locations i and j have a spatial interpretation
• Coincidence of locational and attribute similarity
• It measures the direction of the linear association and the degree of intensity of the spatial pattern
• Correlation measures the degree of linear association between two different variables

Spatial autocorrelation
Null hypothesis
• The values of a random variable are distributed randomly in relation to space
• The data generating process (DGP) is completely independent of location
Alternative hypothesis
Positive spatial autocorrelation
• Low values of a given variable are statistically associated with other low values in nearby locations
• High values of a given variable are statistically associated with other high values in nearby locations
Negative spatial autocorrelation
• High values are surrounded by low values and vice versa
• Not very common in empirical work
• Typically arises under competition for space or resources: if one tree grows larger, the tree next to it will be smaller

Global Moran

Global Moran example: Columbus poverty rate

Local Moran
Global statistics
• Provide information on the “average” spatial autocorrelation in a sample
• Ideally used when the spatial units being studied are relatively homogeneous
Local statistics
• Evaluate the local structure of spatial autocorrelation
• Find local clusters of low or high values
• Identify the individual contributions to the global spatial autocorrelation

Local Moran
High-High (HH) quadrant
• High values of y are surrounded by high values of y
• Positive local spatial autocorrelation above the mean
Low-Low (LL) quadrant
• Low values of y are surrounded by low values of y
• Positive local spatial autocorrelation below the mean
Majority of points in HH and LL → global positive spatial autocorrelation

Next
• Diagnosis of univariate spatial autocorrelation in the absence of covariates
• Detecting spatial autocorrelation in variables of interest is the first step in a spatial analysis
• However, it does not provide information on the data generating process
• We can identify spatial variables with similar spatial behavior
• Spatial weights matrix definition → necessary but can impact results
• Next step: estimation of econometric models

Linear regression recap

Example of spatial regression
Housing price in Amsterdam 2015

Spatial lag model

Spatial error model

Applications in CARD
• Decomposition of the direct and indirect effects of policy
• Manski model: peer effect versus contextual effect
• Spatial mismatch in agrifood systems
• Spatial difference-in-differences
• Geospatial impact evaluation in data-poor contexts
• In conjunction with climate data science and economic modelling for future implications

CGIAR Foresight Initiative
• What do we know about the future
• What are the pathways to get to the desirable future
• How does our knowledge about the future affect decision making today
• Broad space for combining the spatial dimension with the temporal dimension
• Spatially explicit climatic shocks
• What might happen in the future
• Examples from our work: application of spatial econometrics
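
The workflow covered in the session (define neighbors, row-standardize W, then test for global spatial autocorrelation) can be illustrated with a small example. This is a minimal sketch, not the presenter's code: it assumes a hypothetical 4 × 4 regular grid with made-up values, uses rook contiguity, and computes the textbook global Moran's I, I = (N / S0) · Σ_i Σ_j w_ij z_i z_j / Σ_i z_i², where z_i is the deviation of y_i from its mean and S0 is the sum of all weights. The function names (rook_weights, row_standardize, morans_i) are illustrative only.

```python
# Illustrative sketch on a hypothetical 4 x 4 grid with made-up data.

def rook_weights(n_rows, n_cols):
    """Binary rook-contiguity matrix: cells are neighbors if they share an edge."""
    n = n_rows * n_cols
    w = [[0.0] * n for _ in range(n)]
    for r in range(n_rows):
        for c in range(n_cols):
            i = r * n_cols + c
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < n_rows and 0 <= cc < n_cols:
                    w[i][rr * n_cols + cc] = 1.0
    return w

def row_standardize(w):
    """Divide each weight by its row sum so that every row sums to 1."""
    out = []
    for row in w:
        s = sum(row)
        out.append([x / s if s > 0 else 0.0 for x in row])
    return out

def morans_i(y, w):
    """Global Moran's I for values y and weights matrix w (list of lists)."""
    n = len(y)
    mean = sum(y) / n
    z = [v - mean for v in y]                      # deviations from the mean
    s0 = sum(sum(row) for row in w)                # sum of all weights
    num = sum(w[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    den = sum(v * v for v in z)
    return (n / s0) * num / den

W_binary = rook_weights(4, 4)
W = row_standardize(W_binary)

# Made-up patterns: high values clustered in the left half vs. a checkerboard.
clustered = [1.0 if c < 2 else 0.0 for r in range(4) for c in range(4)]
checker = [float((r + c) % 2) for r in range(4) for c in range(4)]

i_clustered = morans_i(clustered, W)
i_checker = morans_i(checker, W)
print("Moran's I, clustered pattern:", i_clustered)
print("Moran's I, checkerboard pattern:", i_checker)
```

The clustered pattern yields a positive statistic (about 0.71) and the checkerboard yields −1, matching the positive and negative spatial autocorrelation cases described in the slides. The binary matrix is symmetric, while the row-standardized one is not: a corner cell has two neighbors (weights of 1/2) and an edge cell has three (weights of 1/3). Inference (for example, permutation-based p-values) is not shown here.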