IFPRI Discussion Paper 01836 

May 2019 

Human Capital and Structural Transformation 

Quasi-Experimental Evidence from Indonesia 

Naureen Karachiwalla 

Giordano Palloni 

Poverty, Health, and Nutrition Division 


INTERNATIONAL FOOD POLICY RESEARCH INSTITUTE 
The International Food Policy Research Institute (IFPRI), established in 1975, provides research-based 
policy solutions to sustainably reduce poverty and end hunger and malnutrition. IFPRI’s strategic research 
aims to foster a climate-resilient and sustainable food supply; promote healthy diets and nutrition for all; 
build inclusive and efficient markets, trade systems, and food industries; transform agricultural and rural 
economies; and strengthen institutions and governance. Gender is integrated in all the Institute’s work. 
Partnerships, communications, capacity strengthening, and data and knowledge management are essential 
components to translate IFPRI’s research from action to impact. The Institute’s regional and country 
programs play a critical role in responding to demand for food policy research and in delivering holistic 
support for country-led development. IFPRI collaborates with partners around the world.  
 

AUTHORS 
Naureen Karachiwalla (n.karachiwalla@cgiar.org) is a Research Fellow in the Poverty, Health, and 
Nutrition Division of the International Food Policy Research Institute (IFPRI), Washington, DC.  
 
Giordano Palloni (g.palloni@cgiar.org) is a Research Fellow in the Poverty, Health, and Nutrition 
Division of the International Food Policy Research Institute (IFPRI), Washington, DC. 
 
 
Notices  
 
1 IFPRI Discussion Papers contain preliminary material and research results and are circulated in order to stimulate discussion and 
critical comment. They have not been subject to a formal external review via IFPRI’s Publications Review Committee. Any opinions 
stated herein are those of the author(s) and are not necessarily representative of or endorsed by IFPRI.  
 
2 The boundaries and names shown and the designations used on the map(s) herein do not imply official endorsement or 
acceptance by the International Food Policy Research Institute (IFPRI) or its partners and contributors. 
 
3 Copyright remains with the authors. The authors are free to proceed, without further IFPRI permission, to publish this paper, or any 
revised version of it, in outlets such as journals, books, and other publications. 



Human Capital and Structural Transformation:
Quasi-Experimental Evidence from Indonesia∗

Naureen Karachiwalla† Giordano Palloni‡

May 8, 2019

Abstract

This paper provides quasi-experimental evidence on the long-term causal effect of increases
in human capital on participation in agriculture. We use variation in male educational attain-
ment generated by Indonesia’s Sekolah Dasar INPRES program, one of the largest ever school
building programs. Consistent with the first evaluation [Duflo, 2001], we find that males exposed
to a higher program intensity have improved measures of human capital as adults. We then
show that treated cohorts are more likely to be employed outside of agriculture–particularly
in industry–and less likely to be agricultural workers. Then, exploiting variation in exposure
across adjacent districts, we demonstrate that higher INPRES intensity in neighboring districts
decreases non-agricultural employment and earnings, consistent with cross-district spillovers me-
diating the total impacts. Together, the results suggest that government investment in human
capital can have profound effects on the rural economy and may help to accelerate shifts away
from agriculture.

JEL Codes: I25, I28, O13, J43

Key words: education, human capital, structural transformation, agriculture, general equi-
librium, Indonesia

∗This paper has substantially benefited from excellent research assistance from Natasha Ledlie, as well as
feedback from Harold Alderman, Daniel Gilligan, Valerie Mueller, participants of the Centre for the Study
of African Economies (CSAE) Conference, Georgetown University, and IFPRI. Esther Duflo kindly provided
the data. We gratefully acknowledge funding from the Dutch Government through SNV and the Voice for
Change Partnership (V4CP) Programme, as well as funding from the Policies, Institutions, and Markets
Research Program of the CGIAR.

†International Food Policy Research Institute (IFPRI), 1201 Eye Street NW, Washington, DC 20005, USA
(N.Karachiwalla@cgiar.org).

‡International Food Policy Research Institute (IFPRI), 1201 Eye Street NW, Washington, DC 20005, USA
(G.Palloni@cgiar.org).



I Introduction

This paper is motivated by two pervasive global trends. First, educational attainment has increased

rapidly across low-income countries in recent decades. Since 1970, the average number of years of

schooling completed by individuals fifteen and over from 122 non-advanced economies has more

than doubled, rising from a mean of 3.4 years in 1970 to 7.5 years in 2010 [Barro and Lee, 2013].

This increase in educational attainment has been accompanied by increases in government invest-

ment in education: Glewwe and Muralidharan [2016] estimate that governments in low-income

countries now spend nearly one trillion dollars each year to finance education. Second, labor force

participation in the agricultural sector has declined in much of the developing world. In low- and

middle-income countries in East and Southeast Asia, the share of the labor force engaged in eco-

nomic activity in the agricultural sector dropped from 49.5% in 2000 to 29.6% in 2015 [ILO, 2017]–a

40% reduction in just 15 years.1 Given the well-established association between a declining role

for agriculture and economic growth,2 the observed decline in agricultural participation is likely to

have important implications for overall economic productivity and growth. Figure 1 plots trends in

educational attainment and employment in agriculture between 1990 and 2010 for all non-advanced

countries in the Barro-Lee education data, and all low- and middle-income countries in the World

Bank Development Indicators data [World Bank, 2017]. The figure is striking both for the steep

gradient of the curves as well as the remarkable correspondence in the magnitude of the observed

changes: a 36% increase in educational attainment and a 30% decrease in the percent of the labor

force in agriculture during the twenty-year period.

Utility maximizing individuals will choose whether to supply labor in agriculture or to another

sector on the basis of the expected costs and benefits of each option. Education could influence

the relative attractiveness of either sectoral allocation through several channels. It could increase

the return to agricultural work by providing skills or access to networks that reduce the costs of

technology adoption or improving managerial practices. Similarly, education could increase the

1Similarly, the value added from agriculture as a percent of gross domestic product (GDP) has dropped precipi-
tously in low and middle income countries during recent decades: from 27.5% of GDP in 1970 to just 9.5% of GDP
in 2010 [World Bank, 2017].

2See among others Johnston and Mellor [1961], Schultz [1964], Fei and Ranis [1966], Kuznets [1966], Hayami and
Ruttan [1985], Anderson [1987], Syrquin [1988], Timmer [1988], Mellor [1995], Caselli and Coleman II [2001], Gollin
et al. [2004], Caselli [2005], Restuccia et al. [2008].



Figure 1: Trends in Education and Employment in Agriculture

[Figure omitted: line plot with Year (1990 to 2015) on the horizontal axis and two series, Years of Schooling (axis range 5 to 7.5) and % of Emp. in Agric. (axis range 30 to 55).]

Notes:
Data from the Barro-Lee education data and ILOStat. Sample limited to low and middle-income countries.

returns to non-agricultural work by enhancing the determinants of non-agricultural productivity

such as literacy or numeracy, enabling individuals to overcome education-related criteria that may

be used to screen candidates or ration non-agricultural jobs (e.g. education certifications),3 or

by expanding networks that ease social or economic assimilation to new areas and thereby reduce

migration costs to more productive areas [Yang and Zhou, 1999, Agüero and Ramachandran, 2010,

Beegle et al., 2011, de Brauw et al., 2014]. Finally, education could affect individual preferences,

changing the ordinal rank of labor supply alternatives without impacting the financial returns

[Perez-Arce, 2017, Kim et al., 2018]. Economic theory thus makes no clear prediction about the

expected sign or magnitude of the association between education and participation in agriculture. In

the context of large-scale changes in educational attainment, general equilibrium impacts stemming

from the adjustment–or lack thereof–of other factors of production and the resulting variation in

3See [Schultz et al., 1965, Jamison and Lau, 1982, Jacoby, 1993, Newman and Gertler, 1994, Foster and Rosen-
zweig, 1995, 1996, Fafchamps and Quisumbing, 1999, Goldin and Katz, 2000, Taylor and Yunez-Naude, 2000, Caselli
and Coleman II, 2001, Yang, 2002, Jolliffe, 2003, Reimers and Klasen, 2013].



factor prices further complicate predictions about sectoral changes in labor supply [Acemoglu, 2010,

Duflo, 2004, Khanna, 2018].

In this paper we provide causal evidence regarding the long-term impact of education on partic-

ipation in agriculture using variation in educational attainment generated by Indonesia’s Sekolah

Dasar INPRES program. We employ a difference-in-differences (DID) empirical strategy to esti-

mate the impacts of the primary school building campaign.4 Treatment effects are identified by

comparing differences in outcomes between older cohorts–that should have finished primary school

before the program began–and younger cohorts–that did not start primary school until after the

program began–across districts that received a high intensity of school construction and those that

received a lower intensity of school construction. Consistent with the evidence presented in Du-

flo [2001] and Duflo [2004], males with higher exposure to the program completed more years of

schooling, were more likely to complete primary school, and are more likely to be employed for

a wage. We show that these individuals are also more likely to be literate and that, collectively,

these increases in human capital induce men to shift their labor supply away from agriculture:

each additional primary school built per 1,000 children increases the likelihood of being employed

outside of agriculture by 1.86 percentage points and the likelihood of being employed in industry

by 1.14 percentage points.5 The increase in non-agricultural employment is partly explained by a

decrease in agricultural employment and partly by an increase in overall labor force participation.

Interestingly, the shifts towards non-agricultural employment occur despite there being no associ-

ation between exposure to the INPRES program and migration. This suggests that beneficiaries

of the program were able to find non-agricultural work in their district of birth or a neighboring

district that did not require them to permanently migrate. Two-stage least squares (2SLS) specifi-

cations that use the DID treatment variable as an excluded instrument for educational attainment

indicate that each additional year of school increases the likelihood of non-agricultural employment

by 15.3 percentage points, a 23.9 percent increase relative to the mean among males in the older

birth cohort; even weak instrument robust inference based on the Anderson-Rubin confidence set

4The DID strategy is the same as the methods employed in Duflo [2001], Akresh et al. [2018], and Ashraf et al.
[Forthcoming] to estimate the impact of INPRES on education and wages.

5We also estimate a positive point estimate for the likelihood of employment in the services sector, but the results
are smaller in magnitude and not statistically significant at conventional levels, with a p-value of 0.24.



[Anderson and Rubin, 1949] and estimates that relax some of the assumptions required for 2SLS

[De Chaisemartin and D’Haultfoeuille, 2018] to identify the local average treatment effect for men

induced to complete additional schooling suggest a positive, albeit smaller impact of around a 1 to

2 percentage point increase in non-agricultural employment per year of additional education.
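
To make the logic of the design concrete, the following sketch computes a two-by-two DID on cell means and the corresponding Wald ratio (the reduced-form DID divided by the first-stage DID). It is a deliberately simplified illustration rather than the specification estimated in this paper: it replaces the continuous school construction intensity with a hypothetical binary high/low indicator, omits fixed effects and controls, and uses invented data. The names `cell_mean`, `did`, `wald`, `make_cell`, `years_school`, and `nonag` are assumptions introduced only for the example.

```python
# Illustrative 2x2 difference-in-differences and Wald (IV) ratio.
# All data are hypothetical and chosen only so the arithmetic is easy to follow.

def cell_mean(rows, outcome, young, high):
    """Mean of `outcome` within one cohort-by-intensity cell."""
    values = [r[outcome] for r in rows if r["young"] == young and r["high"] == high]
    return sum(values) / len(values)

def did(rows, outcome):
    """(young - old) gap in high-intensity districts minus the same gap in low-intensity districts."""
    gap_high = cell_mean(rows, outcome, True, True) - cell_mean(rows, outcome, False, True)
    gap_low = cell_mean(rows, outcome, True, False) - cell_mean(rows, outcome, False, False)
    return gap_high - gap_low

def wald(rows, outcome, schooling):
    """Reduced-form DID in the outcome divided by the first-stage DID in schooling."""
    return did(rows, outcome) / did(rows, schooling)

def make_cell(young, high, years, nonag):
    """Build hypothetical person records for one cell."""
    return [
        {"young": young, "high": high, "years_school": y, "nonag": n}
        for y, n in zip(years, nonag)
    ]

# Hypothetical sample: four men per cell.
men = (
    make_cell(False, False, [6, 6, 7, 7], [1, 0, 1, 0])   # old cohort, low intensity
    + make_cell(True, False, [7, 7, 8, 8], [1, 1, 0, 1])  # young cohort, low intensity
    + make_cell(False, True, [5, 5, 6, 6], [0, 0, 1, 0])  # old cohort, high intensity
    + make_cell(True, True, [7, 7, 7, 7], [1, 1, 0, 1])   # young cohort, high intensity
)

first_stage = did(men, "years_school")        # effect on years of schooling
reduced_form = did(men, "nonag")              # effect on non-agricultural employment
iv_estimate = wald(men, "nonag", "years_school")
print(first_stage, reduced_form, iv_estimate)
```

With these toy numbers the first-stage DID is 0.5 years of schooling and the reduced-form DID is 0.25, so the Wald ratio is 0.5. The magnitudes carry no empirical meaning; the point is only that the 2SLS estimate rescales the reduced-form effect by the first-stage effect.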

The districts included in the analysis sample are spread across geographically diverse locations in

Indonesia, a country comprised of more than 13,000 islands.6 The resulting differences in district-

level isolation imply that, in addition to the variation in each district’s direct exposure to the

INPRES program, there is potentially meaningful heterogeneity in the intensity with which

neighboring districts were affected by the program. These disparities in the treatment intensity

of adjacent districts may be especially important in our context, where the median district in the

sample borders four other districts and is small enough that supplying labor outside of the district

of residence is feasible for many individuals: the median distance from the centroid of one district

to the centroid of a neighboring district is just 33 miles. We therefore add an interaction term

between the average INPRES program intensity in districts adjacent to the district of birth and an

indicator for whether individuals were young enough to potentially benefit from the program. The

results provide evidence that general equilibrium adjustments operating through cross-district labor

market spillovers may have importantly affected the distribution of program benefits. The average

INPRES intensity in neighboring districts is associated with a modest but statistically insignificant

decline in the likelihood of being employed outside of agriculture and of being wage employed, as

well as with statistically significant reductions in the likelihood of being employed in industry, the

natural log of the hourly wage and annual earnings. Accounting for these cross-district spillovers

also produces corresponding increases in the magnitude of the estimates of the relationship between

INPRES intensity in the district of birth and labor market outcomes. These results are consistent

with a setting in which job seekers are willing to search across district boundaries for employment

and neighboring district exposure to INPRES therefore increases the pool of educated labor supply

competing for employment, generating downward pressure on the wages of educated workers.
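
The construction of the two exposure terms described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used for the estimates: it takes the neighboring-district measure to be an unweighted mean over adjacent districts, assigns `None` to a district with no neighbors, and uses made-up district labels and intensities. The function names `neighbor_mean_intensity` and `spillover_regressors` are hypothetical.

```python
# Illustrative construction of own-district and adjacent-district exposure terms.
# District labels, intensities, and the adjacency list are hypothetical.

def neighbor_mean_intensity(intensity, adjacency):
    """Unweighted mean of program intensity across each district's adjacent districts."""
    result = {}
    for district, neighbors in adjacency.items():
        values = [intensity[n] for n in neighbors if n in intensity]
        # Assumption: a district without neighbors (e.g., an isolated island) gets None.
        result[district] = sum(values) / len(values) if values else None
    return result

def spillover_regressors(birth_district, young, intensity, neighbor_mean):
    """Return (own intensity x young, adjacent-district mean intensity x young)."""
    own = intensity[birth_district] * int(young)
    nbr = neighbor_mean[birth_district]
    adjacent = None if nbr is None else nbr * int(young)
    return own, adjacent

intensity = {"A": 1.5, "B": 2.5, "C": 3.5, "D": 2.0}  # schools per 1,000 children
adjacency = {"A": ["B", "C"], "B": ["A", "C", "D"], "C": ["A", "B"], "D": ["B"]}

neighbor_mean = neighbor_mean_intensity(intensity, adjacency)
young_man = spillover_regressors("A", True, intensity, neighbor_mean)
old_man = spillover_regressors("A", False, intensity, neighbor_mean)
print(neighbor_mean["A"], young_man, old_man)
```

For a man born in hypothetical district A, the adjacent-district mean is 3.0; both interaction terms equal zero for the older cohort, so identification of the spillover term comes from the younger cohort only.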

This paper primarily contributes to two branches of literature. First, we provide among the

first causal estimates of the relationship between human capital and participation in agriculture. It

6Data are not available for East Timor.



is well established that cross-country differences in non-agricultural productivity are substantially

smaller than the corresponding differences in agricultural productivity, and that lower income

countries employ a much larger share of their population in agricultural work. Differences in agricultural

productivity and the share of the labor force participating in agriculture are therefore thought to

be key determinants of cross-country differences in aggregate productivity.7 Recent work exploiting

time series variation in productivity and sectoral choice offers support for the idea that shifting

labor out of agriculture can accelerate growth [McCaig and Pavcnik, 2017, McMillan and Rodrik,

2011] and suggests that shifting workers from agricultural to non-agricultural employment may be

an important policy goal in itself. There is less clarity about how best to encourage this transfor-

mation, though a dense body of research on the relationship between educational attainment and

participation in agriculture points towards education as a potentially useful policy lever.

Typically, the papers in this literature rely on either the estimation of fully structural models

of human capital accumulation, sectoral choice, and productivity using cross-sectional data or on

reduced form methods like individual or household fixed effects or instrumental variables strategies

[Gisser, 1965, Huffman, 1980, Jacoby, 1993, Newman and Gertler, 1994, Yang, 1997a,b, Fafchamps

and Quisumbing, 1999, Yunez-Naude and Taylor, 2001, Yang, 2002, Spohr, 2003, Jolliffe, 2003,

Laszlo, 2008, Akresh et al., 2018]. Three existing papers offer potentially causal answers to related

questions. Spohr [2003] uses a reform in the length of guaranteed tuition-free schooling (which

increased from 6 to 9 years) in Taiwan and shows that cohorts exposed to the reform completed

additional schooling and are less likely to be employed in the traditional sector, which includes

agriculture as well as infrastructure-related activities. In contrast, we are able to focus specifically

on movement out of agriculture and we do not rely purely on deviations in the outcomes from a

linear pre-reform trend, as Spohr [2003] is forced to do because the studied reform was country wide.

In two working papers parallel to our own, Akresh et al. [2018] and Porzio and Santangelo [2019]

present estimates from the same DID specification that we rely on to produce the main results of this

paper. Akresh et al. [2018] focus primarily on exploring the inter-generational impact of INPRES on

the children of exposed cohorts, but also show that there are similar impacts of INPRES exposure

7Huffman [1980], Gollin et al. [2002, 2004], Caselli [2005], Caselli and Coleman II [2006], Restuccia et al. [2008],
Gollin [2009], Herrendorf and Teixeira [2011], Lagakos and Waugh [2013], Adamopoulos and Restuccia [2014], Gollin
et al. [2014b,a], Herrendorf and Schoellman [2015]



on non-agricultural employment for men in 2016, forty-three years after the program began. By

using data from 1995 we are able to avoid potential complications related to mapping the division

of districts over time and correctly allocating individuals to their appropriate INPRES exposure:

we observe 289 districts in the 1995 data, whereas by 2016 there were 511 districts in Indonesia. Further,

we show the importance of accounting for cross-district spillovers from the INPRES program when

measuring impacts. Adjusting for these effects in the context of INPRES would likely increase

the magnitude of the inter-generational importance of the program. Porzio and Santangelo [2019]

build a dynamic overlapping generation model of labor supply and production with re-allocation

frictions, and use their model to guide empirical estimates of the contribution of human capital

accumulation to structural change using data from 49 countries. They supplement their main,

country-level specifications which use variation in education across cohorts, with DID and 2SLS

regressions from Indonesia that are identified by district-level differences in INPRES exposure.

The estimates are nearly identical to our results for the impact of INPRES on employment in

agriculture and the impact of an additional year of education on the likelihood of employment in

agriculture. Our focus is on employment outside of agriculture, which combines extensive margin

shifts from out of the labor force or unemployment to employment outside of agriculture as well as shifts

from employment in agriculture to employment outside of agriculture. In addition, we explore the

impacts of educational attainment on the economic sector of labor supply, a critical distinction given

the large differences in value added per worker across sectors within non-agricultural employment.

We contribute to this literature by linking adult male decisions to supply labor outside of agri-

culture to plausibly exogenous variation in human capital accumulation generated by the INPRES

program. The data and empirical strategy enable us to perform a variety of assessments of the iden-

tifying assumptions, the results of which support the interpretation of the main results as estimates

of the causal impact of increases in educational attainment on participation in agriculture. Our

findings thus help answer whether government investment in education can accelerate structural

transformation out of agriculture.

Second, we contribute to the expanding literature investigating whether general equilibrium

(GE) effects contribute to the total impact of large scale programs [Acemoglu, 2010, Duflo, 2004,

Khanna, 2018, Muralidharan et al., 2018]. In the same context, Duflo [2004] presents evidence



that primary school educated males in cohorts too old to have benefited from the INPRES pro-

gram experienced slower wage growth than they otherwise would have. Khanna [2018] specifies a

model of school and labor market supply and demand and uses a regression discontinuity design

(RDD) to estimate the impact of a similar construction program in India. Importantly, the program

assignment rules enable him to estimate the returns to education and to identify GE effects sepa-

rately for treated and untreated cohorts. The results suggest that GE effects depress the returns

to education by 32% and reduce overall labor market benefits by 23%, with the GE effects

concentrated among young (treated) cohorts. Muralidharan et al. [2018] employ random sub-district

level variation in a government public employment program in India to estimate the impacts of the

program on earnings and wages. They find that the program had direct effects on earnings from

program wages, and that GE effects resulted in positive spillover effects on private sector wages.

In contrast with Khanna [2018], we are not able to identify the total GE effects of the program

through the DID empirical strategy. Instead, we estimate the differential GE effect borne by the

cohort young enough to have been impacted by the school building program in their own district

of birth. If, as in Khanna [2018], GE effects are concentrated among similarly aged workers, then

our estimates will capture most of the total GE impacts resulting from the changing distribution

of worker characteristics in neighboring districts. Our findings indicate that, at least in contexts

where administrative divisions are small in area, it can be important to allow for cross-boundary

spillovers in program impacts.

The rest of the paper is organized as follows. Section II describes the Indonesian context, the

INPRES school building program, and the data. Section III presents the empirical strategy. Section

IV presents the results, and Section V concludes.

II Context, Program, and Data

Context

In 1995, Indonesia was in the midst of a period of rapid growth, economic and otherwise. Between

1970 and 1995 GDP per capita grew at a rate of 4.35% per year, life expectancy at birth increased

by over 10 years, and cereal yields–driven by the adoption of hybrid seeds and other technologies



developed during the Green Revolution [Hayami and Ruttan, 1985, Gollin et al., 2014a]–nearly

doubled [World Bank, 2017]. The gains in agricultural productivity were sufficiently large that

Indonesia, which averaged net food and animal product imports of over 650 million USD per year

between 1961 and 1983,8 became a net exporter of food and animal products during the subsequent

decade [FAO, 2017]. Existing cross-country studies on structural transformation and productivity

gaps in low-income countries uniformly emphasize the important role played by subsistence con-

straints in preventing shifts out of agricultural sector employment [Caselli and Coleman II, 2001,

Caselli, 2005, Lagakos and Waugh, 2013, Restuccia et al., 2008, Gollin et al., 2014a]. Relative to

previous decades, the late 1980s and 1990s therefore presented fewer barriers to the movement of

labor out of agriculture in Indonesia.

Consistent with this, calculations using data from the Indonesian intercensal surveys of 1976,

1985, and 1995 show a drastic reduction in the likelihood that men between the ages of 25 and 64

were employed in agriculture over these two decades. Between 1976 and 1995, the share of men

primarily employed in agriculture fell from 0.59 to 0.37 and the share of men primarily employed

outside of agriculture rose from 0.37 to 0.57. While there was a modest increase in male employment

in the service sector (from 0.26 in 1976 to 0.32 in 1995), the bulk of the increase in non-agricultural

labor was absorbed by labor supplied to industry, which experienced a 127.4% increase off of a

base share of 0.08 in 1976. Over the same time period, the share of men aged 25-64 with at least

a primary school education more than doubled from 0.30 in 1976 to 0.68 in 1995.

Table 1: Education and Literacy by Sector of Employment in 1985

Sector of Employment              Share Completed Primary School    Share Literate
Agriculture, Fishing, Forestry    0.32                              0.76
Industry                          0.57                              0.90
Services                          0.71                              0.94
Not working/Unknown               0.53                              0.79

Notes:
Data come from the 1985 Intercensal population survey for Indonesia (SUPAS). Only males between the ages of 25
and 64 are included.

In the 1985 Indonesian intercensal survey–when birth cohorts affected by INPRES were all below

the age of eighteen–education and literacy were strongly associated with the sector of employment

8In 2017 USD.



for men aged 25-64. Table 1 displays the share of men with at least a primary school education and

the share of employed men that were literate by sector. Just 32% of men employed in agriculture

had a primary school education, over 20 percentage points below the level for men that were out of

the labor force or unemployed (53%), 25 percentage points below the figure for men employed in

industry, and 39 percentage points below the level for men employed in the service sector. A similar

pattern emerges for literacy: the literacy rate for men employed in agriculture was 3 percentage

points lower than for men out of the labor force or unemployed, 14 percentage points lower than

for men employed in industry, and 18 percentage points lower than for men employed in the service

sector. While there were undoubtedly multiple factors that contributed to the time trends and to

the cross-sectional association between education and the sectoral allocation of labor, we focus on

whether any of the shift out of agriculture was caused by the increase in education. To do so, we

exploit variation across Indonesian districts and birth cohorts in exposure to a massive primary

school building program.

Program

The Sekolah Dasar INPRES school building program began in 1973 and was part of a large push

for development by the Indonesian government that was funded using money from an oil boom. By

1980, 61,807 schools had been constructed; this amounted to an average of 2 schools being

built per 1,000 primary school-aged children in each district and it nearly doubled the number of

public primary schools in Indonesia from the 1973-level baseline of approximately 67,000. Figure

2 displays the number of INPRES schools built per 1,000 children for all of the districts in our

data using district boundaries as defined in 1995. Program intensity ranged from a minimum of

1.27 schools to a maximum of 8.6 schools per 1,000 primary school-aged children. The number of

schools built each year was as follows: 6,142 in 1973, 6,138 in 1974, 10,159 in 1975, 10,157 in 1976, 15,134 in

1977, and 15,134 in 1978.9 Also evident from the figure is the large dispersion in program intensity,

even across contiguous districts.

By design, the number of schools constructed in each district was proportional to the number

of primary school-aged children not enrolled in school in 1971. Duflo [2001] shows that in general,

9We only have data for these years, although school construction did continue after 1978.



Figure 2: INPRES Schools Built per 1,000 Children by District

[Figure omitted: map of districts shaded by INPRES schools built per 1,000 children. Legend bins: [0,1.5], (1.5,2], (2,2.75], (2.75,9], No data.]

Notes:
Authors’ own calculations using SUPAS data, information from the Ministry of Education and Culture, and presidential
instructions released by the Indonesian planning agency, with geographic information from GADM [GADM Database,
2015]. The map shows the number of schools constructed per 1,000 primary school-aged children in the district in 1971, based on 1995
administrative boundaries.

this rule of thumb was followed, though compliance was not perfect.10 The government also made

10A district-level regression of the natural log of the total number of INPRES schools built in a district on the



complementary investments, including recruiting the necessary teachers to staff the schools, ensur-

ing that all the newly hired teachers received pre-service training, and paying teacher salaries at the

newly constructed schools. This was in sharp contrast to the years prior to 1973, when there was a

freeze in both hiring and capital expenditure in the education sector. Though school construction

and teacher training was the largest component of the INPRES program, the program also included other

sub-programs, including a water and sanitation intervention, which was the second largest in terms

of funding. The program resulted in a large increase in primary school enrollment, from 69 percent

in 1973 to 83 percent just five years later, in 1978. Readers are referred to Duflo [2001] for further

details on the program.
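
The benchmark invoked in footnote 10 can be made explicit. If the allocation rule had been followed exactly, so that the number of schools built was proportional to the number of non-enrolled primary school-aged children, a log-log regression of schools on the number of children and on the non-enrollment rate would return a coefficient of one on each term. The notation below ($S_d$, $N_d$, $e_d$, $\kappa$) is introduced only for this illustration and is not taken from the original program documents.

```latex
% S_d    : INPRES schools built in district d
% N_d    : primary school-aged children in district d
% e_d    : enrollment rate among those children
% \kappa : common proportionality constant (hypothetical notation)
S_d = \kappa \, N_d \, (1 - e_d)
\quad\Longrightarrow\quad
\ln S_d = \ln \kappa + 1 \cdot \ln N_d + 1 \cdot \ln (1 - e_d)
```

The estimates reported in Duflo [2001], 0.78 on the number of children and 0.12 on the non-enrollment rate, both fall below this benchmark of one, which is the sense in which compliance with the rule was imperfect.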

Data

We use two primary sources of data in this paper. First, we identify the timing and intensity

of the school building program for each district from the data employed by Duflo [2001] based

on information she collected from the Ministry of Education and Culture, the 1971 Census of

Indonesia, and presidential instructions released by the Indonesian planning agency. From these

data, we use district level measures of the number of schools constructed between 1973-79, the

number of school-aged children in 1971, the number of children enrolled in school in 1971, and the

allocation of the water and sanitation program.

We merge these data to the 1995 intercensal survey carried out by the Indonesian Central

Bureau of Statistics–the Intercensal population survey (SUPAS)–and made available through IPUMS

[IPUMS, 2018]. We focus on men born between 1950 and 1972, and match all of the men observed

in the SUPAS data to the school construction data from their district of birth. We focus on men

because their labor force participation is nearly universal during this time period. This enables us

to focus on individual decisions about whether to supply labor in agriculture or outside of agricul-

ture, and to abstract away from issues related to marriage and its effect on the extensive margin of

natural log of the total number of primary school age children in that district in 1973 and the natural log of one
minus the enrollment rate for those children suggests that a one percent increase in the number of children was
associated with a 0.78 percent increase in the number of INPRES schools built, while a one percent increase in the
non-enrollment rate was associated with a 0.12 percent increase in the number of INPRES schools built [Duflo, 2001].
Both coefficients are smaller than one, the level of the proposed allocation rule, but still suggest that non-enrollment
and the number of primary school age children were strong predictors of the intensity of the school building campaign.



labor supply of women.11 In addition, Duflo [2001] shows that, on average, INPRES only affected

educational attainment for men. For women, INPRES appears to only have affected schooling for

those born in districts where the practice of bride price is common [Ashraf et al., Forthcoming]. In

addition to the SUPAS data from 1995, we also use earlier SUPAS data from 1976 and 1985–to

characterize historical education, employment, and sectoral allocation patterns in Indonesia–and

Census data from 1971–to determine whether districts were urban or rural in 1971, prior to the

start of the program. Districts are classified as being rural if more than half of the households

interviewed in a district in 1971 were identified as residing in a rural area.
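The classification rule can be written compactly. The sketch below is illustrative only: the function name, district labels, and household counts are hypothetical and are not taken from the 1971 Census files.

```python
def classify_district(rural_households: int, total_households: int) -> str:
    """Label a district 'rural' if more than half of interviewed households were rural."""
    if total_households <= 0:
        raise ValueError("total_households must be positive")
    share_rural = rural_households / total_households
    # Exactly one half is not "more than half", so such a district is urban.
    return "rural" if share_rural > 0.5 else "urban"


# Hypothetical (rural households, total households) interviewed in 1971.
districts_1971 = {
    "district_a": (180, 300),
    "district_b": (150, 300),
    "district_c": (40, 250),
}

labels = {name: classify_district(r, n) for name, (r, n) in districts_1971.items()}
print(labels)
```

Because the rule uses only 1971 data, the resulting labels cannot have been influenced by the program itself.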

The SUPAS data allow us to construct detailed measures of human capital accumulation, eco-

nomic activity, and demographic circumstances for the men in our sample. In keeping with previous

research on INPRES, we focus on three measures of human capital: the total number of years of

school completed, an indicator for whether primary school was completed, and an indicator for

whether individuals are literate. Sector of employment is constructed on the basis of each indi-

vidual’s reported industry for their primary activity during the week preceding the survey. The

disaggregated categories are combined to allocate employment into one of three different sectors by

International Standard Industrial Classification (ISIC) standards: agriculture,12 industry,13 or ser-

vices.14 Annual income and average hourly wages are constructed based on total reported monthly

income during the previous month and the total number of hours worked during the previous

week.15 We also generate an indicator for whether the district of residence does not match the

district of birth–implying that an individual has moved since birth–and an indicator for whether

the district of residence is classified as urban or rural in 1995.

Table 2 displays summary statistics for three groups of men in our sample: 1) main sample

“treated” cohorts born between 1968-1972 who were young enough that they would not have started

primary school before the program began, 2) main sample “control” cohorts born between 1957-

1962 who should have completed primary school prior to the start of the INPRES program (these

11 We feel strongly that this is an important question that future research should carefully address.
12 Forestry, hunting, and cultivation of crops and livestock.
13 Manufacturing, mining, construction, electricity, water, and gas.
14 Wholesale and retail trade, transport, and government, financial, professional, and personal services such as
education, health care, and real estate services.
15 All nominal currency amounts are converted to their 2017 USD values and unemployed individuals with no
reported earnings are coded as missing hourly wage information but as having earned no income.



individuals also serve as the “treated” sample for the placebo tests), and 3) placebo sample “control”

cohorts born between 1950-1956 who are older than the main sample “control” cohorts and would

have finished primary school well before the start of the program. The main sample “treatment”

and “control” cohorts are used in specifications to estimate the impact of INPRES on the outcomes

of interest. The main sample “control” along with the placebo sample “control” cohorts are used

to assess the likelihood that the conditional independence assumption is met in the context of the

DID identification strategy. We discuss both in more detail below.

By construction, the average age is increasing across columns: it is just under 25 for men in

the youngest group, 35 for the main sample “control” group, and nearly 42 in the placebo sample

“control” group. Consistent with the Indonesia-wide trends, primary school completion rates and
educational attainment are inversely related to age. Eighty-nine percent of the youngest group has
at least a primary school education, compared with only 73 percent and 69 percent of the two older
cohorts; average years of schooling are 9.1 for the youngest group, 7.3 for
men born 1957-1962, and 6.8 for those born 1950-1956.

Men in the two older groups are nearly universally employed (97 percent in both groups),
while employment is lower for the youngest group (83 percent). The gradients across

birth cohorts are still present, though notably less steep, for employment outside of agriculture,

wage employment, and employment in services than they are for employment in agriculture. The

youngest group of men is actually more likely to be employed in industry than men from the two

older groups. Hourly wages and annual salary are increasing in age, as are the likelihoods of ever

having migrated and residing in a rural area.



Table 2: Summary Statistics

                                            Main Sample       Main Sample       Placebo Sample
                                            Treated           Control           Control
                                            Born 1968-1972    Born 1957-1962    Born 1950-1956

Age                                         24.88             35.35             41.75
                                            (1.43)            (1.68)            (2.04)
Completed primary school                    0.89              0.73              0.69
                                            (0.32)            (0.44)            (0.46)
Years of schooling                          9.06              7.32              6.81
                                            (3.83)            (4.27)            (4.20)
Literate                                    0.98              0.94              0.93
                                            (0.14)            (0.23)            (0.26)
Household not engaged in any agriculture    0.55              0.55              0.51
                                            (0.50)            (0.50)            (0.50)
Employed                                    0.83              0.97              0.97
                                            (0.37)            (0.17)            (0.16)
Employed in agriculture                     0.28              0.33              0.36
                                            (0.45)            (0.47)            (0.48)
Employed outside agriculture                0.56              0.64              0.61
                                            (0.50)            (0.48)            (0.49)
Employed in Industry                        0.21              0.20              0.18
                                            (0.41)            (0.40)            (0.38)
Employed in Services                        0.28              0.38              0.37
                                            (0.45)            (0.49)            (0.48)
Wage Employed                               0.42              0.44              0.40
                                            (0.49)            (0.50)            (0.49)
Self-employed                               0.31              0.51              0.56
                                            (0.46)            (0.50)            (0.50)
Unpaid work                                 0.10              0.02              0.01
                                            (0.30)            (0.14)            (0.12)
Unpaid work in agriculture                  0.08              0.01              0.01
                                            (0.27)            (0.12)            (0.10)
Annual salary (2017 USD)                    828.56            1362.24           1461.76
                                            (792.33)          (903.24)          (977.74)
Log Hourly wage (2017 USD)                  -0.89             -0.65             -0.58
                                            (0.61)            (0.67)            (0.68)
Migrated since birth                        0.26              0.30              0.30
                                            (0.44)            (0.46)            (0.46)
Rural resident                              0.53              0.58              0.59
                                            (0.50)            (0.49)            (0.49)

Observations                                28,699            29,706            28,695

Notes:
Data comprise individuals included in the empirical sample, and thus are restricted to males only. Column 1 includes
all men born between 1968 and 1972 (the treated sample in the “Experiment of Interest”). Column 2 includes all
men born between 1957 and 1962 (the control sample, or unexposed cohorts, in the “Experiment of Interest”, who
are also the treated sample for the “Placebo Experiment”). Column 3 includes all men born between 1950 and 1956
(the control group for the “Placebo Experiment”).



III Empirical Strategy

To identify the impact of the INPRES program on educational attainment and employment in

agriculture, we rely on a difference in differences (DID) empirical strategy. We estimate treatment

effects by comparing the difference in outcomes between cohorts of men that were born late enough

that they should have started primary school after school construction began in 1974 and were thus

exposed to the program, and cohorts that were old enough that they should have completed primary

school before the program began and were therefore not exposed to the program. This cross-cohort

difference in outcomes is then differenced across districts that experienced a higher intensity of

school construction and districts that experienced a lower intensity of school construction. As

Indonesian children attend primary school between the ages of 7 and 12, we follow Duflo [2001] and

classify children born between 1968 and 1972 (at most age 6 in 1974) as young enough to have been

affected by the program and those born between 1957 and 1962 (at least age 12 in 1974) as being

too old to have been affected by the program. In the primary specifications children born between

1963 and 1967, who would have been partially exposed to the program, are excluded. Duflo [2001]

shows that delayed enrollment and repetition are negligible.
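The cohort definitions above can be summarized in a few lines. This is a minimal sketch: the function names and group labels are our own shorthand for illustration, while the birth-year cutoffs are those stated in the text.

```python
def age_in_1974(birth_year: int) -> int:
    """Age reached in 1974, the year school construction began."""
    return 1974 - birth_year


def cohort_group(birth_year: int) -> str:
    """Assign a birth year to the cohort groups used in the analysis."""
    if 1968 <= birth_year <= 1972:
        return "treated"            # at most age 6 in 1974
    if 1963 <= birth_year <= 1967:
        return "partially exposed"  # excluded from the primary specifications
    if 1957 <= birth_year <= 1962:
        return "control"            # at least age 12 in 1974
    if 1950 <= birth_year <= 1956:
        return "placebo control"    # used only in the placebo tests
    return "out of sample"


for year in (1950, 1957, 1965, 1968, 1972):
    print(year, age_in_1974(year), cohort_group(year))
```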

We use the number of schools built per 1,000 children in the district of birth by 1979 as a

continuous measure of program intensity. Importantly, we use district of birth rather than district

of residence to match children to the INPRES school building intensity measures. This ensures

that non-random sorting of individuals across districts in response to the school building campaign

will not contaminate the estimates.

To calculate the DID treatment effects in a more parsimonious regression framework, we esti-

mate the following equation:

\[
Y_{idt} = \beta_1 T_t \cdot S_d + \alpha_d + \alpha_t + \sum_k X_d' I_t^k \Gamma_k + \varepsilon_{idt} \tag{1}
\]

where Y is the outcome of interest for individual i, born in district d, in year t. Tt is a dummy

variable equal to one for those born between 1968-1972 (i.e. those aged 2-6 in 1974 who were

potentially affected by INPRES) and zero for those born 1957-1962 (aged 12 to 17 in 1974). Sd is

the number of INPRES schools built per 1,000 children in district d. αd are district of birth fixed
effects and αt are year of birth fixed effects. $\sum_k X_d' I_t^k \Gamma_k$ denotes year of birth fixed effects interacted

with a series of district-level control variables: the primary school-aged population in 1971 (ages

5-14) and the enrollment rate in the district of birth in 1971 (i.e. the variables that formed the basis

for the number of schools that were to be built in each district). We also control for the interaction

between birth year fixed effects and the allocation of a water and sanitation program in the district

of birth–the second largest component of the INPRES program–to account for the possibility that

other INPRES programs could also have affected education and sectoral allocation outcomes for

the sample. Standard errors are clustered at the level of the district of birth. In what follows, we

follow a convention used in Duflo [2001] and refer to this as the “Experiment of Interest.”
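To fix ideas, the sketch below computes the simplest version of this comparison under two simplifying assumptions that are ours, not the paper's: districts are split into a binary high/low intensity group, and no controls are included. In that special case the interaction coefficient equals a double difference in cell means. The records are invented for illustration; equation (1) itself uses the continuous intensity Sd together with fixed effects and controls.

```python
from statistics import mean

# Hypothetical records: (treated_cohort, high_intensity_district, outcome).
records = [
    (0, 0, 0.60), (0, 0, 0.62),   # older cohort, low-intensity districts
    (1, 0, 0.64), (1, 0, 0.66),   # younger cohort, low-intensity districts
    (0, 1, 0.50), (0, 1, 0.52),   # older cohort, high-intensity districts
    (1, 1, 0.60), (1, 1, 0.62),   # younger cohort, high-intensity districts
]


def cell_mean(rows, treated, high):
    """Mean outcome within one cohort-by-intensity cell."""
    return mean(y for t, s, y in rows if t == treated and s == high)


def did(rows):
    """Cross-cohort gap in high-intensity districts minus the gap in low-intensity districts."""
    gap_high = cell_mean(rows, 1, 1) - cell_mean(rows, 0, 1)
    gap_low = cell_mean(rows, 1, 0) - cell_mean(rows, 0, 0)
    return gap_high - gap_low


estimate = did(records)
print(round(estimate, 4))
```

In this toy example the younger cohort gains 0.10 in high-intensity districts and 0.04 in low-intensity districts, so the double difference is 0.06.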

The coefficient of interest is β1, which captures the impact on the outcome of an additional

school built per 1,000 primary school-aged children. To interpret β1 as causal it must be true

that, in the absence of the program, the differences in outcomes between men in the older and

younger cohorts in districts with higher program exposure and lower program exposure would have

followed parallel paths. To assess this parallel trends assumption, we estimate specifications that

are analogous to (1), but we replace the treated cohorts from (1) with the control cohorts that were

12-17 in 1974 (born 1957-1962) and use men who were 18-24 in 1974 (born 1950-1956) as placebo

control cohorts. This “Placebo Experiment” explores whether outcomes in high intensity and low

intensity districts were on similar paths prior to the start of the INPRES program. A failure to

reject the null hypothesis that the outcomes in high and low school building intensity districts were

on similar paths prior to the start of the INPRES program offers support for the DID identifying

assumptions.

Following a strategy employed in Duflo [2001] we estimate a less parametric variant of equation

(1) that replaces the school building intensity interacted with the indicator for whether individuals

were born late enough to have been exposed to the program (Tt · Sd) with a series of interaction

terms between the school building intensity in the district and a full set of indicators for whether the

individual was born in one of two contiguous years. We include the “partially treated” individuals

born between 1963-1967 who were omitted when estimating (1), resulting in 11 different birth

cohort groups.16 Adjacent birth years are pooled together to improve the precision of the estimates

16 The cohort groups are men born in 1950-52, 1953-54, 1955-56, 1957-58, 1959-60, 1961-62, 1963-64, 1965-66,



and the controls are identical to those included in equation (1).

To provide a more direct answer to how educational attainment affects individuals’ decisions

to participate in agriculture we run two-stage least squares (2SLS) specifications that instrument

for educational attainment (years of school completed) with the main DID treatment term: Tt ·Sd.

These specifications provide a causal estimate of the impact of an additional year of schooling on the

sectoral allocation of male employment if, in addition to satisfying the parallel trends assumption,

INPRES exposure only impacted sectoral allocations of individuals by changing their educational

attainment.17
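With a single instrument and identical samples and controls, the 2SLS coefficient equals the ratio of the reduced-form coefficient to the first-stage coefficient (the Wald ratio). The sketch below applies that identity to the rounded Panel A coefficients reported in Table 3. It is a back-of-the-envelope illustration with a helper name of our own choosing, not a substitute for the estimated 2SLS results, which may differ because of rounding.

```python
def wald_ratio(reduced_form: float, first_stage: float) -> float:
    """Just-identified IV estimate: reduced form divided by first stage."""
    if first_stage == 0:
        raise ZeroDivisionError("first stage is zero; the instrument is irrelevant")
    return reduced_form / first_stage


# Rounded coefficients on Tt * Sd from Panel A of Table 3.
first_stage = 0.1213    # outcome: years of schooling, column (2)
reduced_form = 0.0186   # outcome: employed outside agriculture, column (6)

effect_per_year = wald_ratio(reduced_form, first_stage)
print(round(effect_per_year, 3))
```

Under the exclusion restriction discussed above, this ratio is read as the change in the probability of non-agricultural employment per additional year of schooling, roughly 0.15 with these rounded inputs.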

In addition to the results based on estimates from equation (1), we implement several variations

of the main specification to provide evidence on heterogeneity in the treatment effects along two

key dimensions. First, to explore whether INPRES may have affected labor market outcomes in

adjacent districts, we include a term measuring the average school building intensity in neighboring

districts. In the same way that the own-district INPRES intensity is interacted with an indicator

for whether individuals were young enough to have been impacted by the program, we interact

the average school building intensity in neighboring districts with an indicator for whether the

individual was born between 1968 and 1972 (Tt):

\[
Y_{idt} = \beta_1 T_t \cdot S_d + \beta_2 T_t \cdot N_d + \alpha_d + \alpha_t + \sum_k X_d' I_t^k \Gamma_k + \varepsilon_{idt} \tag{2}
\]

where Nd is the average number of schools built per 1,000 primary school-aged children in the

adjacent districts (those with contiguous borders) to the district of birth. β2 captures the DID

impact of the INPRES program intensity in neighboring districts. The distribution of average

1967-68, 1969-70, 1971-72.
17 We discuss whether this assumption is reasonable and the likely biases if the assumption is not satisfied in
Section IV. De Chaisemartin and D’Haultfoeuille [2018] show that in order for the 2SLS (Wald-DID) estimator to
provide an unbiased estimate of the local average treatment effect (LATE) for individuals induced to complete more
schooling by INPRES in a “Fuzzy DID” context, several other assumptions are needed. First, individuals must
respond monotonically to the increase in primary school availability. Though not testable, this seems likely to be
satisfied in our context, particularly given that INPRES exerted considerable effort to maintain student-teacher ratios
by fully funding teacher salaries in the program’s early years. Potentially more concerning, the effects of schooling
must be stable over time and either schooling impacts must be homogeneous for individuals in old and young cohorts
or educational attainment must be stable over time in districts with a low level of INPRES exposure. We show
results based on two alternative estimands proposed in De Chaisemartin and D’Haultfoeuille [2018], the Wald-DID
with an alternative control group defined to ensure stable treatment levels over time and the time corrected Wald
ratio (Wald-TC), which should identify LATE even in the absence of time or group invariant treatment effects, in the
appendix.



INPRES intensity in neighboring districts closely tracks the distribution of INPRES intensity more
broadly, with 1.29 schools per 1,000 children at the 10th percentile, 1.87 at the 50th percentile,
and 3.1 at the 95th percentile. We include

the same set of control variables as in (1), and standard errors are again clustered at the district

of birth level. We note that as with all time-invariant district-level characteristics, any impact of

the number of neighboring districts and the average INPRES intensity in neighboring districts (not

interacted with the treatment cohort indicator Tt) will be absorbed by the district of birth fixed

effects.
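The neighbor measure Nd is a simple average of own-district intensity over contiguous districts. A minimal sketch, assuming a hypothetical adjacency list and invented intensity values (the district labels and the function name are ours):

```python
from statistics import mean


def neighbor_intensity(intensity, neighbors):
    """Average school building intensity across each district's contiguous neighbors."""
    return {
        district: mean(intensity[n] for n in adjacent)
        for district, adjacent in neighbors.items()
        if adjacent  # districts without neighbors have no defined average
    }


# Hypothetical schools built per 1,000 children, by district of birth.
intensity = {"a": 1.0, "b": 2.0, "c": 3.0}
# Hypothetical contiguity: district "a" borders both "b" and "c".
neighbors = {"a": ["b", "c"], "b": ["a"], "c": ["a"]}

n_d = neighbor_intensity(intensity, neighbors)
print(n_d)
```

Because Nd does not vary over time within a district, only its interaction with the treated-cohort indicator is identified once district of birth fixed effects are included.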

When estimating (2), we focus on the primary outcomes–sectoral allocation (employed outside

of agriculture, employed in industry, employed in services)–as well as on the hourly wages and

annual salaries of men. If cross-district spillovers contribute importantly to the total treatment

effects of the program, we would expect to observe β2 coefficients that are of the opposite sign of

the β1 terms; for example, if INPRES exposure induced more employment outside of agriculture

by increasing the relative expected return to non-agricultural employment, then we would estimate

β1 > 0 for both the non-agricultural employment and the hourly wage equations. If, as seems

likely in the Indonesian context, workers can supply labor across district boundaries, then a larger

pool of educated workers driven by higher INPRES exposure in neighboring districts could reduce

the market clearing wage for educated workers outside of agriculture. This would then attenuate

the impact of the program on employment outside of agriculture as the (previously) marginal

non-agricultural worker now prefers supplying labor in agriculture. In addition to running (2)

for the main outcomes and hourly wages, we show estimates with primary school completion and

educational attainment as outcomes. We view these as placebo outcome tests for β2 as children

in Indonesia almost universally attend primary school in their district of residence18 so that an

increase in primary school supply in neighboring districts should not impact contemporaneous

schooling outcomes.

Finally, we estimate specifications that enable us to comment on whether the program had

differential effects in urban and rural areas. To classify districts as rural or urban we use Indonesian

18 97.8% of children at least six years of age who ever attended primary school in the first wave of the Indonesian
Family Life Survey (IFLS1) did so in their district of birth [Frankenberg and Karoly, 1995].



Census data from 1971; districts are designated as being rural if over half of the households in that

district were identified as residing in a rural area in 1971 and urban otherwise.19 We intentionally

identify districts as urban or rural using data from before INPRES began to avoid the possibility

that the urban/rural status of districts may have been affected by the program. While Duflo

[2004] shows that the education impacts are larger in rural areas, to our knowledge no existing

work explores whether the sectoral allocation patterns are different for urban and rural households.

Nearly 15% of the men in our main sample who are classified as residing in an urban district report

that their primary economic activity is in agriculture, suggesting that there is scope for impacts on

participation in agriculture even in urban districts.

We interact the main DID treatment variable (Tt · Sd) with the dummy variable for whether

the district of birth was rural:

\[
Y_{idt} = \beta_1 T_t \cdot S_d + \beta_2 R_d \cdot T_t \cdot S_d + \alpha_d + \alpha_t + \sum_k X_d' I_t^k \Gamma_k + \varepsilon_{idt} \tag{3}
\]

where Rd is a dummy variable for whether the district of birth was rural in 1971. β1 gives

the impact of INPRES exposure for men born in urban districts while β1 + β2 gives the analogous

impact for men born in rural districts. We include the same controls as in (1) and (2) and again

cluster standard errors at the district of birth level.

IV Results

Main Results

We begin by presenting the results from estimating equation (1) for the human capital and
employment outcomes: educational attainment, primary school completion, literacy, employment,
agricultural employment, and non-agricultural employment. Panel A of Table 3 shows the estimates
and standard errors corresponding to β1 in (1) for the “Experiment of Interest”.

19 Given that district boundaries changed between 1971 and 1995, we re-code the 1971 districts to match the 1995
codes. This exercise was conducted manually, and in some cases, historical sources were consulted to determine
whether districts had split or whether other events had occurred. Ultimately, 98% of the individuals in our main
sample are successfully matched to an urban/rural classification from the 1971 Census data. This indicates that
selection into the sample used to explore heterogeneity by the urban/rural status of the district of birth is unlikely
to importantly affect the results.



Table 3: Education and Employment in Agriculture

                              Completed    Years of     Literate     Employed     Employed in    Employed outside
                              Primary      Schooling                              Agriculture    Agriculture
                              School
                              (1)          (2)          (3)          (4)          (5)            (6)

Panel A: Experiment of Interest
Born 1968-1972*No. INPRES     0.0194**     0.1213*      0.0075**     0.0098       -0.0088        0.0186**
schools per 1,000 children    (0.0078)     (0.0660)     (0.0037)     (0.0069)     (0.0057)       (0.0088)
Observations                  58,405       58,405       58,405       58,405       58,405         58,405
Control group mean            0.73         7.32         0.94         0.97         0.33           0.64

Panel B: Placebo Experiment
Born 1957-1962*No. INPRES     -0.0011      -0.0076      0.0024       0.0011       0.0036         -0.0025
schools per 1,000 children    (0.0057)     (0.0421)     (0.0035)     (0.0020)     (0.0046)       (0.0047)
Observations                  58,401       58,401       58,401       58,401       58,401         58,401
Control group mean            0.69         6.81         0.93         0.97         0.36           0.61

Notes:
Panel A: The sample for the “Experiment of Interest” includes men aged 2-6 in 1974 (“treated” cohorts) and men aged 12-17 in 1974 (“control” cohorts). Panel
B: The sample for the “Placebo Experiment” includes men aged 12-17 in 1974 (“treated” cohorts) and men aged 18-24 in 1974 (“control” cohorts). All regressions
include year of birth fixed effects, district of birth fixed effects, year of birth interacted with the primary school-aged population in the district of birth in 1971, year
of birth interacted with the enrollment rate among primary school-aged children in the district of birth in 1971, and year of birth interacted with the allocation
of the water and sanitation program in the district of birth. Standard errors are reported in parentheses and are clustered at the level of the district of birth. ***
p<0.01, ** p<0.05, * p<0.1.



The coefficients in columns (1) and (2) suggest that each additional school built per 1,000
school-aged children increases the likelihood of completing primary school by 1.94 percentage points
(p-value < 0.05) and increases the number of completed years of schooling by 0.12 years (p-value 0.06).
Column (3) shows that INPRES exposure also led to an increase in the likelihood of being literate
of 0.75 percentage points (p-value < 0.05). These results confirm the findings in Duflo [2001],
Ashraf et al. [Forthcoming], and Akresh et al. [2018]: exposure to the INPRES program substantially
increased human capital among treated cohorts of men. The literacy impacts are notable as they
indicate that the program improved skills in addition to simply increasing the time
spent in school. This is an important distinction, as skills may be the more relevant determinants of
employment and growth [Hanushek and Woessmann, 2011]. Clearly, the Indonesian government’s
investments in education through the INPRES program succeeded in improving human capital
outcomes.

Columns (4) to (6) display the analogous estimates for the likelihood of being employed, em-

ployed in agriculture, and employed out of agriculture. Each additional school built per 1,000

children increases the likelihood that men are employed in 1995 by 0.98 percentage points and de-

creases the likelihood that they are employed in the agricultural sector by 0.88 percentage points,

though neither estimate is statistically significant at conventional levels (p-values of 0.16 and 0.12,

respectively). Additional exposure to the INPRES program, however, importantly increased the

likelihood of male employment outside of the agricultural sector: the point estimate indicates that

each additional school built per 1,000 primary school-aged children in the district of birth increased

the likelihood of non-agricultural employment by 1.86 percentage points (p-value 0.04). Given that,

on average, two schools were constructed per 1,000 primary school-aged children, this translates

to an increase in non-agricultural employment that is roughly 6% of the mean non-agricultural

employment rate for the unaffected older cohort (those born between 1957-1962).
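The "roughly 6%" figure follows directly from the numbers reported in Table 3. The short calculation below reproduces it; the variable names are ours, and the inputs are the rounded values quoted in the text.

```python
coef_per_school = 0.0186   # Table 3, Panel A, column (6)
avg_schools = 2.0          # average schools built per 1,000 children (approximate)
control_mean = 0.64        # non-agricultural employment rate, men born 1957-1962

# Effect of the average program intensity, in percentage points.
effect_pp = coef_per_school * avg_schools * 100
# The same effect expressed relative to the control group mean.
relative_effect = coef_per_school * avg_schools / control_mean

print(f"{effect_pp:.2f} percentage points, {relative_effect:.1%} of the control mean")
```

The average intensity therefore implies an increase of about 3.7 percentage points, or about 5.8 percent of the control mean.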

Turning to the results of the “Placebo Experiment” displayed in Panel B of Table 3, the point
estimates are all extremely small in magnitude (typically around one-tenth of the size of the
corresponding estimates in Panel A) and not consistently of the same sign as the main estimates, and the
p-values are never smaller than 0.48 for the human capital outcomes or 0.43 for the employment
outcomes. The lack of meaningful associations between future levels of INPRES exposure and
lagged outcomes shown in Panel B supports the idea that, in the absence of the INPRES
program, outcomes would have evolved along similar paths.
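The placebo p-values can be approximated from the reported coefficients and standard errors. The sketch below uses a two-sided normal approximation, which is an assumption on our part: the paper's p-values come from cluster-robust inference on unrounded estimates, so small discrepancies are expected.

```python
import math


def two_sided_p(coef: float, se: float) -> float:
    """Two-sided p-value for H0: coefficient = 0 under a normal approximation."""
    z = abs(coef / se)
    return math.erfc(z / math.sqrt(2))


# Panel B of Table 3: (coefficient, standard error) by outcome.
human_capital = {
    "completed_primary": (-0.0011, 0.0057),
    "years_of_schooling": (-0.0076, 0.0421),
    "literate": (0.0024, 0.0035),
}
employment = {
    "employed": (0.0011, 0.0020),
    "employed_in_agriculture": (0.0036, 0.0046),
    "employed_outside_agriculture": (-0.0025, 0.0047),
}

min_p_human_capital = min(two_sided_p(b, se) for b, se in human_capital.values())
min_p_employment = min(two_sided_p(b, se) for b, se in employment.values())
print(round(min_p_human_capital, 2), round(min_p_employment, 2))
```

The smallest approximate p-values are close to the 0.48 and 0.43 thresholds quoted above, for the literacy and agricultural employment outcomes respectively.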

We next explore the relationship between exposure to the INPRES program in childhood and

the likelihood of employment in industry, in services, the likelihood of being employed for a wage,

and the likelihood of being self-employed. Table 4 shows the results of estimating equation (1)

with this second group of employment outcomes. Employment in both industry and the service

sector increases with exposure to INPRES, with each school built per 1,000 children in the district

of birth resulting in an increase of 1.14 percentage points in the likelihood of being employed in

industry (significant at the 5% level) and a (statistically insignificant) increase in the likelihood of

employment in the service sector of 0.76 percentage points. Indonesia-wide data from the World
Development Indicators [World Bank, 2017] hint that these sectoral shifts may have increased
economy-wide productivity. Mean value added per worker (in 2010 USD) was 4,482.79 in the
service sector, 13,342.69 in industry (implying a mean value added per worker of 7,308.82 outside
of agriculture), and 1,775.88 in agriculture in the year 2000. While these national-level averages are
certainly not unbiased estimates of the expected value added for workers on the margin of supplying
labor to agriculture or outside of agriculture, the gap is sufficiently large that it
seems likely that the INPRES-induced shift in economic sector resulted in important productivity gains.

Consistent with results in Duflo [2001] and the observed movement across sectors (away from

agriculture and towards more formal activity in industry and services), we also find a large effect

of INPRES on the likelihood that men report their primary activity as being some type of wage

employment: an additional school built per 1,000 children increases the likelihood of being employed

for a wage by 1.55 percentage points (p-value: 0.01). Self-employment also increases, though

the point estimate is not statistically significantly different from zero at conventional levels (p-

value: 0.11). The evidence presented in Table 4 therefore reinforces the idea that the increases in

education induced by INPRES led men to select into more formal labor supply arrangements (wage

employment) in more productive economic sectors.

As was the case with the human capital and more coarse employment outcomes shown in Table
3, Panel B again finds no important relationships between the INPRES exposure intensity and
the outcomes in Table 4 for the birth cohorts too old to have benefited from the primary school
construction.

Table 4: Sector of Employment, Wage- and Self-Employment

                              Employed       Employed       Wage          Self-
                              in Industry    in Services    Employed      Employed
                              (1)            (2)            (3)           (4)

Panel A: Experiment of Interest
Born 1968-1972*No. INPRES     0.0114**       0.0076         0.0155**      0.0086
schools per 1,000 children    (0.0055)       (0.0064)       (0.0060)      (0.0053)
Observations                  58,405         58,405         58,405        58,405
Control group mean            0.20           0.38           0.44          0.51

Panel B: Placebo Experiment
Born 1957-1962*No. INPRES     -0.0043        -0.0014        -0.0009       0.0028
schools per 1,000 children    (0.0055)       (0.0061)       (0.0050)      (0.0046)
Observations                  58,401         58,401         58,401        58,401
Control group mean            0.18           0.37           0.40          0.56

Notes:
Panel A: The sample for the “Experiment of Interest” includes men aged 2-6 in 1974 (“treated” cohorts) and men
aged 12-17 in 1974 (“control” cohorts). Panel B: The sample for the “Placebo Experiment” includes men aged 12-17
in 1974 (“treated” cohorts) and men aged 18-24 in 1974 (“control” cohorts). All regressions include year of birth
fixed effects, district of birth fixed effects, year of birth interacted with the primary school-aged population in the
district of birth in 1971, year of birth interacted with the enrollment rate among primary school-aged children in the
district of birth in 1971, and year of birth interacted with the allocation of the water and sanitation program in the
district of birth. Standard errors are reported in parentheses and are clustered at the level of the district of birth.
*** p<0.01, ** p<0.05, * p<0.1.

One potential mechanism through which INPRES-induced increases in human capital could
influence individual sectoral allocation is by changing the probability of migration. There is
important heterogeneity across districts in the share of men born between 1957 and 1972 that are
employed outside of agriculture (0.32 at the 10th percentile, 0.86 at the 90th percentile), employed
in agriculture (0.03 at the 10th percentile, 0.65 at the 90th percentile), and wage employed
(0.22 at the 10th percentile, 0.62 at the 90th percentile). Though this variation reflects differences
in both labor supply and demand, workers on the margin of supplying labor to agriculture may
make different decisions when they reside in a district where just 32 percent of similarly aged peers
are engaged in non-agricultural employment than they would in a district where 86 percent are employed
outside of agriculture. Table 5 explores this mechanism using an indicator for whether individuals
have migrated to a new district since birth, an indicator for whether they migrated within the five
years preceding the survey, and an indicator for whether they currently reside in a rural location.



Table 5: Migration

                              Migrant since birth    Migrant past five years    Rural location
                              (1)                    (2)                        (3)

Panel A: Experiment of Interest
Born 1968-1972*No. INPRES     -0.0049                0.0018                     -0.0002
schools per 1,000 children    (0.0044)               (0.0053)                   (0.0053)
Observations                  58,405                 58,404                     58,405
Control group mean            0.26                   0.06                       0.61

Panel B: Placebo Experiment
Born 1957-1962*No. INPRES     0.0008                 0.0035                     0.0017
schools per 1,000 children    (0.0048)               (0.0027)                   (0.0049)
Observations                  58,401                 58,400                     58,401
Control group mean            0.26                   0.04                       0.63

Notes:
Panel A: The sample for the “Experiment of Interest” includes men aged 2-6 in 1974 (“treated” cohorts) and men
aged 12-17 in 1974 (“control” cohorts). Panel B: The sample for the “Placebo Experiment” includes men aged 12-17
in 1974 (“treated” cohorts) and men aged 18-24 in 1974 (“control” cohorts). All regressions include year of birth
fixed effects, district of birth fixed effects, year of birth interacted with the primary school-aged population in the
district of birth in 1971, year of birth interacted with the enrollment rate among primary school-aged children in the
district of birth in 1971, and year of birth interacted with the allocation of the water and sanitation program in the
district of birth. Standard errors are reported in parentheses and are clustered at the level of the district of birth.
*** p<0.01, ** p<0.05, * p<0.1.

Twenty-six percent of the men in the cohort too old to have attended INPRES-constructed

primary schools have migrated to a new district since birth, 6 percent have migrated across district

boundaries in the preceding five years, and 61 percent currently reside in a rural area; migration

is therefore reasonably common over longer durations and a majority of our sample still reside in

rural areas. However, the DID estimates indicate that exposure to INPRES had no impact on the

likelihood of migration. The point estimates are all precisely estimated zeros with 95% confidence

intervals ruling out increases in the likelihood of migration greater than 0.4 percentage points since

birth and increases greater than 1.2 percentage points in the past five years. The impact on rural

residence is similarly statistically insignificant and close to zero, with a 95% confidence interval

that includes effects ranging from a 1 percentage point decrease to a 1 percentage point increase

in the likelihood of rural residence in 1995. Together the results in Table 5 indicate that INPRES

did little to physically shift workers in early adulthood despite the substantial impacts observed on

sectoral changes in labor supply. Further, this suggests that the workers that INPRES induced to

complete additional schooling were able to find employment outside of agriculture largely without

moving away from their district of birth.20 Panel B again finds no meaningfully sized pre-trends

for any of the three outcomes in Table 5.

Finally, Table 6 shows the results of estimating equation (1) with the natural log of the hourly

wage and annualized income, both measured in 2017 USD. Each additional school built per 1,000

primary school age children increases the hourly wage (among wage earners) by roughly 1.4 percent

and increases annual income by nearly 54 USD, a 4 percent increase relative to the control group mean.

The p-values for the log hourly wage and annual income are 0.17 and less than 0.01, respectively,

and the hourly wage estimate is nearly identical to the result in Duflo [2001], though the standard errors are

larger. The stronger association with annual income is due to two factors. First, the association

between INPRES exposure and weekly labor supply in hours is negative, albeit small in magnitude

and not statistically significantly different from zero. Second, INPRES has a positive and statisti-

cally insignificant impact on employment, implying there are fewer “0 earners” in districts where

there was a greater intensity of primary school building; both contribute to a stronger relationship

between INPRES exposure and annual income. Panel B again finds no statistically significant or

meaningfully large pre-trends for the outcomes.
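
As a rough check on the magnitudes above, the following sketch converts the Table 6, Panel A coefficients into percent effects. It is an illustration only and not part of the original analysis: the variable names are ours, and the conversion of the log coefficient through exp(b) - 1 is a standard approximation that we assume applies here.

```python
import math

# Values copied from Table 6, Panel A.
income_effect_usd = 53.7903   # annual income effect per extra school per 1,000 children
control_mean_usd = 1362.24    # control group mean annual income (2017 USD)
log_wage_coef = 0.0143        # coefficient for the log hourly wage

# Income effect expressed relative to the control group mean.
income_effect_pct = 100 * income_effect_usd / control_mean_usd

# A coefficient b on a log outcome implies a proportional change of exp(b) - 1.
wage_effect_pct = 100 * (math.exp(log_wage_coef) - 1)

print(f"annual income: {income_effect_pct:.2f} percent")
print(f"hourly wage:   {wage_effect_pct:.2f} percent")
```

For coefficients this small, exp(b) - 1 and b are nearly identical, which is why the log coefficient can be read directly as a percent change.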

Figures A1 and A2 plot the coefficient estimates and 90% confidence intervals from running the

more flexible version of equation (1) that replaces the interaction between the number of INPRES

schools built in the district of birth per 1,000 primary age children and the indicator for whether

men were born between 1968-1972 ($T_t \cdot S_d$) with a full set of interactions between birth year group

(with adjacent years combined to improve statistical power) and the number of INPRES schools built per

1,000 children in the district of birth ($\sum_{t=1}^{10} 1\{T_i = t\} \cdot S_d$).21 These specifications include the “partially treated” cohorts born between

1963-1967 and all of the figures denote the last cohort without any partially treated men (labeled

Age 13 in the figures) and the first cohort with one year of fully treated men (labeled Age 7 in the

figures) with dotted vertical lines.
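
To make the flexible specification concrete, the sketch below builds the ten cohort-group interaction regressors for one individual. It is a minimal illustration under our own assumptions: the function names are hypothetical, the cohort categories are taken from footnote 21, and the 1950-52 group is treated as the omitted reference category because the estimates are interpreted relative to men born 1950-1952.

```python
# Birth-cohort categories from footnote 21 (first and last birth year, inclusive).
COHORT_GROUPS = [
    (1950, 1952), (1953, 1954), (1955, 1956), (1957, 1958), (1959, 1960),
    (1961, 1962), (1963, 1964), (1965, 1966), (1967, 1968), (1969, 1970),
    (1971, 1972),
]


def cohort_group(birth_year):
    """Return the cohort category index (0 is the omitted 1950-52 group)."""
    for index, (first, last) in enumerate(COHORT_GROUPS):
        if first <= birth_year <= last:
            return index
    raise ValueError(f"birth year {birth_year} is outside 1950-1972")


def interaction_terms(birth_year, schools_per_1000):
    """Build the regressors 1{T_i = t} * S_d for t = 1, ..., 10."""
    group = cohort_group(birth_year)
    return [schools_per_1000 if group == t else 0.0 for t in range(1, 11)]


# A man born in 1970 in a district with 2 schools per 1,000 children
# (a made-up intensity) loads only on the 1969-70 interaction.
print(interaction_terms(1970, 2.0))
```

Each man contributes a non-zero value to at most one interaction, so each coefficient traces the effect of school construction intensity for one cohort group relative to the omitted group.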

For all eight outcomes (primary school completion, years of schooling, employed, wage-employed,

20We also fail to reject the null of no difference in the DID treatment effects for individuals who have migrated
since birth and those that have not migrated since birth. We encourage caution in interpreting this result given the
additional selection-related complications generated by conditioning these specifications on an outcome (migration)
that is strongly associated with non-agricultural employment.

21Where 1 {·} denotes the indicator function and t takes values based on birth cohort categories: 1950-52, 1953-54,
1955-56, 1957-58, 1959-60, 1961-62, 1963-64, 1965-66, 1967-68, 1969-70, 1971-72.

Table 6: Wages and Income

                              Annual Salary     Log Hourly Wage
                              (2017 USD)        (2017 USD)
                              (1)               (2)

Panel A: Experiment of Interest
Born 1968-1972*No. INPRES     53.7903***        0.0143
schools per 1,000 children    (17.7133)         (0.0134)
Observations                  30,693            24,861
Control group mean            1362.24           -0.65

Panel B: Placebo Experiment
Born 1957-1962*No. INPRES     6.8846            0.0017
schools per 1,000 children    (13.7066)         (0.0103)
Observations                  25,939            24,072
Control group mean            1461.76           -0.58

Notes:
Panel A: The sample for the “Experiment of Interest” includes men aged 2-6 in 1974 (“treated” cohorts) and men
aged 12-17 in 1974 (“control” cohorts). Panel B: The sample for the “Placebo Experiment” includes men aged 12-17
in 1974 (“treated” cohorts) and men aged 18-24 in 1974 (“control” cohorts). All regressions include year of birth
fixed effects, district of birth fixed effects, year of birth interacted with the primary school-aged population in the
district of birth in 1971, year of birth interacted with the enrollment rate among primary school-aged children in the
district of birth in 1971, and year of birth interacted with the allocation of the water and sanitation program in the
district of birth. Standard errors are reported in parentheses and are clustered at the level of the district of birth.
*** p<0.01, ** p<0.05, * p<0.1.

employed out of agriculture, employed in industry, employed in services, and annual income) the

point estimates for untreated birth cohorts are flat and the confidence intervals include zero. This

reinforces the Panel B results from Tables 3-6 which suggested there was no evidence of non-

parallel trends in the outcomes between the two older groups of birth cohorts included in the

“Placebo Experiment.” While even the results grouping together adjacent birth cohorts are some-

what imprecisely estimated, the treatment effects–which can be interpreted as the impact of each

additional school constructed per 1,000 children relative to men born 1950-1952–typically start to

increase between the first partially treated birth cohorts (Age 11 in 1974) and the first category

that includes a fully treated birth cohort (Age 7 in 1974). The more flexible DID specifications

therefore support the main results based on the more parametric estimation of equation (1).

Heterogeneity

Tables A1 through A4 display the point estimates and standard errors from allowing the impact

of exposure to INPRES to differ for districts that were rural in 1971 and those that were urban

in 1971. In absolute value, impacts on primary school completion, literacy, employment,

non-agricultural employment, and employment in services are larger in rural areas than they are

in urban areas, while treatment effects are larger for men in urban districts for the likelihood of

migration (since birth and in the past five years) and the log of hourly wages. However, in all but

three cases–employment, employment in services, and the natural log of the hourly wage–the point

estimates are either similarly signed or neither is statistically distinguishable from zero. Together,

the results that separately estimate INPRES impacts for males in rural and urban districts suggest

that program benefits occurred in both settings, though sectoral allocation was especially affected

for rural males.

The sample reporting wages is clearly selected, and Table 4 shows that selection into reporting

a wage is affected by the program [Duflo, 2001]. We therefore follow previous work and show the

DID impacts on human capital and migration outcomes for the sample reporting a wage in Table

A5. The point estimates are similar to their Table 3 and Table 5 analogs for all four outcomes. The

impacts on primary school completion and years of schooling completed are somewhat larger for

the sample of wage earners, but neither difference is statistically significant at conventional levels.

Two-Stage Least Squares and Fuzzy Differences-in-Differences Estimates

Our preferred approach for understanding how human capital accumulation impacts participation

in agriculture and sectoral choice is based on separately estimating the DID effect of the INPRES

program on educational attainment and sector of employment, and making inferences based on

the sign and magnitude of these relationships. This requires, at least explicitly, relatively mild

assumptions to justify interpreting the DID estimates as causal. However, this strategy also does

not provide a direct answer to the question of how changes in educational attainment affect sectoral

allocations. We therefore estimate Two-Stage Least Squares (2SLS) specifications using the number

of schools built per 1,000 children in the district of birth interacted with exposure to the program

as an instrument for the number of years of schooling completed, with the main outcomes of the

paper as dependent variables.22
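
The logic of the instrumental variables approach can be sketched with a simple Wald ratio. In a just-identified model with a single instrument, the 2SLS estimate equals the reduced-form coefficient divided by the first-stage coefficient. The example below is a stylized illustration, not the specification estimated in the paper: it omits the fixed effects and controls, the function names are hypothetical, and the data are made up so that the instrument is exactly uncorrelated with the error term.

```python
def covariance(a, b):
    """Sample covariance of two equal-length sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (len(a) - 1)


def wald_iv(z, x, y):
    """Return (first stage, reduced form, IV estimate) for one instrument z."""
    first_stage = covariance(z, x) / covariance(z, z)
    reduced_form = covariance(z, y) / covariance(z, z)
    return first_stage, reduced_form, reduced_form / first_stage


# Made-up data: z plays the role of program exposure, x of years of schooling,
# and y of an employment outcome. The shocks are orthogonal to z by construction
# but correlated with each other, so OLS of y on x would be biased.
z = [0.0, 1.0, 2.0, 3.0]
schooling_shock = [0.1, -0.1, -0.1, 0.1]
outcome_shock = [0.3, -0.3, -0.3, 0.3]
x = [6.0 + 0.5 * zi + e for zi, e in zip(z, schooling_shock)]
y = [0.2 * xi + u for xi, u in zip(x, outcome_shock)]

first_stage, reduced_form, iv_estimate = wald_iv(z, x, y)
ols_estimate = covariance(x, y) / covariance(x, x)
print(first_stage, reduced_form, iv_estimate, ols_estimate)
```

Because the ratio divides by the first-stage coefficient, a first stage close to zero makes the estimate very sensitive to sampling noise, which is the weak instrument concern discussed in this section.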

Though the 2SLS estimates provide a more direct measure of the estimand of interest, the

first stage estimates are often too small to yield precise estimates (see the estimates with years

of schooling as the dependent variable in Table 3). Further, interpreting the 2SLS estimates as

causal estimates of the local average treatment effect (LATE) for men induced to complete extra

education by INPRES requires several additional assumptions. First, for INPRES exposure to

be a valid instrument for educational attainment it must be true that INPRES only affected the

outcomes through its impact on educational attainment. This assumption is untestable; however,

given results presented later in the paper, we would expect any violations of the exclusion restriction

to lead to attenuation bias in the coefficients.23 Second, because of the fuzzy DID setting,24 the

impact of educational attainment on the outcomes must be stable over time and either educational

attainment must remain stable over time in a subset of districts (a control group), or the local

average treatment effects for individuals induced to complete additional schooling by the program

must be homogeneous across districts that experienced an increase in schooling, experienced no

change in schooling, and experienced a decline in schooling between the 1957-62 and 1968-1972

cohorts [De Chaisemartin and D’Haultfoeuille, 2018].

De Chaisemartin and D’Haultfoeuille [2018] provide evidence that the latter two assumptions

may be unlikely to be satisfied in the context of INPRES and suggest three alternative estimators–

a time corrected Wald ratio (Wald-TC), a changes-in-changes Wald ratio (Wald-CIC), and a

difference-in-differences Wald ratio with a modified control group (Wald-DID)–that rely on weaker

assumptions or limit the sample to ensure some of the identifying assumptions are satisfied. The

Wald-TC estimator, in particular, replaces the standard DID parallel trends assumption as well

22We also run 2SLS specifications with primary school completion instead of years of schooling completed as the
endogenous variable of interest. The results are qualitatively similar and available upon request.

23Specifically, we present evidence that INPRES exposure in neighboring districts may have had opposing impacts
on the employment and income outcomes, but not affected the human capital results. As long as the bias is not
sufficiently large to alter the sign of the reduced form equation, we can infer that bias would lead to smaller (in
absolute value) reduced form coefficients, without any impact on the first stage coefficients. As a result, the 2SLS
estimates will, if anything, be biased towards zero by this particular violation of the exclusion restriction. Other
types of exclusion restriction violations could lead to bias of an indeterminate sign.

24A fuzzy DID setting is one in which there is no sharp change from zero to non-zero exposure for a treatment
group and constant non-exposure for a control group.

as the requirement that the impact of educational attainment on outcomes is stable over time and

across treatment groups, with a milder, conditional version of the parallel trends assumption that

potential outcomes in different treatment groups evolve in the same way over time in districts with

the same pre-treatment level of education. The Wald-TC estimator also maintains the requirement

that treatment (education in our case) remain stable over time in a control group of districts.

Given the Panel B results in Tables 3-6, which show no evidence of

(unconditional) non-parallel pre-trends, the conditional parallel trends assumption is likely to be

satisfied in our context.

Appendix Tables A6 and A7 show estimates from 2SLS specifications for the main outcomes,

log hourly wages, and annual income, and also present results from two of the estimators proposed

in De Chaisemartin and D’Haultfoeuille [2018]: Wald-DID and Wald-TC estimators with a control

group constructed to ensure that years of schooling remain constant between the 1957-62 and

1968-1972 cohorts. To form the alternative control group we modify the method proposed in De

Chaisemartin and D’Haultfoeuille [2018] for constructing the groups by designating districts where

the normalized difference25 of years of schooling between the older (born 1957-62) and younger

(born 1968-72) cohorts is less than 0.25 in absolute value as being districts with a constant level

of treatment over time. In contrast with using the p-value from a chi-squared test of whether

the distribution of schooling is equal across cohorts (as in De Chaisemartin and D’Haultfoeuille

[2018]), the normalized difference provides a measure of balance that is scale free and much less

sensitive to sample size [Imbens and Rubin, 2015]. The latter property is especially important

in the SUPAS data as many districts have few observations.26 Using the normalized difference

therefore has the benefit of preventing a relationship between district size and assignment to the

treatment group (districts with non-stable education levels over time), an issue that is more likely

to plague classifications based on t-statistics or chi-squared tests. We do not show estimates based

on the Wald-CIC estimator as it does not permit the inclusion of controls, which are critical in our
25The normalized difference between the distributions of a variable x across two groups G ∈ {c, t} is defined as:
$\Delta_{ct} = \frac{\mu_t - \mu_c}{\sqrt{(\sigma_t^2 + \sigma_c^2)/2}}$, where $\mu_G$ is the mean and $\sigma_G^2$ is the variance of
characteristic x for group G. To construct the normalized difference we replace the means and variances with their
sample analogs, setting $\mu_G = \bar{X}_G = \frac{1}{N_G} \sum_{i \in G} X_i$ and
$\sigma_G^2 = s_G^2 = \frac{1}{N_G - 1} \sum_{i \in G} (X_i - \bar{X}_G)^2$, where $N_G$ denotes the number of observations in
group G [Imbens and Rubin, 2015].
26Sample districts have an average of 101 men born 1968-72 and 104 men born 1968-72.

context as they were used to assign school construction levels.
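
The control group classification described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names and the toy schooling values are ours, the younger cohort is treated as group t and the older cohort as group c, and the 0.25 cutoff is the threshold reported in the text.

```python
import math
import statistics


def normalized_difference(group_t, group_c):
    """Normalized difference (mean_t - mean_c) / sqrt((var_t + var_c) / 2)."""
    mean_gap = statistics.mean(group_t) - statistics.mean(group_c)
    # statistics.variance uses the N - 1 denominator, matching the sample analog.
    pooled_sd = math.sqrt(
        (statistics.variance(group_t) + statistics.variance(group_c)) / 2
    )
    return mean_gap / pooled_sd


def has_stable_schooling(older, younger, threshold=0.25):
    """Flag a district as a control district if schooling barely changed."""
    return abs(normalized_difference(younger, older)) < threshold


# Made-up years of schooling for two hypothetical districts.
older_cohort = [4, 6, 8]
print(has_stable_schooling(older_cohort, [5, 7, 9]))  # mean rises by one year
print(has_stable_schooling(older_cohort, [4, 6, 9]))  # mean rises by a third of a year
```

Because the statistic scales the gap in means by the pooled standard deviation rather than by a standard error, adding observations to a district does not mechanically push it toward the treatment group.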

While the Wald-DID and Wald-TC estimators in Tables A6 and A7 ensure educational attain-

ment is stable over time in the control group and relax other identifying assumptions necessary

for the estimates to measure LATE, they also loosen the link between the variation in educational

attainment being exploited for identification and the plausibly exogenous variation in education

generated by the INPRES school building program. We therefore interpret both the 2SLS and

Wald-DID/Wald-TC results with caution.

As expected given the results in Table 3, the first stage estimates–while typically statistically

different from zero at the 5 or 10% level–are fairly weak: Kleibergen-Paap Wald rank statistics

are 3.37 for the employment outcomes, 3.81 for the hourly wage outcome, and just 1.12 for the

annual salary outcome; weak instruments will add noise to the 2SLS estimates, but they are also

likely to confer bias onto the estimates of the relationship between schooling and the outcomes.

To adjust for the weak instrument-related issues, we show 95% Anderson-Rubin (AR) confidence

intervals [Anderson and Rubin, 1949] for all outcomes; AR confidence intervals provide the correct

coverage and the AR test is the most powerful in a just identified model with one endogenous

variable [Moreira, 2009].

The point estimates in Table A6 are consistent with the DID results presented earlier, though

we are unable to reject the null that each coefficient is equal to

zero based on a Wald test. Ignoring the precision and weak instrument-related issues, the 2SLS

estimates for employment outside of agriculture and employment in industry suggest that each

additional year of school completed by men is associated with a 15 percentage point increase in

the likelihood of non-agricultural employment and 9 percentage point increase in the likelihood of

employment in industry. The Anderson-Rubin 95% confidence intervals highlight the importance

of accounting for a weak first stage, with all five coefficients only bounded on one side and four

of the confidence sets inclusive of zero. Still, the AR confidence interval for employment out of

agriculture has a lower bound above zero, indicating that we can reject that an additional year of education

increases the likelihood of non-agricultural employment as an adult by less than 1 percentage point.

The Wald-DID and Wald-TC estimates are generally consistent in sign with the 2SLS estimates

(employment being the lone exception), though smaller in magnitude. P-values are universally

smaller for the Wald-TC estimates than for the Wald-DID or 2SLS estimates, and we are able to reject

the null of no effect for employment, employment in agriculture, employment out of agriculture,

and employment in services.

While issues related to weak instruments cloud our ability to draw strong conclusions based on

the estimates in Table A6, the relationship between male educational attainment and employment

out of agriculture is an exception. As mentioned above, the 2SLS estimate indicates that

each year of schooling increases the likelihood of non-agricultural employment by 15 percentage

points, the AR 95% confidence interval cannot rule out infinitely large positive effects but is

able to reject effect sizes smaller than a 1 percentage point increase, and both the Wald-DID

and Wald-TC estimates are positive, with the latter statistically significantly different from zero.

Conservatively, we can therefore infer that an additional year of completed schooling increases

non-agricultural employment by at least 1 percentage point, and most likely results in between a 2

percentage point increase (Wald-DID and TC estimates) and a 15 percentage point increase (2SLS

estimate).

Table A7 presents the same set of results for the likelihood of wage employment, the natural log

of the hourly wage, and annual income. The 2SLS point estimate for wage employment suggests

an additional year of education leads to a 12.8 percentage point increase in the likelihood of wage

employment (p-value 0.12). As with employment outside of agriculture, the AR 95% confidence

interval is not bounded above, but we can reject effect sizes smaller than a 1.8 percentage point

increase in wage employment. The Wald-DID and Wald-TC estimates are again between the lower

bound of the AR confidence set and the 2SLS point estimate, and statistically distinguishable from

zero at the 5% level in both cases.

The 2SLS point estimates for both the hourly wage and annual income outcomes are imprecise

and large–with an additional year of education associated with a 9.4 percent increase in the hourly

wage and a 656 USD increase in annual earnings–but the AR confidence intervals include zero in

the case of the hourly wage and are unable to reject large negative or large positive impacts for

annual salary owing to the extremely weak first stage. The Wald-DID and Wald-TC estimates are

once again smaller than their 2SLS counterparts but positive and statistically different from zero.

Tables A6 and A7 thus offer less than definitive answers to the direct question of how edu-

cational attainment affects the sectoral allocation and earnings of men. That said, the two most

robust results from these tables are that additional schooling induces men to participate in wage

employment and also pushes men towards employment outside of agriculture, with each additional

year of education increasing wage employment by between 1.8 and 12.8 percentage points and

non-agricultural employment by between 1 and 15 percentage points. While the possible ranges in

effect sizes are quite wide, we can be relatively confident that increasing educational attainment

has an economically large impact on the outcome. For example, if we assume the true impact of

an additional year of schooling on non-agricultural employment is the lower bound of the AR 95%

confidence interval (1 percentage point), the observed increase in male educational attainment in

Indonesia between 1976 and 1995 (2.9 years) can explain 14.7% of the 20 percentage point increase

in non-agricultural employment over this period. If we less conservatively assume that the true

effect is the Wald-TC estimate or the midpoint between the lower bound of the AR 95% confidence

interval and the 2SLS estimate, changes in educational attainment can explain 29.4% to 100% of

the sectoral shift during these two decades. Thus, regardless of which estimate we use, education

had important implications for the changes in the sectoral allocation of male workers between

1976 and 1995.
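
The back-of-the-envelope calculation above can be reproduced approximately with the rounded figures quoted in the text. The sketch below is illustrative only, and the function name is ours. With the rounded inputs (2.9 years of additional schooling and a 20 percentage point sectoral shift) it yields shares of roughly 14.5% and 29%, close to the 14.7% and 29.4% reported above; we assume the small gap reflects unrounded inputs in the original calculation.

```python
def share_explained(effect_pp_per_year, schooling_gain_years=2.9, total_shift_pp=20.0):
    """Share of the sectoral shift attributable to the gain in schooling."""
    return effect_pp_per_year * schooling_gain_years / total_shift_pp


# Lower bound of the AR 95% confidence interval: 1 percentage point per year.
lower_bound_share = share_explained(1.0)

# Wald-DID and Wald-TC estimates: roughly 2 percentage points per year.
wald_tc_share = share_explained(2.0)

# Midpoint between the AR lower bound (1) and the 2SLS estimate (15),
# capped at 1 because education cannot explain more than the whole shift.
midpoint_share = min(1.0, share_explained((1.0 + 15.0) / 2))

print(lower_bound_share, wald_tc_share, midpoint_share)
```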

Cross District Spillovers in General Equilibrium

Tables 7, 8, and Appendix Table A8 show the results from estimating equation (2) for the employ-

ment, earnings (income and hourly wages), and human capital outcomes we examine in the paper.

To start, note that consistent with the prior that adjacent district INPRES exposures should not

affect the schooling and literacy outcomes, we fail to reject the null of no impact of the average

number of schools built per 1,000 primary school age children in adjacent districts on years of school completed, primary

school completion, or literacy (Appendix Table A8). This is reassuring given the observation that

children in Indonesia almost exclusively attend primary school in their district of residence.27 The

Panel B point estimates are also small and statistically indistinguishable from zero.

Turning to the employment outcomes, Table 7 suggests there may be small effects of neighboring

27Migration across districts between birth and primary school age could generate a relationship between neigh-
boring district INPRES exposure and the human capital outcomes.

district exposure on the likelihood of employment in agriculture, employment out of agriculture, and

employment in industry, though only the last estimate is statistically different from zero. For

these outcomes, the point estimates on adjacent district exposure are of the opposite sign of the

own district exposure measure, and as a result, the own district exposure point estimates are

larger in magnitude than they were in Tables 3 and 4. Each additional school built per 1,000

children in a male’s district of birth is now predicted to decrease the likelihood of employment in

agriculture by 1.11 percentage points, increase the likelihood of employment outside of agriculture

by 1.92 percentage points, and increase the likelihood of employment in industry by 1.77 percentage

points; conversely, a one school increase in the average number of INPRES schools constructed in

adjacent districts per 1,000 children increases the likelihood of employment in agriculture by 0.79

percentage points (p-value 0.32), decreases the likelihood of employment outside of agriculture by

0.19 percentage points (p-value 0.86), and decreases the likelihood of employment in industry by 1.67

percentage points (p-value 0.04). Despite the small magnitude of the point estimates for employment in and

out of agriculture, the cross district spillovers–which, as a reminder, measure just the differential

impact of adjacent district INPRES exposure on young cohorts relative to old cohorts–suggest

that the total impact of INPRES varies importantly with the exposure of adjacent districts. For

employment out of agriculture, at the mean own district INPRES exposure, the total implied

impact of INPRES for a district at the 5th percentile of adjacent district school building intensity

is 12.5% higher than it is for a district at the 95th percentile. Similarly, at the 5th percentile for

neighboring INPRES exposure, the total impact on employment in agriculture at the mean own

district exposure implies a 1.2 percentage point decrease; at the 95th percentile the total implied

impact is zero. Even more drastically, for a district in the 5th percentile of adjacent district

exposure the mean level of own district INPRES treatment leads to a 1.4 percentage point increase

in the likelihood of employment in industry; at the 95th percentile it leads to a 1.9 percentage point

decrease.
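
The total implied impacts discussed above combine the own district and adjacent district coefficients. The sketch below shows the calculation for employment in industry, assuming that equation (2) enters the two exposure measures linearly, as the two coefficient rows in Table 7 suggest. The coefficients are taken from Table 7, column (4). The exposure levels are hypothetical values chosen for illustration and are not figures reported in the paper, so the output should not be read as a replication of the numbers in the text.

```python
def total_implied_impact(beta_own, beta_adjacent, own_exposure, adjacent_exposure):
    """Linear combination of own district and adjacent district exposure effects."""
    return beta_own * own_exposure + beta_adjacent * adjacent_exposure


# Table 7, Panel A, column (4): employment in industry.
BETA_OWN = 0.0177
BETA_ADJACENT = -0.0167

# Hypothetical exposure levels in schools per 1,000 children (illustration only).
MEAN_OWN_EXPOSURE = 2.0
ADJACENT_LOW = 1.3    # stand-in for a district with little neighboring construction
ADJACENT_HIGH = 3.3   # stand-in for a district with heavy neighboring construction

impact_low = total_implied_impact(BETA_OWN, BETA_ADJACENT, MEAN_OWN_EXPOSURE, ADJACENT_LOW)
impact_high = total_implied_impact(BETA_OWN, BETA_ADJACENT, MEAN_OWN_EXPOSURE, ADJACENT_HIGH)

print(f"low neighboring exposure:  {100 * impact_low:.2f} percentage points")
print(f"high neighboring exposure: {100 * impact_high:.2f} percentage points")
```

Under these assumed exposure levels the implied impact changes sign as neighboring construction intensity rises, which is the pattern described in the text for employment in industry.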

The analogous results for wage employment, the natural log of the hourly wage, and annual

income are shown in Table 8. With the inclusion of the control for adjacent district INPRES

exposure, the own district INPRES exposure coefficient is 23% larger for the likelihood of wage employment,

more than twice as large for the log hourly wage outcome, and 20% larger for the annual income

Table 7: Spillovers and Employment

                                            Employed     Employed     Employed     Employed     Employed
                                                         in Ag.       Out of Ag.   in Industry  in Services
                                            (1)          (2)          (3)          (4)          (5)

Panel A: Experiment of Interest
Born 1968-1972*No. INPRES                   0.0080       -0.0111*     0.0192**     0.0177***    0.0047
schools per 1,000 children                  (0.0063)     (0.0067)     (0.0089)     (0.0061)     (0.0069)
Born 1968-1972*Av. No. INPRES schools per   0.0060       0.0079       -0.0019      -0.0167**    0.0075
1,000 children in neighboring districts     (0.0077)     (0.0079)     (0.0108)     (0.0080)     (0.0090)
Observations                                57,233       57,233       57,233       57,233       57,233
Control group mean                          0.97         0.32         0.65         0.20         0.38

Panel B: Placebo Experiment
Born 1957-1962*No. INPRES                   0.0001       0.0061       -0.0060      -0.0085      -0.0017
schools per 1,000 children                  (0.0023)     (0.0052)     (0.0050)     (0.0061)     (0.0071)
Born 1957-1962*Av. No. INPRES schools per   0.0023       -0.0069      0.0092       0.0110*      0.0006
1,000 children in neighboring districts     (0.0028)     (0.0057)     (0.0066)     (0.0061)     (0.0071)
Observations                                57,251       57,251       57,251       57,251       57,251
Control group mean                          0.97         0.36         0.61         0.18         0.37

Notes:
Panel A: The sample for the “Experiment of Interest” includes men aged 2-6 in 1974 (“treated” cohorts) and men aged 12-17 in 1974 (“control” cohorts). Panel
B: The sample for the “Placebo Experiment” includes men aged 12-17 in 1974 (“treated” cohorts) and men aged 18-24 in 1974 (“control” cohorts). All regressions
include year of birth fixed effects, district of birth fixed effects, year of birth interacted with the primary school-aged population in the district of birth in 1971, year
of birth interacted with the enrollment rate among primary school-aged children in the district of birth in 1971, and year of birth interacted with the allocation
of the water and sanitation program in the district of birth. Standard errors are reported in parentheses and are clustered at the level of the district of birth. ***
p<0.01, ** p<0.05, * p<0.1.

outcome. For the log hourly wage and annual income outcomes, the adjacent district coefficients are

negative and statistically different from zero, with p-values of less than 0.01 and 0.06, respectively.

For wage-employment the neighboring district coefficient is not statistically significantly different

from zero (p-value 0.26) but it is over half the size of the own district exposure coefficient (in

absolute value).

Table 8: Spillovers, Hourly Wages, and Annual Income

                                            Wage              Ln Hourly Wage    Annual Salary
                                            Employed          (2017 USD)        (2017 USD)
                                            (1)               (2)               (3)

Panel A: Experiment of Interest
Born 1968-1972*No. INPRES                   0.0190***         0.0289**          64.6734***
schools per 1,000 children                  (0.0067)          (0.0131)          (18.4221)
Born 1968-1972*Av. No. INPRES schools per   -0.0097           -0.0436***        -36.6054*
1,000 children in neighboring districts     (0.0086)          (0.0142)          (19.4915)
Observations                                57,233            24,486            30,226
Control group mean                          0.44              -0.65             1357.72

Panel B: Placebo Experiment
Born 1957-1962*No. INPRES                   -0.0008           -0.0080           -6.7848
schools per 1,000 children                  (0.0060)          (0.0116)          (15.2228)
Born 1957-1962*Av. No. INPRES schools per   0.0011            0.0243*           37.6053**
1,000 children in neighboring districts     (0.0073)          (0.0127)          (17.5499)
Observations                                57,251            23,701            25,540
Control group mean                          0.40              -0.59             1457.13

Notes:
Panel A: The sample for the “Experiment of Interest” includes men aged 2-6 in 1974 (“treated” cohorts) and men
aged 12-17 in 1974 (“control” cohorts). Panel B: The sample for the “Placebo Experiment” includes men aged 12-17
in 1974 (“treated” cohorts) and men aged 18-24 in 1974 (“control” cohorts). All regressions include year of birth
fixed effects, district of birth fixed effects, year of birth interacted with the primary school-aged population in the
district of birth in 1971, year of birth interacted with the enrollment rate among primary school-aged children in the
district of birth in 1971, and year of birth interacted with the allocation of the water and sanitation program in the
district of birth. Standard errors are reported in parentheses and are clustered at the level of the district of birth.
*** p<0.01, ** p<0.05, * p<0.1.

For all three outcomes in Table 8–as well as for employment in agriculture, employment outside

of agriculture, and employment in industry–adjacent district INPRES exposure counteracts the

impact of own district exposure. This is consistent with a situation in which workers of a similar

age and human capital level in neighboring districts either directly (by commuting) or indirectly (by

affecting firm location or capital allocation decisions) compete with one another for employment.

A positive shock to the number of young, educated competing searchers increases the supply of

these workers and, in the absence of an adjustment in capital, exerts downward pressure on the

market clearing wage. This, in turn, may encourage workers on the margin between shifting into

wage employment, employment in industry, or broader employment outside of agriculture to remain

in the informal sector or in agriculture. Though it is certainly possible that these cross district

spillovers from an increased supply of young, educated workers depress wages and sectoral shifts

for men of all ages, the results in Table 8 indicate that these impacts are stronger for the younger

cohort of men.

Threats to Identification

While the DID identification strategy eliminates a number of potential sources of bias (e.g. any

time-invariant district-level or time-varying Indonesia-wide determinants of the outcomes), there

remain other classes of confounders that would violate the identifying assumption of parallel trends

in potential outcomes across districts with different levels of INPRES school building intensity.

The primary school construction was the largest component of the INPRES program through the

mid-1970s. However, INPRES also included a number of other development activities including

the provision of rural and urban water supply and sanitation, as well as the construction of health

facilities and roads, and investment in agriculture and mining. Of the other activities, the water

and sanitation program was the second largest at the time [Duflo, 2001]. If the allocation of these

other INPRES programs is correlated with the school construction component and they impact the

sectoral allocation of workers, then it is possible that we are incorrectly attributing the effects of

the other programs to the school construction.

There are several reasons why we do not believe that the main DID estimates are driven by the

impact of other government programs. First, the rule for allocating primary school construction to

districts on the basis of pre-program non-enrollment rates and the size of the primary school age

population was highly specific and was not the basis for determining the receipt of other central

government assistance during this time period [Duflo, 2001]. Second, all of our specifications control

for the second largest regional development program at the time: the water and sanitation program.

Third, though the broader INPRES program included funding for activities like road construction

and investment in industry in the early 1970s, it is hard to conceptualize other interventions that

could have importantly affected the sectoral allocation of workers but only after a twenty year

delay. That is, the evidence we present in Panel B of our main tables and in Appendix

Figures A1 and A2 shows that birth cohorts that entered the labor force in the late 1970s and

early 1980s were not impacted by the primary school construction program; instead the estimated

impacts of the primary school construction are zero for cohorts precisely up to the point where

we reach individuals that were young enough to have attended the newly constructed primary

schools. We would not expect to see the benefits of, for example, road construction or investment

in cross-district information dissemination be realized with the same delay. We therefore believe it

is unlikely that we are confounding the effects of the primary school construction program with impacts

of other government programs.
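
The cohort-by-cohort argument above can be written compactly. What follows is only a sketch in assumed notation; the symbols are illustrative and are not taken from the paper's own equations. Here $y_{ijk}$ is an outcome for man $i$ born in district $j$ and birth cohort $k$, $P_j$ is the number of INPRES primary schools built per 1,000 primary school-aged children in district $j$, $W_j$ is the district's water and sanitation program allocation (interacting it with cohort is an assumption of this sketch), and $d_{il}$ indicates that man $i$ belongs to cohort $l$, with one cohort omitted as the reference group.

```latex
y_{ijk} = \alpha_j + \beta_k
        + \sum_{l} \gamma_l \,\bigl(P_j \times d_{il}\bigr)
        + \sum_{l} \delta_l \,\bigl(W_j \times d_{il}\bigr)
        + \varepsilon_{ijk}
```

Under this sketch, the pattern described above corresponds to $\gamma_l \approx 0$ for every cohort too old to have attended the new schools, with nonzero $\gamma_l$ appearing only for cohorts young enough to have been exposed. A confounding program such as road construction would be expected to shift $\gamma_l$ for the older cohorts as well.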

V Conclusion

Rapid increases in educational attainment and sharp declines in the share of employment occurring

in agriculture have been ubiquitous trends in developing countries during recent decades. While a

substantial literature attempts to link these two regularities using structural models or individual

or household-level characteristics to instrument for education, both strategies are susceptible to a

number of different potential biases. We provide quasi-experimental evidence linking the two trends

in Indonesia, using variation in male educational attainment generated by the plausibly exogenous

variation in exposure to the Sekolah Dasar INPRES primary school building program. To do so, we

employ a differences-in-differences identification strategy that compares the evolution of outcomes

across birth cohorts (contrasting men who were too old to have attended the newly constructed

primary schools with younger men who may have benefited) between districts that were allocated a

higher number of new primary schools and districts that were allocated fewer new primary schools.

Our results confirm earlier findings [Duflo, 2001; Ashraf et al., Forthcoming; Akresh et al., 2018]

that the program importantly increased educational attainment. Each additional school built per

1,000 primary school-aged children increased the likelihood of completing primary school by 1.9

percentage points, increased the total number of years of education completed by 0.12, and increased

the likelihood of being literate by 0.75 percentage points. The INPRES program also meaningfully

impacted the sectoral allocation of treated men, increasing the likelihood that men are employed

outside of agriculture by 1.86 percentage points, the likelihood of employment in industry by

1.14 percentage points, and the likelihood of wage employment by 1.55 percentage points. Exposed

cohorts also benefit in the form of higher annual income (53.8 USD for each additional primary

school constructed per 1,000 children), a result that is partly explained by an increase in the natural

log of hourly wage and partly by the aforementioned increase in wage employment. We find no

evidence that program exposure affected the likelihood of district-level migration since birth or in

the last five years, and no suggestion that the program impacted the likelihood that men reside

in an urban or a rural area. Together, these results indicate that INPRES importantly affected the skills,

sectoral allocation, and welfare of treated men, and that these shifts primarily occurred in the

districts in which the men were born. In combination with estimates showing that the impacts

were somewhat larger in districts that were more rural at the time of the 1971 Census, the findings

highlight that the school construction program worked by transforming existing (and largely rural)

economies, rather than by shifting newly skilled workers to more productive geographic areas.
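
The logic of the differences-in-differences comparison summarized in this section can be illustrated with a minimal Python sketch. It makes simplifying assumptions that the paper does not: program intensity is collapsed to a binary high/low indicator (the paper uses schools built per 1,000 children), there are no controls or fixed effects, and the rows in `toy_rows` are hypothetical numbers invented purely for illustration, not estimates or data from the study. The function name `did_estimate` is likewise hypothetical.

```python
from statistics import mean


def did_estimate(rows):
    """Two-by-two differences-in-differences on cell means.

    rows: iterable of (high_intensity, young_cohort, outcome) tuples, where
    high_intensity marks districts allocated more new schools and
    young_cohort marks men young enough to have attended them.
    """
    cells = {}
    for high_intensity, young_cohort, outcome in rows:
        cells.setdefault((high_intensity, young_cohort), []).append(outcome)
    m = {key: mean(values) for key, values in cells.items()}

    # Change across cohorts within high-intensity districts ...
    change_high = m[(True, True)] - m[(True, False)]
    # ... minus the change across cohorts within low-intensity districts.
    change_low = m[(False, True)] - m[(False, False)]
    return change_high - change_low


# Hypothetical primary school completion indicators averaged by group
# (illustrative values only, not taken from the paper).
toy_rows = [
    (True, False, 0.50), (True, False, 0.54),
    (True, True, 0.62), (True, True, 0.66),
    (False, False, 0.60), (False, False, 0.64),
    (False, True, 0.66), (False, True, 0.70),
]

estimate = did_estimate(toy_rows)
print(f"DID estimate: {estimate:.2f}")
```

The estimate is the extra improvement across cohorts in high-intensity districts relative to low-intensity districts, which is attributable to the program only if outcomes in the two groups of districts would otherwise have evolved in parallel.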

To directly answer the question of how the increases in educational attainment driven by the

INPRES program affected the sectoral allocation of workers, we estimate 2SLS specifications that

use the measure of INPRES exposure as an instrument for the number of years of schooling that

men completed. To correctly characterize