1 Introduction

The relationship between parental income and children’s schooling outcomes is often estimated using survey data (see, e.g., Blau 1999; Chevalier et al. 2013; Plug and Vijverberg 2005).Footnote 1 A potentially important issue in the literature on intergenerational mobility is that survey response is potentially not random. Most notably, the probability of responding to a survey has been shown to be positively related to income and other indicators of socio-economic status (see the literature review below). An unresolved question is whether non-response biases the estimated intergenerational relationship between parental income and child schooling.

In this study, we examine whether survey-based estimates of the relationship between household income and children’s performance in school are biased because of selective non-response. We focus on two sources of non-response: (1) individual unit non-response by parents unwilling to participate in a survey, and (2) item non-response by parents on questions regarding their income.

We combine parental survey data with administrative data from schools and income register data. Our analysis is based on an ongoing regional education survey conducted in the south of the Netherlands covering more than 95% of primary schools in the target region. Parental survey information is available for approximately 69% of children. We link the data relating to the children to the administrative income data of their parents, which are drawn from Statistics Netherlands. This source provides us with information on parental income for those who respond and do not respond to the survey. Our variable of interest—children’s school performance on standardized mathematics and language tests in grade 6 and 9—is based on the schools’ administrative records.

All our estimations are performed using administrative data only, from schools as well as the income register. We distinguish different types of non-response biases by using observed survey response behavior to artificially restrict the full sample to the data that would be available had the survey suffered from a certain type of non-response.Footnote 2 Observing non-responding households using administrative information enables us to directly assess whether response rates are selective and whether the relationship between parental income and child school performance is biased.

Our main findings are, firstly, that unit non-response at the household level attenuates the intergenerational relationship. Secondly, we find evidence that item non-response on the income question leads to an overestimation of the relationship between parental income and children’s test scores. In the analyses of both unit and item non-response, the point estimates for language and mathematics test scores point in the same direction, but only one of the two relationships is significant.

This study contributes to the literature on intergenerational mobility, specifically the empirical literature that analyzes how children’s schooling outcomes relate to parental income.Footnote 3 Cross-sectional surveys find a consistently significant positive relationship between parental income and schooling outcomes (Blau 1999; Chevalier et al. 2013). Also, when adoption is used as a natural experiment to exclude genetically transferred ability as a factor, the significant relationship with income persists (Plug and Vijverberg 2005). This result remains robust when controlling for characteristics of biological parents and extending the focus to the economic outcomes of the children (Björklund et al. 2006). Hence, parental income appears to be an important determinant of success in school and the labor market.

While findings on the direction of the relationship between parental income and children’s outcomes are unambiguous, their magnitude is often contested. There are several potential empirical problems in studies based on survey data: e.g., non-representative samples, truncated samples, survey non-response, and item-based non-response. Many samples are non-representative by construction, but truncations could also be the result of a choice, such as excluding observations with low or no income. The latter likely leads to overestimations of intergenerational correlations (Couch and Lillard 1998). Unit non-response is inevitable and ubiquitous in survey studies: virtually all self-reported measures of income are subject to item-based non-response.Footnote 4 The question, both with respect to unit and item non-response, is whether the non-response is selective and if so how this may bias the estimated intergenerational relationship.

Earlier research has shown that unit and item non-response are selective in the sense that they are related to income. According to the literature on social-desirability bias, income is a sensitive topic that many survey respondents do not want to speak about. Tourangeau and Yan (2007) show, for instance, that item non-response on income questions is higher than on questions related to sexuality. Bollinger and Hirsch (2013) report that item non-response is higher on income questions than on other questions in the Current Population Survey. Lillard et al. (1986) show that non-response across the earnings distribution is U-shaped, i.e., that respondents in the tails are least likely to report earnings.Footnote 5 These findings are confirmed by, among others, Bollinger et al. (2017) and Bee et al. (2016), who compare survey responses with administrative data.Footnote 6 A valuable point to mention here is that administrative data do not necessarily provide true information on income, as they could, e.g., fail to account for undeclared income.Footnote 7

While there is ample evidence that unit and item non-response are related to income, the issue of whether selective non-response biases estimates of intergenerational mobility has, to our knowledge, not yet received attention in the literature. The contribution of our study is to identify and assess the magnitude of unit and item non-response bias in the context of the intergenerational relationship between parental income and children’s schooling.

The remainder of the study is structured as follows. Section 2 provides a description of the data. Section 3 covers the empirical approach. Section 4 discusses the results. Section 5 highlights potential mechanisms and Sect. 6 concludes.

2 Data

We combine different sources of administrative records and survey data. The basis of our dataset is an ongoing regional education monitor for the south of the Netherlands: the OnderwijsMonitor Limburg (OML). The monitor has been conducted as a cooperative project between schools and Maastricht University since 2009. Almost all primary schools in the southern part of the Dutch province of Limburg participate in the project and provide records on their students on a yearly basis.Footnote 8

The data we use are gathered for one cohort of students at two points in time. The students were observed in 2009 in grade 6 (their last year of primary school, completed on average at age 12) and in their third year of secondary school in grade 9 (at about age 15). We give more information about the data below.

At both points in time, a parental survey was conducted and the results were linked to the school records. For grade 6, the parental survey was distributed via the school. In grade 9, the survey was distributed directly to the parents. For the grade 6 data, the overall response rate among the parents amounted to 69%.Footnote 9 Approximately one-third of all non-responding parents had children studying in schools in which none of the parents participated in the survey. This makes it highly likely that those schools either did not send out the survey to the parents or did not forward the completed questionnaires back to the university. In the following, these schools will be referred to as “non-responding schools” so as not to confuse them with schools not participating in the OML. Hence, in our analysis we distinguish non-response at the parental level (22%) and non-response at the institutional or school level (9%). Since non-response at the institutional level is idiosyncratic to our survey design, we discuss the results on this type of non-response in the Appendix. We do provide the main descriptive statistics of this type of non-response in the main text, however.

In the grade 9 survey, the overall parental response rate of 43% is substantially lower than in the survey conducted in grade 6, even though the parental survey was distributed directly to the parents.Footnote 10 The surveys also differ in content. In grade 9, a question regarding household income was included in the parental survey, allowing us to additionally analyze the effect of item non-response on the income questions. Approximately 63% of the responding parents did not fill out the income question.

The process of data collection, as well as the approximate response and non-response rates for both samples, are depicted in Fig. 1. Parental unit non-response can be observed in both samples. Institutional unit non-response can only be assessed in the grade 6 sample and item non-response only in the grade 9 sample.

Fig. 1
figure 1

Data collection and response rates in both samples

We merge the data from this cohort to the administrative information from the household register of the municipalities and the income register from Statistics Netherlands performing the match based on surnames and addresses. The match can be completed for 82.5% of all students in the sample assessed in grade 6 and for 74.8% of the sample assessed in grade 9.Footnote 11

2.1 Schooling and income measures

Our measure of school performance in grade 6 is the result of a nation-wide skill assessment test, in the following referred to as the Cito test.Footnote 12 The mean age of the students when they take the test is 12.2 years. The result is the main determinant of teachers’ recommendation for placing students in a certain secondary schooling track, which is often considered binding by secondary schools.Footnote 13 Therefore, the students have strong incentives to score well on this test.

Among grade 9 students, the OML administers its own tests on language and mathematics achievement, based on questions from larger standardized studies.Footnote 14 The results are used solely for research purposes. Hence, in contrast to the Cito test the stakes are low. There are three difficulty levels for the various tracks: one test for the two upper tracks, one for the two medium tracks, and one for the two lowest tracks.Footnote 15 Overlapping questions make it possible to use item response theory (IRT) to transform the raw scores into test scores on a common scale. We use a two-parameter logistic model for binary items, allowing items to vary in their difficulty and discrimination. The resulting latent trait scores are used in the analysis.

The administrative data from Statistics Netherlands on the financial situation of the households comprise measures for total and corrected income. The corrected income is based on total income adjusted for the number of household members. In our analysis, we rely on this corrected measure of household income. This measure comes closer to the income available per child, which for instance can be invested in tutoring.Footnote 16 For each of the income measures, we know to which quintile of the income distribution in the overall population of the Netherlands a household belongs.

Having data on schooling outcomes, as well as an objective measure of household income for an unusually large share of the population, and having a measure for survey participation, are unique features of our dataset. This setting enables us to observe and analyze the non-response bias in the intergenerational relationship of parental income and children’s schooling outcomes.

2.2 Baseline sample, grade 6 (age 12)

The OML covers 95% of schools in the Southern Limburg region. The baseline sample consists of all students attending those schools who took the final assessment test in grade 6 in 2009.Footnote 17 This group consists of 4512 observed students. As mentioned, 69% of their parents (3123) answered the parental questionnaire. For 1389 students no parental information is available. This latter group is split into a group of parents whose child is in a school where none of the parents responded (423) and a group of parents who did not respond, but whose child attended a school in which other parents responded (966). The former group we consider to be missing due to institutional unit non-response, and the latter to be missing due to individual unit non-response. Administrative data on income could be merged for 83.6% of the children with participating parents, and for 79.9% of the children with non-participating ones (76.3% in case of parental unit non-response and 88.2% in case of institutional non-response). Reasons for the imperfect match of administrative data to the OML data include missing or erroneous address data from school records.

Table 1 provides descriptive statistics for all students in the OML and administrative data. Students of non-participating parents and at non-participating schools are slightly older, and among them a greater proportion are first- or second-generation immigrants. Both differences are significant at the 5% level. These descriptive statistics already suggest that distinct selection mechanisms may be at work at the institutional and individual level. With respect to socio-economic status, household income, and student outcomes, there is neither significant negative selection of parents into individual unit non-response, nor of schools into institutional non-response.

Table 1 Descriptive statistics for students by survey response status, baseline sample grade 6 (age 12)

Distribution plots of Cito scores confirm the differential performance of student groups based on the response to the parental survey. Figure 2 reveals that students whose parents decide to participate in the survey indeed outperform students whose parents decide not to (significant at the 1% level). However, the graph also shows that students attending schools which did not send out the parental surveys perform slightly better on the Cito test than students whose parents responded (significant at the 10% level). Because the Cito test is the main determinant of track placement, these differences potentially also matter for long-run outcomes (College voor Toetsen en Examens 2015).

Fig. 2
figure 2

Students’ Cito score distributions by survey response status, baseline sample grade 6 (age 12). This graph is restricted to all students for whom Cito test scores were available and data could be merged with administrative household information. The group with responding parents relates to column (1) from Table 1, the groups with non-responding schools and from non-responding parents correspond to columns (2) and (3), respectively

Table 2 shows parental survey participation rates by quintiles of corrected income. The quintiles are based on the income distribution of the full Dutch population. While parental non-response is highest in the lower income quintiles, school non-response is most prevalent among families in the higher income quintiles. No income records were available for 29 of the 3931 households.

Table 2 Parental survey participation by income quintile, baseline sample grade 6 (age 12)

One caveat in our analysis is that selection may be induced in our sample by an imperfect match with public administrative data. Table 7 provides descriptive statistics of the group for which we do not have administrative data, analogue to Table 1. The table indicates that there is positive selection into the matched sample. Those with available administrative data have higher Cito scores, their socio-economic status is higher, they are less often immigrants, and their parental education level is higher.Footnote 18 Comparing the differences regarding descriptive statistics by participation status, we observe similar patterns in the matched and unmatched sample, especially regarding students’ age and performance: students with non-participating parents and at non-participating schools are slightly older, children at non-responding schools perform best, while children of non-responding parents perform worst.

2.3 Sample, grade 9 (age 15)

The OML also monitors students at grade 9. The cohort observed in grade 6 in 2009 was the first to also be observed in grade 9, the third year of secondary school. In grade 9, a parental survey was conducted and achievement tests in language and mathematics, this time with low stakes for students, were performed. We use this sample to repeat our analysis of parental non-response and explore the influence of item non-response on income questions.

When we restrict ourselves to those participating in the achievement tests, the remaining sample we were able to identify based on names and addresses consists of 2963 observed students. Since the parental surveys are sent out directly, the response rate (43.2%) in this sample solely reflects parental participation and is not influenced by school participation. The survey allows us to examine item non-response on questions regarding household income. Among the survey respondents, 37% answered the income question.

Table 3 provides descriptive statistics for the students in this sample, in analogy to Table 1. Columns (1) and (3) include children of parents who returned the survey, distinguished by whether they provided information on household income (N = 474) or not (N = 807). Column (2) shows the descriptive statistics for all children in grade 9 who participated in the achievement test but whose parents do not return the survey (N = 1682).

Table 3 Descriptive statistics for students by survey response status, sample grade 9 (age 15)

Children of parents who do not respond to the survey are slightly older and substantially more likely to be first or second-generation immigrants. With respect to parental socio-economic status and income, as well as all measures of students’ school performance, the group of parents who do not respond has unfavorable characteristics. Within the group of respondents, parents who answer the income question have on average more favorable characteristics. Their children also perform better on all tests considered.

Distribution plots for the language and mathematics tests (see Figs. 5, 6) show that children of responding parents clearly outperform children of non-responding parents (significant at 1%). The distributions are similar for children with responding parents who do or do not answer the income question.

Table 4 reveals that the survey non-response rate is higher at lower income quintiles. In turn, the item non-response rate is lowest among parents in the lowest income group. No income measure is available in the income registry for 18 of the 2963 households.

Table 4 Parental survey participation by income quintile, sample in grade 9 (age 15)

3 Empirical approach

Our empirical approach is to analyze regressions with children’s schooling outcomes as dependent variables and parental income as the independent variable. This relationship between parental income and children’s educational outcomes is a correlation and we do not attempt to estimate the causal relationship. There are many factors which may potentially explain the relationship, e.g., parental IQ may be related both to parental income and children’s educational outcomes. In our analysis, we examine whether the correlation between parental income and children’s educational outcomes is biased due to selective survey participation.

Based on administrative data only (for test scores, household income, and control variables), we explore the relationship between parental income and students’ academic performance as a measure of intergenerational mobility. We use parental surveys solely to construct hypothetical sub-samples, which suffer from different types of non-response. We run the same regressions based on responding and non-responding sub-groups. Comparing the coefficients for parental income between these regressions provides insight into whether a certain type of non-response biases the estimated relationship between income and schooling outcomes.

In the literature, the most commonly used measure of intergenerational mobility is the elasticity in a regression of the log of child income on the log of parental income. Authors typically do not add controls to this regression (see, e.g., Jäntti and Jenkins 2015; Corak 2013; Nybom and Stuhler 2015). They instead estimate the intergenerational mobility separately for boys and girls and measure lifetime income at specific ages of the parent and child. Our approach differs in some regards from the standard approach. First, we look at children’s educational outcomes rather than their income. Second, we control for various characteristics of the child (age, gender, indicators of low socio-economic status, and whether the child is a first or second-generation immigrant) instead of running the estimates separately for various characteristics.Footnote 19 In the Appendix, we also show the main tables of the paper without controls (Tables 13, 14). The results are qualitatively robust to this exclusion. In a quantitative sense, all coefficients become somewhat larger in the analyses without controls than in the analyses with controls.

4 Results

In this section, we analyze unit non-response bias and income item non-response. Furthermore, we report robustness checks.

4.1 Parental unit non-response in grade 6

In the sample assessed in grade 6 (age 12), we analyze individual unit non-response by parents. As mentioned, the results on institutional unit non-response by schools are displayed in the main tables but discussed in the Appendix.

The box plots in Fig. 3 provide a first insight into the test score distributions by income group. They allow a comparison of the median, the 25th and 75th percentile, as well as the span of Cito scores, between the different subgroups by response status. Differences between the groups can be observed for all income quintiles. Across all income quintiles, the median student whose parents do not respond performs worse than the median student whose parents respond. The same holds at the 75th percentile and, in the three lowest income quintiles, also for the 25th percentile of the distributions. A Kolmogorov–Smirnov test on the underlying distributions shows that this difference is significant for the third and fourth income quintile. Figure 3 confirms the finding shown in Table 1 that children of non-responding parents perform significantly worse than those of responding parents. However, these differences could be explained by differences with respect to background characteristics or income levels. In the following, we therefore explore whether the gap in schooling outcomes between children of responding and non-responding parents remains after controlling for demographics, and income. As a basic set of controls, we use students’ gender and age as well as an indicator for low socio-economic status.Footnote 20

Fig. 3
figure 3

Box plots of students’ Cito scores by survey response and income, baseline sample grade 6 (age 12). Box plots for Cito test results, displayed by survey response status and income quintile. The boxes are drawn around the median, indicated by the line in the box, and show the interquartile range from the 25th percentile to the 75th percentile. The whiskers show the span of the data points. Their maximum length is 1.5 times the interquartile range. Data points outside this span are usually displayed separately. In accordance with the policy on non-disclosure of individual data by Statistics Netherlands, the figure does not include those outside values (here: 24 observations). The graph is based on 2602 students whose parents participated in the survey, 726 students whose parents did not respond, and 371 students whose schools did not respond

Table 5 shows the results of the main regression specifications, displaying only the income coefficients. We run the same regression of students’ performance on the parental income quintile and control (1) for the full sample and (2) for a number of artificially restricted samples. In a first step, the full sample is split into students attending a non-responding school (NRS) or a responding school (RS). In a second step, the sub-sample attending responding schools is split into parental response status. This allows us to compare children of non-responding parents (NRP) to those of responding parents (RP).

Table 5 Relationship between students’ test scores and parental income, different samples (grade 6)

In this setup, comparing the NRP and RP samples shows the bias induced by individual unit non-response.Footnote 21 We repeat this analysis for three different schooling outcomes: (1) the total Cito score, (2) the fraction of correct tasks in sub-sections for language, and (3) the fraction of correct tasks in sub-sections for mathematics. All of these measures are standardized to a mean of zero and a standard deviation of one. Thus, the resulting coefficients can be interpreted as a change in standard deviations, associated with an increase of one quintile in income. P-values for the differences in the estimated income coefficients, between non-responding and responding parents, are provided in the second panel of Table 5.

All coefficients of the relationship between parental income quintile and students’ schooling outcomes are significant at the 1% level.Footnote 22 Regarding all three performance measures, the coefficients for the sample with children of responding parents exceeds that for the sample with children of non-responding parents in magnitude. However, none of these differences are significant. A similar empirical approach, based on the full sample and the use of interaction terms between different types of non-response and the income quintile, confirms these results (see Table 8). The only significant differences between coefficients of intergenerational mobility in this sample are found when analyzing institutional non-response (see Appendix).

4.2 Parental unit non-response and item non-response in grade 9

We apply the same empirical strategy to the sample assessed in grade 9 (average age 15). In this wave, the parental survey was sent out directly to parents, providing a direct measure of parental survey response. This parental survey included a question regarding household income, allowing us to further assess a potential bias due to specific item non-response.

In the following, we distinguish three groups: (1) children of parents who responded to the survey and provided a measure of household income, (2) children of survey responders who did not provide an income measure, and (3) children of survey non-responders.

The box plots presented in Fig. 4 show how achievement in the language test is distributed within the different income quintiles, separately by parental response status. In general, the graph reveals that the differences by participation status based on this test seem to be smaller than the differences based on the Cito test in grade 6 (compare to Fig. 3). A tendency towards positive selection into survey response is observed. However, among children of survey responders, particularly children of low-income parents who do not respond to the income question, outperform the children of those who do respond, at most points in the distribution. The analog box plots for achievement in the mathematics test in grade 9 are provided in the Appendix, in Fig. 7. Figs. 5, 6 additionally show the corresponding distribution plots by response status.

Fig. 4
figure 4

Box plots for students’ language test by survey response and income, sample grade 9 (age 15). Box plots for language test results in grade 9, displayed by survey response status and income quintile. The boxes are drawn around the median, indicated by the line in the box, and show the interquartile range, from the 25th percentile to the 75th percentile. The whiskers show the span of the data points. Their maximum length is 1.5 times the interquartile range. Data points outside this span are usually displayed separately. In accordance with the policy on non-disclosure of individual data by Statistics Netherlands, the figure does not include those outside values (here: 34 observations). The graph is based on 356 students whose parents participated in the survey, including the income question, 613 students whose parents responded to the survey without reporting household income and 1247 students whose parents did not return the survey

Applying the same approach used in the baseline sample, we examine whether unit and item non-response bias the estimated relationship between parental income and children’s schooling. Table 6 displays the income coefficients of the main regression specification for the full sample as well as artificially restricted sub-samples.

Table 6 Relationship between students’ test scores and parental income, different samples (grade 9)

First, children are split according to whether their parents did (SR) or did not respond to the survey (SNR). Then, children of survey respondents are distinguished by whether the parents did (IR) or did not fill in the included income question (INR). Comparing the coefficients of the SNR and SR samples provides a measure for unit non-response bias. A comparison between the INR and IR samples provides insight into whether item non-response induces a bias. This analysis is conducted for the language (1) as well as the mathematics achievement test (2), administered in grade 9 as part of the OML.Footnote 23 Both measures are standardized to a mean of zero and a standard deviation of one.

As can be seen in Table 6, the analysis based on the language test confirms the results from the baseline sample for language scores: no significant evidence is found that unit non-response is a source of bias in the estimated relationship between parental income and students’ language outcomes. For mathematics, we do find significant differences. The estimate reveals that the relationship is attenuated due to parental non-response. Regarding item non-response on income questions, we find that the estimated relationship is biased upwards for language test scores. However, the difference is not significant for mathematics test scores.Footnote 24

Surprisingly, the estimated coefficient for parental income in the sample of item non-responding parents is very small and insignificant. This observation may be explained by the contrast between the two tests. The stakes of the Cito test, used in the baseline sample, are high as they partly determine secondary school track placement. By contrast, the tests in grade 9 have no relevance for the students in terms of grades or placement.

The availability of a self-reported and an administrative measure of parental income also allows us to assess reporting bias in an intergenerational mobility setting: using survey-reported income to measure intergenerational mobility (IGM), and then comparing the (potentially) “biased” survey estimates with the “true” estimates using administrative measures of income. The results indicate that the relationship between parental income and children’s educational outcomes is biased when using self-reported measures of income. In our dataset, the relationship between self-reported income and educational outcomes is not significant. Relative to the relationship between administrative measures of income and educational outcomes, we can conclude that the results using self-reported income are attenuated both for language (p value .030) and for mathematics (p value .051).

4.3 Robustness checks

All of the above regression results rely on a corrected measure of household income, based on the total household income adjusted for number of household members. As a robustness check, we conduct all analyses also with the measure of total income. The results remain qualitatively similar but are somewhat weaker in magnitude and significance. Since adjustment by number of household members is highly relevant for the actual standard of living, this is not surprising. In support of this, for the sub-sample of parents with a known education level, we also find that the quintile of a household’s corrected income correlates more strongly with highest education level than with total income.

The error terms of the test scores are unlikely to be independent across students in the same environment. Attending the same school and having the same teacher could systematically influence student achievement. Therefore, we account for common factors of influence in the results shown above. To ensure that our results are not driven by clustering, we repeat all analyses with non-clustered standard errors. We do not find any major difference to the results discussed above.

Significant differences in the fraction of immigrants in groups by response status raise the question whether these could drive the reported results. In the Appendix, we show estimates excluding immigrants (Tables 15, 16). The coefficients are somewhat larger compared to the results based on the full sample including immigrants. However, in a qualitative sense, the results on the differences between the sub-groups by response status remain unchanged. Therefore, immigrants do not drive the conclusions we draw.

5 Potential mechanisms

Our results indicate that individual unit and item non-response by parents may bias estimates in opposite directions. In the following, we discuss potential mechanisms behind the underlying selection into unit and item non-response in our samples.

5.1 Selection into individual unit non-response

The magnitude of the intergeneration mobility coefficient in the group of responding parents is, across both age groups and for all outcome measures, consistently lower than in the group of non-responding parents. Thus, individual unit non-response could potentially lead to an overestimation of intergenerational mobility.

To get a better insight into the sorting of parents in terms of survey participation, we regress parental survey response on individual student characteristics. The results are presented in the Appendix in Table 11. In both surveys, grade 6 and 9, being a first or second-generation immigrant is associated with a lower probability of parental survey participation. Also, children’s academic performance is related to a higher probability of parental response. In primary school, Cito scores significantly predict parental response. In secondary school, the more recent performance measures are more important. Attending a secondary school track other than the lowest one is associated with a lower response rate among parents. However, this is conditional on all the other variables, and the coefficient on having low-educated parents (“low ses”) is negative and exceeds any of the coefficients on track. Again in the secondary school sample, parents in households not belonging to the lowest or the middle income quintile are more likely to respond to the survey. Effects regarding school location are no longer significant once a variable on the use of dialect is added. In both samples, speaking the local dialect well is associated with a higher probability of parental survey response.

All in all, in both surveys the parents of well-performing students who are not immigrants and who speak the local dialect well are more likely to answer. In the secondary school sample, socio-economic background and track placement play an important role as well.

5.2 Selection into income item non-response

Among parents who return the survey but do not answer questions on household income, the relationship between parental income and children’s language performance is weaker than among parents who do respond to the question. The selection into item non-response could shed some light on the potential mechanisms behind this bias.

While there is a positive selection into unit response with respect to parental education, parental income, and students’ test scores, this is different for item non-response. As can be concluded from the boxplot in Fig. 4 (as well as Fig. 7), children of low income item non-responders, in particular, outperform those of item responders.

An explanation that fits in with this pattern is that parents with low income but otherwise favorable characteristicsFootnote 25 may feel that they do not live up to expectations and are therefore reluctant to report their income. This is consistent with non-responders having atypical characteristics for the area they live in, as has been shown for the US census (Bee et al. 2016). Such a systematic item non-response pattern is particularly harmful for estimates of intergenerational mobility. Since the non-responders are atypical for the area they live in, imputation of missing data based on postal code area, as applied by Sabelhaus et al. (2015), is risky.

An additional difference can be observed in the mode of responding to the survey. The majority of item responders (88.9%) filled out the online survey to which a link was provided in the initial invitation letter, while only 22.5% of item non-responders did so. Most of them (77.5%) instead filled out the paper survey which was only provided in the reminder letter. This points towards a potential link between the mode of the survey, the willingness to respond to certain items, and potentially further characteristics.

A regression of parental item response on individual student and parent characteristics can be found in the Appendix in Table 12. Student performance measures only positively significantly predict parental item response if track and school location are not included. Parental item response is dominantly associated with belonging to the highest income quintileFootnote 26 and answering the survey upon the first request online.

6 Conclusion

This paper assesses the potential bias in estimates of intergenerational mobility, specified as the relationship between parental income and students’ school performance, combining administrative and survey data. We identify different sources of non-response—parental unit and item non-response—and show that they have to be considered separately because their effects on the estimates of intergenerational mobility are distinct.

Our main results indicate, firstly, that unit non-response at the household level attenuates the intergenerational relationship. Secondly, we find evidence that item non-response on the income question leads to an overestimation of the relationship between parental income and children’s test scores. In the analyses of both unit and item non-response, the point estimates for language and mathematics test scores point in the same direction but only one of the two relationships is significant.

These results are important both for the literature on intergenerational mobility but also for that in other contexts. Non-response is ubiquitous and substantial in all estimations based on survey data, e.g., among the articles we cite in our paper, non-response rates in other published studies on intergenerational mobility are found of up to 42%. Our results suggest that estimates of intergenerational mobility based on survey data need to be interpreted with caution because they may be biased by selective non-response. The direction of such bias is difficult to predict a priori. Bias due to unit and item non-response may work in opposing directions and may differ across outcomes.

Matching survey and administrative data enables us to directly assess the effects of non-response. A remaining downside of our data set is the imperfect match of survey and performance data to administrative data. Even though, at above 80% in the baseline sample, this match is relatively high, it is not perfect. The remaining selection may drive our results to some extent.

To our knowledge, this is the first study on bias induced by selective unit and item non-response in the important relationship between parental income and children’s school performance, as an early measure of intergenerational mobility. Obviously, the results could be driven by the specific setting of our study and as mentioned above, our results may partly be driven by incomplete matching to administrative data. Therefore, this study needs to be replicated in different contexts. This is feasible in countries where large administrative datasets are already available, for example in Scandinavia. The same approach can of course be used to assess the bias induced by non-response in any relationship of interest estimated based on survey data.