Introduction

Cesarean sections can save lives, but rates well above the World Health Organization’s recommended 15% ceiling in most developed countries suggest that many procedures are unnecessary [1]. In the OECD, the rate of cesarean section increased from 14.4% of deliveries in 1990 to 25.8% in 2009 [2], growth that cannot be explained by increases in obstetric risk factors, including those associated with delayed and multiple infant pregnancy and maternal obesity [3,4,5,6]. Instead, this growth is believed to be related to increased maternal request and changes in clinical practice [7]. A concern is that many decisions to perform the procedure are made without considering possible long-run implications, such as reduced fertility [8]. In this study, we examine possible long-term implications for child cognitive development.

Cesarean birth may be directly and indirectly associated with negative child cognitive outcomes. The indirect association may occur through established links between cesarean birth and adverse child health outcomes, including asthma, type I diabetes, allergies [9,10,11] and obesity [12], which are also associated with impaired functioning and lower academic performance [13,14]. The surgical nature of cesarean procedures also poses postnatal maternal health risks [15], with potential knock-on effects for the child’s development through altered mother-child interactions [16] and lower rates of breastfeeding [17,18,19].

The direct association may occur through alterations to the infant’s gut microbiota. Unlike vaginally-born children, whose gut is seeded by passing through the birth canal, the gut of cesarean-born children is seeded through contact with the mother’s skin and hospital surfaces. The result is long-term compositional differences in gut microbiota by mode of birth [20,21,22,23,24], with differences observed up until age seven [24]. Recently uncovered chemical signaling from the gut microbiota to the central nervous system, affecting memory, motivation, mood and stress reactivity, raises questions about the long-term cognitive effects of disturbed microbiota composition at a sensitive time in brain development [25,26]. Although causal impacts on child development are yet to be proven, altered signaling from disturbed gut microbiota is thought to be a possible driver of higher rates of cognitive disorders, especially autism spectrum disorder (ASD) and attention deficit hyperactivity disorder (ADHD), among cesarean-born children (see Curran et al. [27] for a review). In animal studies, there is mounting (and replicated) evidence of long-term cognitive impacts from early microbiota disturbance. These studies show that rodents lacking all gut bacteria from birth (germ-free animals) have memory deficits [28] and behavioral abnormalities [29,30] when compared to animals with normal gut microbiota. Interestingly, these cognitive deficits in rodents can be corrected by colonizing the intestine via a fecal transplant from the control group, but only if this occurs early in life. The implication is that there is a critical window to correct any disruption to the gut microbiota, after which cognitive impacts are permanent [25].

In this study, we estimate the relation between cesarean birth and child cognitive outcomes using data from the Longitudinal Study of Australian Children (LSAC), a nationally representative birth cohort surveyed biennially, and multivariate regression techniques. To the best of our knowledge, only Bentley et al. [31] have examined the relation between cesarean birth and cognitive outcomes. Using Australian hospital records linked to teacher-based assessments in the first year of school, they found a significant negative association across several developmental domains. We build on this study and on previous health studies in several ways. First, we build on the work of Bentley et al. [31] by using mediation analysis to test to what extent any relation is associated with lower rates of breastfeeding and adverse child and maternal health outcomes. This is important because it helps to identify the importance of direct effects, such as those related to disturbed gut microbiota, and helps identify how widespread any effects may be within the population. Second, LSAC contains rich longitudinal cognitive test information that enables us to gain an insight into longer-term outcomes that are closely related to early academic achievement, an important predictor of lifelong earnings and health. These include academic test results that measure: school readiness at 4–5, vocabulary and comprehension at 4–5, 6–7 and 8–9 years, problem solving at 6–7 and 8–9 years and national achievement in numeracy and literacy at 8–9 years. Third, we more rigorously test the sensitivity of any relation to bias from confounding variables not observed in the data, or ‘selection on unobserved covariates’. We do this using two methods: re-estimating on a sub-sample where selection on unobserved covariates is likely to be less of an issue (privately insured births without any observed perinatal risk factors), and estimating lower-bound estimates under conservative assumptions about the magnitude of selection on unobserved covariates using the Oster [32] technique. This helps to provide some guidance on whether the estimated relation is plausibly causal.

Methods

Data

LSAC is an internationally recognized and widely-used longitudinal cohort survey of child health and development [33,34]. There are multiple cohorts of LSAC that are designed for examining specific development periods. In this study, we use the LSAC ‘B cohort’, which has an initial sample of around 5,100 infants born between March 2003 and February 2004, drawn randomly from all registered national births at the time. The initial survey, conducted between 6 months and a year after birth, contains rich perinatal information about the mother and study child along with detailed information on the parents’ background and current life circumstances. From the initial survey onwards, children and their primary care giver are surveyed biennially. At the time of analysis, data were only available up until wave 5 (2014) of the survey, when the children are 8–9.

Of the initial 5,100 LSAC participants, we omit around 1,300 from the sample because of missing information by wave 5, due mainly to survey attrition. To examine the potential impact of attrition, for two measures that are observed in wave 3 (Peabody Picture Vocabulary and Who Am I?), we compare relations estimated using the full sample available at wave 3 and relations estimated at wave 3 on a restricted sample that remains in the survey until wave 5 (n = 3,666). Using a two-sample t-test, we find no evidence that results estimated at wave 3 are different for the full and restricted samples (\(\chi^{2}\) = 0.54, p-value = 0.46 and \(\chi^{2}\) = 1.64, p-value = 0.20 respectively). The implication is that non-random attrition does not appear to be seriously biasing our results.

Whether participants in LSAC are cesarean born is identified in the initial survey by the primary care giver’s response to the question, “was … the type of birth/delivery method a cesarean?” In our sample, approximately 30% of primary care givers report that their children were cesarean-born (between March 2003 and February 2004). To the extent that there is a stigma associated with elective procedures, a potential concern is that the rate of cesarean birth may be under-reported in LSAC (social desirability bias). However, this is not borne out in national statistics from hospital records, which show comparable rates of cesarean birth: 28% and 30% for 2003 and 2004 respectively [35,36]. Compared to other OECD countries, Australia’s rate is similar to that of the United States and Germany, but lower than in Italy (38%) and Brazil (54%) and higher than in Finland, Norway and Sweden (around 17%) [37].

Data availability

No data were specifically collected for this project. The data used were previously collected by the Australian Institute of Family Studies (AIFS), and the collection was conducted in full accordance with all relevant guidelines and regulations as spelt out in the AIFS Ethics Committee approval. This includes obtaining consent for survey participation and use of anonymized information for research purposes, as approved by AIFS. AIFS provided anonymized data for this project under an individual license agreement. In the analysis and in preparing the manuscript, we have adhered to all requirements under the license agreement. Data can be made available upon request to the corresponding author, subject to approval from AIFS.

Outcomes

Measures of cognitive development

There are two types of cognitive development measures in LSAC. The first are scores from three interviewer-administered tests conducted between ages 4 and 9 that are part of the LSAC survey and the second are scores from a national standardized test in numeracy and literacy at age 8–9 that are matched to LSAC participants. All cognitive measures are age-normalized and standardized with respect to the weighted sample mean and standard deviation (descriptive statistics in Table 1 are age normalized, but unstandardized).
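The standardization step described above can be sketched as follows. This is only an illustration: the function name, the toy scores and the toy survey weights are hypothetical, and NumPy is an assumed tool rather than the software used in the study.

```python
import numpy as np

def standardize_weighted(scores, weights):
    """Return z-scores relative to the weighted sample mean and standard deviation."""
    scores = np.asarray(scores, dtype=float)
    weights = np.asarray(weights, dtype=float)
    mean = np.average(scores, weights=weights)
    variance = np.average((scores - mean) ** 2, weights=weights)
    return (scores - mean) / np.sqrt(variance)

# Hypothetical age-normalized test scores and survey weights.
toy_scores = [60.0, 64.0, 70.0, 58.0]
toy_weights = [1.0, 2.0, 1.0, 1.0]
z = standardize_weighted(toy_scores, toy_weights)
```

After this transformation the weighted mean of `z` is zero and its weighted variance is one, so regression coefficients on such outcomes can be read in standard deviation units.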

Table 1 Sample mean values for main control variables in LSAC B cohort.

The interviewer-administered cognitive tests from LSAC B are the Peabody Picture Vocabulary Test (PPVT) [38], Who Am I? (WAI) [39] and the Matrix Reasoning test (MR) [40]. The PPVT is an age-appropriate vocabulary test designed to measure a child’s knowledge of the meaning of spoken words and their comprehension and ability to respond. The test was carried out in the survey when the children were aged 4–5, 6–7 and 8–9. WAI is an assessment of the child’s readiness for school and measures the child’s ability to perform a range of tasks, such as reading, writing, copying, and symbol recognition. WAI was only carried out when the children were 4–5. Finally, MR is a test of problem solving ability, based on the Wechsler Intelligence Scale for Children. The test is age-appropriate and was conducted when the children were aged 6–7 and 8–9.

The national standardized tests are from the National Assessment Program for Literacy and Numeracy (NAPLAN). NAPLAN is conducted in all Australian schools to measure performance in numeracy and literacy at grades 3, 5, 7 and 9, corresponding to ages 8–9, 10–11, 12–13 and 14–15. In the case of literacy, performance is measured over the domains of reading, writing, grammar and spelling. Each student’s performance in NAPLAN, including their national ranking compared to similar-age students, is made available to schools and students to monitor performance. At the time of analysis, only NAPLAN data for grade 3 were available for the LSAC B cohort. For students who were not in grade 3 at the time of testing, usually because they commenced school at a later age than the rest of the cohort, we imputed their values. The imputation process involved two steps. In the first step, we used the sample with observed grade 3 NAPLAN scores to estimate regression model relations between grade 3 NAPLAN scores and personal information (such as other interviewer-administered cognitive tests). In the second step, we used results from the first step to generate predicted NAPLAN values for those with missing scores; that is, we applied the estimated regression model relations to the personal characteristics of those with missing NAPLAN scores [41]. Imputing these values instead of omitting them makes little difference to our analysis (see Table S1 of the online supporting material).
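The two-step imputation described above can be illustrated with a minimal sketch. It assumes a plain linear regression with a single predictor; the function name and toy data are hypothetical, and the actual predictors and model specification used in the study are not reproduced here.

```python
import numpy as np

def impute_by_regression(y, predictors):
    """Fill missing (NaN) entries of y with OLS predictions fitted on observed rows."""
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones(len(y)), np.asarray(predictors, dtype=float)])
    observed = ~np.isnan(y)
    # Step 1: estimate the regression on children with observed scores.
    coef, *_ = np.linalg.lstsq(X[observed], y[observed], rcond=None)
    # Step 2: apply the estimated relations to children with missing scores.
    filled = y.copy()
    filled[~observed] = X[~observed] @ coef
    return filled

# Toy data: one hypothetical predictor (e.g. another cognitive test score).
toy_predictor = [0.0, 1.0, 2.0, 3.0]
toy_naplan = [1.0, 3.0, np.nan, 7.0]
filled = impute_by_regression(toy_naplan, toy_predictor)
```

Observed scores are left untouched; only the missing entry is replaced by its fitted value.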

Mediators

In this study we measure the extent to which any relation between cesarean birth and cognitive development is mediated by lower rates of breastfeeding and adverse child and maternal health outcomes. We selected these variables because they have been previously associated with cesarean birth and are available in the data [9,10,11,12,13,14,15,16,17,18,19]. Adverse child health outcomes include measures of obesity and care giver reported diagnosis of asthma, ADD or ASD. While measures of asthma and obesity are available at the time of cognitive testing, measures for ADD and ASD are only available at ages 6–7 and 8–9 years. Because only around two-thirds of care givers answer questions related to ADD and ASD at age 6–7, we use information at age 8–9 years, when the response rate is much higher. Obesity is measured by whether the child’s Body Mass Index (BMI) is above the upper limit of the normal range: 19.3 for those aged 4–7 years and 23 for those aged 8–9 years. Breastfeeding and maternal health measures are from the initial sample (6–12 months after birth). Breastfeeding is a binary measure of whether the child was breastfed at three months or not. Two self-reported postnatal maternal health measures were used: whether any depressive symptoms were experienced in the last 4 weeks (score of below 3 on a 6-point Kessler scale) and whether general health was reported as fair to poor (4 or 5 on a 5-point scale, where 1 is excellent health).

Controls

The analysis includes over 20 confounders grouped into two main categories (Table 1): those related to perinatal risk factors and those related to the socio-economic advantage associated with cesarean-born children in Australia. Perinatal risk factors include the taking of medication during pregnancy for blood pressure or diabetes (proxies for pre-eclampsia and gestational diabetes respectively); the taking of antibiotic medication (a proxy for bacterial infection, which may also affect the development of the infant’s gut microbiome); a dummy variable for low birth weight (coded 1 if less than 2.5 kg; 0 otherwise); weeks of gestation; maternal age at birth; a dummy variable for multiple infant pregnancy; length and head circumference of the baby (z-scores); a dummy variable for whether the baby was conceived using IVF treatment; and a gender dummy. We include taking antibiotic medication as a control because it has been associated with changes to the infant’s gut microbiome [42] and possibly the risk of cesarean birth, which means that failure to control for it would lead to confounding bias. However, we stress that our results are not sensitive to the inclusion of this control, or other child and maternal health risks in the data (refer to results for the ‘low-risk privately insured’ group in Table 2). Length and head circumference z-scores are based on Centers for Disease Control and Prevention (CDC) growth charts and are age and gender-adjusted. We also estimated a model with an alternative treatment of low birth weight (below 1.5 kg) and a model on a sub-sample of births with gestational ages of 37–40 weeks using binary dummy variable controls for each gestational week (excluding the reference category of 40 weeks). Results for these alternative models are much the same as those reported in Table 2 (available upon request from the corresponding author).

Table 2 OLS regression estimates and Oster lower-bound estimates of the relations between cesarean birth and child cognitive outcomes (standard deviations).

Postnatal interventions such as the use of a ventilator and the use of intensive care were not included as controls because they may be considered an outcome of delivery mode. We also refrain from including prenatal risk factors available in the data that have a high rate of missing observations. These include an identifier for whether the mother is a regular smoker, maternal average consumption of more than two standard drinks a day and maternal body mass index outside of normal range (18.5 to 25). Excluding postnatal interventions and prenatal risk factors with high rates of missing observations makes little difference to the results (see Table S2 of the online supporting material for model results with these factors included as additional controls).

In general, descriptive statistics of the control variables presented in Table 1 show that cesarean birth is associated with higher perinatal risk factors and a socio-economic advantage, especially higher maternal education, fewer older siblings, lower rates of un-partnered birth and higher rates of private health insurance. Despite Australia’s universally available free public health system, many high income earners in Australia choose to hold private health insurance for two reasons. First, it provides a greater coverage of medical treatments, especially for allied health services; and second, it enables them to avoid paying an income-contingent levy that helps meet the cost of the public health system. Important in the context of this study, maternal requested cesarean birth (without any medical risk factors) is not covered by the public health system. While mothers without private health insurance can still elect for cesarean birth in a public hospital, this is uncommon because they would incur all medical costs. It is much more common for elective cesareans to occur in private hospitals under the cover of private health insurance. This explains the 11 percentage point higher rate of private health insurance among cesarean born children than among vaginally born children.

Statistical method

In our main analysis, we use OLS multivariate regression models for each of the cognitive measures to estimate the relation between cesarean birth and cognitive development. These models are of the following form:

$${Y}_{i}={\gamma }_{0}+{\gamma }_{1}C{S}_{i}+SE{S}_{i}{\gamma }_{2}+P{N}_{i}{\gamma }_{3}+{\upsilon }_{i}\tag{1}$$

where \(C{S}_{i}\) equals 1 if child i was cesarean-born and 0 otherwise, \(SE{S}_{i}\) and \(P{N}_{i}\) are vectors of family socio-economic status and perinatal characteristics respectively (from Table 1) and \({\upsilon }_{i}\) is an error term.
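As a rough illustration of how a model of the form of equation (1) is estimated, the sketch below fits OLS to synthetic data. Everything in it is assumed for illustration: the data are simulated (not LSAC data), the control vectors are collapsed to one variable each, and NumPy is an assumed tool rather than the software used in the study.

```python
import numpy as np

def ols(y, cs, controls):
    """Fit equation (1) by least squares; returns [gamma_0, gamma_1, control coefficients]."""
    X = np.column_stack([np.ones(len(y)), cs, controls])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Synthetic data with a known cesarean coefficient of -0.1 standard deviations.
rng = np.random.default_rng(0)
n = 2000
ses = rng.normal(size=n)                  # stand-in for the SES controls
pn = rng.normal(size=n)                   # stand-in for the perinatal controls
cs = (rng.random(n) < 0.3).astype(float)  # about 30% cesarean-born
y = 0.2 - 0.1 * cs + 0.5 * ses + 0.1 * pn + rng.normal(scale=0.01, size=n)

gamma = ols(y, cs, np.column_stack([ses, pn]))
gamma_1 = gamma[1]  # estimated relation between cesarean birth and the outcome
```

With the outcome standardized, `gamma_1` is read as the difference in standard deviations between cesarean-born and vaginally-born children, holding the controls fixed.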

To assess the importance of possible channels, we measure the extent to which \({\gamma }_{1}\) is mediated by lower rates of breastfeeding and adverse child and maternal health outcomes in the data. Mediating effects are estimated using the product of the coefficients method [43].
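The product of coefficients idea can be sketched as follows, under simplifying assumptions: one continuous mediator, no additional controls, and simulated data. The function names and numbers are hypothetical and do not reproduce the study's specification. The first-stage coefficient a (cesarean birth on the mediator) is multiplied by the second-stage coefficient b (mediator on the outcome, holding cesarean birth fixed) to give the mediated effect.

```python
import numpy as np

def fit(y, *columns):
    """OLS with an intercept; returns [intercept, one coefficient per column]."""
    X = np.column_stack([np.ones(len(y)), *columns])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def product_of_coefficients(y, cs, m):
    a = fit(m, cs)[1]               # first stage: cesarean birth -> mediator
    direct, b = fit(y, cs, m)[1:3]  # second stage: outcome on cesarean birth and mediator
    total = fit(y, cs)[1]           # relation without the mediator
    return {"a": a, "b": b, "indirect": a * b, "direct": direct, "total": total}

# Simulated example: the mediator is lower among cesarean births.
rng = np.random.default_rng(1)
n = 5000
cs = (rng.random(n) < 0.3).astype(float)
m = 0.7 - 0.2 * cs + rng.normal(scale=0.5, size=n)
y = 0.3 * m - 0.05 * cs + rng.normal(size=n)
res = product_of_coefficients(y, cs, m)
share_mediated = res["indirect"] / res["total"]
```

In this linear setting the total relation decomposes exactly into the direct part plus the mediated part, which is what allows a mediated share to be reported.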

Identification of the main parameter of interest, \({\gamma }_{1}\) in equation (1), and of the mediating effects is complicated by potential unobserved confounding, which means that \({\gamma }_{1}\) may be biased by correlation between \(C{S}_{i}\) and \({\upsilon }_{i}\). The main potential sources of unobserved confounding are missing controls for perinatal risk factors (such as oxygen deprivation during birth) and for socio-economic advantage (such as greater household income) associated with cesarean-born children. The presence of the former will lead to an over-estimate of any true negative relation between cesarean birth and cognitive development, whereas the presence of the latter will lead to an under-estimate of any negative relation.

Without the possibility for randomization, a common approach for dealing with this form of bias is instrumental variables. This method relies on the presence of factors in the data that affect cognitive development only through altering the chances of cesarean birth. We are unaware of any strong candidates in our data and we instead concentrate on testing the sensitivity of our results to bias from the presence of selection on unobserved covariates. These are discussed in detail below.

Sensitivity to unobserved perinatal confounders

We use two approaches to test the sensitivity of our OLS estimates to bias from unobserved perinatal covariates. First, we re-estimate OLS relations (using equation (1)) on a sub-sample of 2,140 births that are free of any observed health risk that may lead to cesarean birth and that are privately insured, termed the ‘low-risk privately insured’ group. The idea is that by restricting the sample to those that are more similar on observed covariates, we are reducing bias by also restricting differences in unobserved covariates. The ‘low-risk privately insured’ sub-sample comprises births that are not low birth weight (above 2,500 grams), full-term (38–40 week gestation), singleton, conceived without IVF, whose mother took no blood pressure or diabetes medication during pregnancy and whose parents were privately insured in the year of birth. These models include all other covariates included in the full-sample models. We omitted those without private health insurance because they are more likely than those with private insurance to have a cesarean performed for medical reasons.

In the second approach, we use the Oster [32] method, which, like approaches proposed by Rosenbaum [44] and Altonji [45], bounds the relation under assumptions about selection on unobserved covariates (see Nghiema, Nguyen, Khanam and Connelly [46] and Hanushek, Schwerdt, Woessmann and Zhang [47] for recent applications of the Oster method). Under the Oster method, we estimate the lower bound of the OLS relation between cesarean birth and measures of child cognitive development using the following:

$${\gamma }_{1,{\rm{lower}}}=f({\gamma }_{1}-{\gamma }_{1u},\,{R}^{2}-{R}_{u}^{2};\,{R}_{\max }^{2}-{R}^{2},\,\delta ).\tag{2}$$

Variables \({\gamma }_{1u}\) and \({R}_{u}^{2}\) are results from the ‘uncontrolled’ model, which is equation (1) estimated without perinatal and socio-economic controls; \({R}_{\max }^{2}\) is the theoretical maximum R-squared, or the maximum proportion of variation in the outcome variable that can be explained by the model; and \(\delta \) is the coefficient of proportionality, or the ratio of selection on unobserved covariates to selection on observed covariates:

$$\delta =\frac{cov({W}_{U},CS)}{cov({W}_{O},CS)}\cdot \frac{var({W}_{O})}{var({W}_{U})},\tag{3}$$

where \({W}_{O}\) and \({W}_{U}\) are linear combinations of observed and unobserved covariates weighted by their true coefficients (or \({W}_{O}=\sum _{j=1}^{{J}_{O}}{\omega }_{j}^{O}{\gamma }_{j}^{O}\) and \({W}_{U}=\sum _{j=1}^{{J}_{U}}{\omega }_{j}^{U}{\gamma }_{j}^{U}\)). Following Oster conventions, we set \(\delta =1\), that is, we estimate \({\gamma }_{1,{\rm{lower}}}\) under the conservative assumption that selection on unobserved covariates is equal to selection on observed covariates, and set \({R}_{\max }^{2}=1.3{R}^{2}\). Setting \({R}_{\max }^{2}=1.3{R}^{2}\) contrasts with the closely related Altonji [45] method, which assigns \({R}_{\max }^{2}=1\). Oster [32] argues that assuming a value of 1 is unreasonable in the presence of measurement error. Under equation (2), the higher the proportion of variation in the outcome variable explained by the model (\({R}^{2}\)), the smaller the discrepancy in the estimates of \({\gamma }_{1}\) and \({\gamma }_{1,{\rm{lower}}}\).

For any estimated negative relation between cesarean birth and cognitive development, the larger the discrepancy between estimates of \({\gamma }_{1,{\rm{lower}}}\) and \({\gamma }_{1}\), the less reliable \({\gamma }_{1}\) is as an estimate of the true relation.
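To give a feel for how the inputs to equation (2) combine, the sketch below uses the closed-form approximation commonly associated with the Oster approach, in which the controlled coefficient is shifted by the movement from the uncontrolled to the controlled coefficient, scaled by the ratio of unexplained to explained gains in R-squared. This is an assumption made for illustration: the function f in equation (2) is not spelled out here, the full estimator solves a more involved expression, and all numbers are hypothetical.

```python
def oster_adjusted(gamma_c, gamma_u, r2_c, r2_u, delta=1.0, rmax_multiplier=1.3):
    """Approximate bias-adjusted coefficient.

    gamma_c, r2_c: coefficient and R-squared from the controlled model (equation (1)).
    gamma_u, r2_u: coefficient and R-squared from the uncontrolled model.
    delta: assumed ratio of selection on unobservables to selection on observables.
    rmax_multiplier: R-squared_max is set to this multiple of r2_c.
    """
    r2_max = rmax_multiplier * r2_c
    return gamma_c - delta * (gamma_u - gamma_c) * (r2_max - r2_c) / (r2_c - r2_u)

# Hypothetical inputs: adding controls moves the estimate from -0.05 to -0.08
# and raises R-squared from 0.01 to 0.20.
adjusted = oster_adjusted(gamma_c=-0.08, gamma_u=-0.05, r2_c=0.20, r2_u=0.01)
```

In this sketch, the more of the outcome variation the controls explain relative to the uncontrolled model, the smaller the adjustment applied to the controlled coefficient.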

Results

The estimated OLS relations between cesarean birth and child cognitive outcomes (equation (1)) for all births in our sample (n = 3,666) are reported in Table 2. The estimated coefficients for the control variables in this model can be found in Tables S3a and S3b of the online supporting material. Generally speaking, the signs of these control variable coefficients are as expected, with the largest effects associated with maternal socio-economic factors. Specifically, we find that child cognitive outcomes are positively associated with mothers who are college educated (3-year bachelor degree), who give birth at an older age, who are partnered, who have private health insurance, who are employed and who have fewer previous births. The estimated coefficients of perinatal risk factors are generally insignificant or smaller and of mixed sign. Specifically, we find that taking blood pressure medication during pregnancy is associated with significantly lower cognitive outcomes, while greater head circumference and body length are associated with higher cognitive outcomes. Consistent with previous studies, we find evidence, albeit only for select child cognitive outcomes, of a positive association with gestational age [31,48] and a negative association with low birth weight (less than 2.5 kg) [49]. For the former, the association is only significant for school readiness at 4–5; for the latter, the association is only significant for school readiness at 4–5, vocabulary at 4–5 years and vocabulary at 6–7 years. In alternative models estimated on 37–40 week births with individual gestational week dummy variables, we find few significant differences in outcomes between 37, 38 and 39 weeks relative to the reference case of 40 weeks for any of the outcomes.

Turning to estimates in Table 2, we observe significant negative relations between cesarean birth and measures of child cognitive development, up to a tenth of a standard deviation. The magnitude of the relations is similar across all measures, but only those with (NAPLAN) grammar, numeracy, reading, and writing at age 8–9, problem solving (MR) and vocabulary (PPVT) at age 4–5 are statistically significant at the 0.1 level or higher. To put the size of these relations into perspective, a tenth of a standard deviation is similar in magnitude to the estimated relation between gender and reading in NAPLAN at age 8–9 (reported in Table S5a) and to effects estimated from improving teacher quality by one standard deviation and reducing average grade 3 class sizes by ten [50].

Results from the first stage of our mediating analysis suggest that cesarean birth is significantly associated with lower rates of breastfeeding and higher rates of obesity and ADD (see Table S4 of the online supporting material). Second stage results show that breastfeeding is significantly associated with higher cognitive performance, whereas ADD, ASD and obesity are significantly associated with lower levels of cognitive performance. Poor maternal general health is significantly associated with lower vocabulary at 4–5 years, but otherwise maternal health is not significantly linked to any of the cognitive measures.

Combining these results, breastfeeding, obesity and ADD are found to significantly mediate the relation between cesarean birth and child cognitive outcomes, although the effect sizes and significance vary (Table 3). Individually, the largest mediating effect is through reduced chances of breastfeeding, which explains 0.008 percentage points of the 0.076 percentage point gap (or around 11%) in grade 3 NAPLAN writing scores. The total mediating effect, generated from regressions in which all of the mediators are included together, accounts for between 25% for reading (p = 0.052) and 29% for numeracy (p = 0.021) of the estimated difference in cognitive outcomes by delivery mode. This still leaves at least 70% of the relations unexplained by these mediators.

Table 3 Mediating effects of breastfeeding and adverse child and maternal health outcomes on cognitive development.

Sensitivity test results

The estimated relations using the low-risk privately insured sample are generally larger than those estimated using the entire sample. A possible explanation is that by limiting the sample to privately-insured births we are reducing the unobserved confounding from the socio-economic advantage associated with cesarean-born children, which in the main analysis under-states the true effects of cesarean birth.

The Oster coefficients in Table 2 represent the possible lower-bound of any relation between cesarean birth and cognitive development measures under the conservative assumption that selection on unobserved covariates is the same as selection on observed covariates. An alternative interpretation is that the Oster parameter is the magnitude of the relation if adjustment from unobserved covariates is the same as adjustment from observed covariates in the model. With the exception of school readiness at 4–5 years and problem solving at 6–7 years, the Oster coefficients are much the same as the OLS regression coefficients estimated on the entire sample. Results of these sensitivity tests suggest that the estimated OLS relations are unlikely to be entirely driven by bias from unobserved confounding.

Discussion

We find a negative relation between cesarean birth and a range of cognitive outcomes measured from ages 4 to 9. These relations are established after controlling for the socio-economic advantage associated with cesarean birth in Australia and a rich set of perinatal risk factors that (from meta-analyses) [9,11,27] are commonly used in child health studies. Our results are consistent with results from the only previous study that we are aware of, by Bentley et al. [31], which found that cesarean-born children had a 14% higher risk of being identified as developmentally high risk at school starting age. While the magnitude of our estimated difference in outcomes is not large, up to a tenth of a standard deviation in national test scores in numeracy, it is large enough to warrant action. A tenth of a standard deviation in national test scores is comparable in size to differences related to gender, class size and teacher quality that are the focus of policy effort. Importantly, most of the estimated relations (around 70%) are found to be unrelated to lower rates of breastfeeding and adverse child and maternal health, including the diagnosis of child cognitive disorders.

We extend the Bentley et al. [31] study in two important ways. First, our sensitivity analysis suggests that bias from unobserved confounding is unlikely to explain the results completely and that causal relations are plausible. This does not mean that we have established causal relations, because bias from unobserved confounding is still possible. Potential perinatal risks not controlled for include inheritable genetic traits, such as a lack of maternal height, that may drive both cesarean birth (due to a relatively small pelvis, or cephalopelvic disproportion) [51] and child cognitive outcomes [52]. Second, our estimated relations persist long-term and are not confined to children with health problems. Bentley et al. [31] used perinatal records linked to school starting-age measures of cognition and could not examine longer-term outcomes or quantify mediating effects. By showing that only a small part of the cognitive gap is explained by mediating factors, our results leave open the possibility that direct mechanisms, such as disturbed gut microbiota, may be important. However, we cannot rule out the possibility that at least some of the residual effect is due to measurement error, for example, under-reporting of the presence of health conditions by the primary care giver, or that the mediating effects are biased by unobserved confounding.

Evidence presented in this study should motivate more research. Because experimentation is unsuitable, future studies may focus instead on instrumental variables estimation using large-scale linked hospital and child development administrative records. The instrumental variables approach exploits natural experiments, that is, random events that lead to variation in assignment to treatment without directly affecting outcomes. A limitation of this method is that the results only have a local treatment interpretation and cannot be generalized to those unaffected by the random event that led to assignment.

With the above caveats in mind, the key message to medical practitioners is to take a precautionary approach when formulating birth plans, especially when there are no apparent elevated health risks from vaginal birth. Informing mothers of the risks and benefits of cesarean birth should be a priority, which may be formalized by incorporating education sessions into practitioner guidelines.