Population-scale single-cell RNA-seq profiling across dopaminergic neuron differentiation

Abstract

Studying the function of common genetic variants in primary human tissues and during development is challenging. To address this, we use an efficient multiplexing strategy to differentiate 215 human induced pluripotent stem cell (iPSC) lines toward a midbrain neural fate, including dopaminergic neurons, and use single-cell RNA sequencing (scRNA-seq) to profile over 1 million cells across three differentiation time points. The proportion of neurons produced by each cell line is highly reproducible and is predictable by robust molecular markers expressed in pluripotent cells. Expression quantitative trait loci (eQTL) were characterized at different stages of neuronal development and in response to rotenone-induced oxidative stress. Of these, 1,284 eQTL colocalize with known neurological trait risk loci, and 46% are not found in the Genotype–Tissue Expression (GTEx) catalog. Our study illustrates how coupling scRNA-seq with long-term iPSC differentiation enables mechanistic studies of human trait-associated genetic variants in otherwise inaccessible cell states.

Fig. 1: Experimental design and cell type heterogeneity in pooled differentiations of iPSCs to a midbrain cell fate.
Fig. 2: Reproducible variation in differentiation trajectories.
Fig. 3: A gene expression signature in iPSCs is associated with neuronal differentiation efficiency.
Fig. 4: Mapping of cis-eQTL in 14 distinct cell contexts (cell types–conditions) for six dominant cell types identified across midbrain differentiation.
Fig. 5: Colocalization analysis of eQTL with 25 neuro-related GWAS traits.

Data availability

Managed access data from scRNA-seq are accessible in the European Genome–phenome Archive (EGA, https://www.dev.ebi.ac.uk/ega/) under the study number EGAS00001002885 (dataset EGAD00001006157). Open access scRNA-seq data are available in the European Nucleotide Archive (ENA) under the study ERP121676 (https://www.ebi.ac.uk/ena/browser/view/PRJEB38269). Processed single-cell count data and eQTL and colocalization summary statistics are available on Zenodo at https://zenodo.org/record/4333872. The two iPSC single-cell datasets described in Cuomo et al. (ref. 2) and Sarkar et al. (ref. 38) are available on Zenodo (https://zenodo.org/record/3625024) and GEO (GSE118723), respectively. iPSC bulk RNA-seq data from Bonder et al. (ref. 37) are available on the EGA (study ID, EGAS00001000593, https://www.ebi.ac.uk/ega/studies/EGAS00001000593) and the ENA (ERP007111, https://www.ebi.ac.uk/ena/browser/view/PRJEB7388). Chip genotypes for HipSci lines are available from the EGA (EGAS00001000866) and the NCBI (PRJEB11750).

Code availability

All scripts are available in the following GitHub repository: https://github.com/single-cell-genetics/singlecell_neuroseq_paper/. The standalone predictor for neuronal differentiation capacity is available at https://github.com/single-cell-genetics/singlecell_neuroseq_paper/tree/master/differentiation_prediction_model/. The eQTL mapping pipeline is available at https://github.com/single-cell-genetics/limix_qtl/.

References

  1. Kilpinen, H. et al. Common genetic variation drives molecular heterogeneity in human iPSCs. Nature 546, 370–375 (2017).

  2. Cuomo, A. S. E. et al. Single-cell RNA-sequencing of differentiating iPS cells reveals dynamic genetic effects on gene expression. Nat. Commun. 11, 810 (2020).

  3. Strober, B. J. et al. Dynamic genetic regulation of gene expression during cellular differentiation. Science 364, 1287–1290 (2019).

  4. Schwartzentruber, J. et al. Molecular and functional variation in iPSC-derived sensory neurons. Nat. Genet. 50, 54–61 (2018).

  5. Alasoo, K. et al. Shared genetic effects on chromatin and gene expression indicate a role for enhancer priming in immune response. Nat. Genet. 50, 424–431 (2018).

  6. D’Antonio-Chronowska, A., D’Antonio, M. & Frazer, K. In vitro differentiation of human iPSC-derived retinal pigment epithelium cells (iPSC-RPE). Bio-protocol 9, e3469 (2019).

  7. Banovich, N. E. et al. Impact of regulatory variation across human iPSCs and differentiated cells. Genome Res. 28, 122–131 (2018).

  8. Volpato, V. et al. Reproducibility of molecular phenotypes after long-term differentiation to human iPSC-derived neurons: a multi-site omics study. Stem Cell Rep. 11, 897–911 (2018).

  9. Nguyen, Q. H. et al. Single-cell RNA-seq of human induced pluripotent stem cells reveals cellular heterogeneity and cell state transitions between subpopulations. Genome Res. 28, 1053–1066 (2018).

  10. Mitchell, J. M. et al. Mapping genetic effects on cellular phenotypes with ‘cell villages’. Preprint at bioRxiv https://doi.org/10.1101/2020.06.29.174383 (2020).

  11. Osborn, T. & Hallett, P. J. Seq-ing markers of midbrain dopamine neurons. Cell Stem Cell 20, 11–12 (2017).

  12. Stoddard-Bennett, T. & Pera, R. R. Stem cell therapy for Parkinson’s disease: safety and modeling. Neural Regen. Res. 15, 36–40 (2020).

  13. Kriks, S. et al. Dopamine neurons derived from human ES cells efficiently engraft in animal models of Parkinson’s disease. Nature 480, 547–551 (2011).

  14. Xiong, N. et al. Mitochondrial complex I inhibitor rotenone-induced toxicity and its potential mechanisms in Parkinson’s disease models. Crit. Rev. Toxicol. 42, 613–632 (2012).

  15. Blondel, V. D., Guillaume, J.-L., Lambiotte, R. & Lefebvre, E. Fast unfolding of communities in large networks. J. Stat. Mech. Theory Exp. 2008, P10008 (2008).

  16. Kang, H. M. et al. Multiplexed droplet single-cell RNA-sequencing using natural genetic variation. Nat. Biotechnol. 36, 89–94 (2018).

  17. Korsunsky, I. et al. Fast, sensitive and accurate integration of single-cell data with Harmony. Nat. Methods 16, 1289–1296 (2019).

  18. La Manno, G. et al. Molecular diversity of midbrain development in mouse, human, and stem cells. Cell 167, 566–580 (2016).

  19. Park, C.-H. et al. Acquisition of in vitro and in vivo functionality of Nurr1-induced dopamine neurons. FASEB J. 20, 2553–2555 (2006).

  20. Ramonet, D. et al. PARK9-associated ATP13A2 localizes to intracellular acidic vesicles and regulates cation homeostasis and neuronal integrity. Hum. Mol. Genet. 21, 1725–1743 (2012).

  21. Arenas, E., Denham, M. & Villaescusa, J. C. How to make a midbrain dopaminergic neuron. Development 142, 1918–1936 (2015).

  22. Welch, J. D. et al. Single-cell multi-omic integration compares and contrasts features of brain cell identity. Cell 177, 1873–1887 (2019).

  23. Cummings, K. J. & Hodges, M. R. The serotonergic system and the control of breathing during development. Respir. Physiol. Neurobiol. 270, 103255 (2019).

  24. Campbell, J. N. et al. A molecular census of arcuate hypothalamus and median eminence cell types. Nat. Neurosci. 20, 484–496 (2017).

  25. Sloan, S. A. et al. Human astrocyte maturation captured in 3D cerebral cortical spheroids derived from pluripotent stem cells. Neuron 95, 779–790 (2017).

  26. Zhang, Y. et al. Purification and characterization of progenitor and mature human astrocytes reveals transcriptional and functional differences with mouse. Neuron 89, 37–53 (2016).

  27. Bertrand, N., Castro, D. S. & Guillemot, F. Proneural genes and the specification of neural cell types. Nat. Rev. Neurosci. 3, 517–530 (2002).

  28. Lacomme, M., Liaubet, L., Pituello, F. & Bel-Vialar, S. NEUROG2 drives cell cycle exit of neuronal precursors by specifically repressing a subset of cyclins acting at the G1 and S phases of the cell cycle. Mol. Cell. Biol. 32, 2596–2607 (2012).

  29. Sherer, T. B. et al. Mechanism of toxicity in rotenone models of Parkinson’s disease. J. Neurosci. 23, 10756–10764 (2003).

  30. Knönagel, H. & Karmann, U. Autologous blood transfusions in interventions of the pelvis using the cell saver. Helv. Chir. Acta 59, 485–488 (1992).

  31. Cannon, J. R. et al. A highly reproducible rotenone model of Parkinson’s disease. Neurobiol. Dis. 34, 279–290 (2009).

  32. D’Antonio-Chronowska, A. et al. Association of human iPSC gene signatures and X chromosome dosage with two distinct cardiac differentiation trajectories. Stem Cell Rep. 13, 924–938 (2019).

  33. Ye, W., Shimamura, K., Rubenstein, J. L., Hynes, M. A. & Rosenthal, A. FGF and Shh signals control dopaminergic and serotonergic cell fate in the anterior neural plate. Cell 93, 755–766 (1998).

  34. He, Z. & Yu, Q. Identification and characterization of functional modules reflecting transcriptome transition during human neuron maturation. BMC Genomics 19, 262 (2018).

  35. Lancaster, M. A. et al. Guided self-organization and cortical plate formation in human brain organoids. Nat. Biotechnol. 35, 659–666 (2017).

  36. Müller, F.-J. et al. A bioinformatic assay for pluripotency in human cells. Nat. Methods 8, 315–317 (2011).

  37. Bonder, M. J. et al. Identification of rare and common regulatory variants using population-scale transcriptomics of pluripotent cells. Nat. Genet. https://doi.org/10.1038/s41588-021-00800-7 (2021).

  38. Sarkar, A. K. et al. Discovery and characterization of variance QTLs in human induced pluripotent stem cells. PLoS Genet. 15, e1008045 (2019).

  39. Miller, D. J. & Fort, P. E. Heat shock proteins regulatory role in neurodevelopment. Front. Neurosci. 12, 821 (2018).

  40. Bartelt-Kirbach, B. et al. HspB5/αB-crystallin increases dendritic complexity and protects the dendritic arbor during heat shock in cultured rat hippocampal neurons. Cell. Mol. Life Sci. 73, 3761–3775 (2016).

  41. Shimura, H., Miura-Shimura, Y. & Kosik, K. S. Binding of tau to heat shock protein 27 leads to decreased concentration of hyperphosphorylated tau and enhanced cell survival. J. Biol. Chem. 279, 17957–17962 (2004).

  42. Wilhelmus, M. M. M. et al. Small heat shock proteins inhibit amyloid-β protein aggregation and cerebrovascular amyloid-β protein toxicity. Brain Res. 1089, 67–78 (2006).

  43. Tucci, S. Brain metabolism and neurological symptoms in combined malonic and methylmalonic aciduria. Orphanet J. Rare Dis. 15, 27 (2020).

  44. Urbut, S. M., Wang, G., Carbonetto, P. & Stephens, M. Flexible statistical methods for estimating and testing effects in genomic studies with multiple conditions. Nat. Genet. 51, 187–195 (2019).

  45. GTEx Consortium et al. Genetic effects on gene expression across human tissues. Nature 550, 204–213 (2017).

  46. Giambartolomei, C. et al. Bayesian test for colocalisation between pairs of genetic association studies using summary statistics. PLoS Genet. 10, e1004383 (2014).

  47. Kory, N. et al. SFXN1 is a mitochondrial serine transporter required for one-carbon metabolism. Science 362, eaat9528 (2018).

  48. Palmer, G., Horgan, D. J., Tisdale, H., Singer, T. P. & Beinert, H. Studies on the respiratory chain-linked reduced nicotinamide adenine dinucleotide dehydrogenase. XIV. Location of the sites of inhibition of rotenone, barbiturates, and piericidin by means of electron paramagnetic resonance spectroscopy. J. Biol. Chem. 243, 844–847 (1968).

  49. Betarbet, R. et al. Chronic systemic pesticide exposure reproduces features of Parkinson’s disease. Nat. Neurosci. 3, 1301–1306 (2000).

  50. Ma, D. K., Ponnusamy, K., Song, M.-R., Ming, G.-L. & Song, H. Molecular genetic analysis of FGFR1 signalling reveals distinct roles of MAPK and PLCγ1 activation for self-renewal of adult neural stem cells. Mol. Brain 2, 16 (2009).

  51. Stachowiak, E. K. et al. Cerebral organoids reveal early cortical maldevelopment in schizophrenia-computational anatomy and genomics, role of FGFR1. Transl. Psychiatry 7, 6 (2017).

  52. International Stem Cell Initiative. Assessment of established techniques to determine developmental and malignant potential of human pluripotent stem cells. Nat. Commun. 9, 1925 (2018).

  53. Tsankov, A. M. et al. A qPCR ScoreCard quantifies the differentiation potential of human pluripotent stem cells. Nat. Biotechnol. 33, 1182–1192 (2015).

  54. Bock, C. et al. Reference maps of human ES and iPS cell variation enable high-throughput characterization of pluripotent cell lines. Cell 144, 439–452 (2011).

  55. Kajiwara, M. et al. Donor-dependent variations in hepatic differentiation from human-induced pluripotent stem cells. Proc. Natl Acad. Sci. USA 109, 12538–12543 (2012).

  56. Hu, S. et al. Effects of cellular origin on differentiation of human induced pluripotent stem cell-derived endothelial cells. JCI Insight 1, e85558 (2016).

  57. Lancaster, M. A. & Knoblich, J. A. Generation of cerebral organoids from human pluripotent stem cells. Nat. Protoc. 9, 2329–2340 (2014).

  58. Wolf, F. A., Angerer, P. & Theis, F. J. SCANPY: large-scale single-cell gene expression data analysis. Genome Biol. 19, 15 (2018).

  59. Ferri, A. L. M. et al. Foxa1 and Foxa2 regulate multiple phases of midbrain dopaminergic neuron development in a dosage-dependent manner. Development 134, 2761–2769 (2007).

  60. Andersson, E. et al. Identification of intrinsic determinants of midbrain dopamine neurons. Cell 124, 393–405 (2006).

  61. Loo, L. et al. Single-cell transcriptomic analysis of mouse neocortical development. Nat. Commun. 10, 134 (2019).

  62. Ren, J. et al. Single-cell transcriptomes and whole-brain projections of serotonin neurons in the mouse dorsal and median raphe nuclei. eLife 8, e49424 (2019).

  63. Huang, K. W. et al. Molecular and anatomical organization of the dorsal raphe nucleus. eLife 8, e46464 (2019).

  64. Okaty, B. W. et al. A single-cell transcriptomic and anatomic atlas of mouse dorsal raphe neurons. eLife 9, e55523 (2020).

  65. Mercurio, S., Serra, L. & Nicolis, S. K. More than just stem cells: functional roles of the transcription factor Sox2 in differentiated glia and neurons. Int. J. Mol. Sci. 20, 4540 (2019).

  66. Wu, Y., Liu, Y., Levine, E. M. & Rao, M. S. Hes1 but not Hes5 regulates an astrocyte versus oligodendrocyte fate choice in glial restricted precursors. Dev. Dyn. 226, 675–689 (2003).

  67. Wiese, S., Karus, M. & Faissner, A. Astrocytes as a source for extracellular matrix molecules and cytokines. Front. Pharmacol. 3, 120 (2012).

  68. Redies, C. Cadherins in the central nervous system. Prog. Neurobiol. 61, 611–648 (2000).

  69. Haghverdi, L., Lun, A. T. L., Morgan, M. D. & Marioni, J. C. Batch effects in single-cell RNA-sequencing data are corrected by matching mutual nearest neighbors. Nat. Biotechnol. 36, 421–427 (2018).

  70. Raudvere, U. et al. g:Profiler: a web server for functional enrichment analysis and conversions of gene lists (2019 update). Nucleic Acids Res. 47, W191–W198 (2019).

  71. Lippert, C., Casale, F. P., Rakitsch, B. & Stegle, O. LIMIX: genetic analysis of multiple traits. Preprint at bioRxiv https://doi.org/10.1101/003905 (2014).

  72. Casale, F. P., Rakitsch, B., Lippert, C. & Stegle, O. Efficient set tests for the genetic analysis of correlated traits. Nat. Methods 12, 755–758 (2015).

  73. Ongen, H., Buil, A., Brown, A. A., Dermitzakis, E. T. & Delaneau, O. Fast and efficient QTL mapper for thousands of molecular phenotypes. Bioinformatics 32, 1479–1485 (2016).

  74. Storey, J. D. & Tibshirani, R. Statistical significance for genomewide studies. Proc. Natl Acad. Sci. USA 100, 9440–9445 (2003).

Acknowledgements

All data for this study were generated under the Open Targets project OTAR039. J.J. was supported by a postdoctoral fellowship from Open Targets, A.S.E.C. was supported by a PhD fellowship from the EMBL International PhD Programme, and D.D.S. was supported by a postdoctoral fellowship from the EMBL Interdisciplinary Postdoctoral Programme. M.A.L. was funded by the Medical Research Council (MC_UP_1201/9). N.K. and D.J.G. were funded by the Wellcome Trust grant WT206194. F.T.M. is a New York Stem Cell Foundation–Robertson Investigator and is supported by the New York Stem Cell Foundation (NYSCF-R-156), the Wellcome Trust and Royal Society (211221/Z/18/Z), the Chan Zuckerberg Initiative (191942) and the NIHR Cambridge BRC. J.C.M. acknowledges core support from EMBL and Cancer Research UK (C9545/A29580). O.S. is supported by core funding from EMBL and the DKFZ, as well as the BMBF, the Volkswagen Foundation and the EU (810296). We thank the MRC Metabolic Diseases Unit Imaging Core Facility for assistance with imaging. We thank the staff at the Cellular Generation and Phenotyping and Sequencing core facilities at the Wellcome Sanger Institute and the imaging core facility of the Wellcome–MRC Institute of Metabolic Science. We thank H. Kilpinen and P. Puigdevall Costa for useful discussions regarding data analysis.

Author information

Contributions

The main analyses and data preparation were performed by J.J., D.D.S. and A.S.E.C. N.K. performed the colocalization analysis. Cell culture experiments were performed by J.J., J.H., J.S., D.P. and M.A., and M.A.L. performed the experimental work on the organoid dataset. J.J. and M.P. oversaw the cell culture experiments. E.M. and M.G. processed GWAS summary statistics for colocalization analysis. J.J., D.D.S., A.S.E.C., J.C.M., F.T.M., O.S. and D.J.G. wrote the manuscript; N.K. assisted in editing the manuscript; J.J., F.T.M., O.S. and D.J.G. conceived and oversaw the study.

Corresponding authors

Correspondence to John C. Marioni, Florian T. Merkle, Daniel J. Gaffney or Oliver Stegle.

Ethics declarations

Competing interests

D.J.G. and E.M. were employees of Genomics PLC, and D.D.S. was an employee of GSK at the time the manuscript was submitted.

Additional information

Peer review information Nature Genetics thanks Kristen Brennand and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 scRNA-seq clustering and cell type annotation.

Cells at each time point, pooled across lines, were clustered using Louvain clustering after normalization and batch correction using Harmony (Methods). Clusters were then annotated to cell types using known marker genes; when two clusters showed the same gene set enrichment, they were assigned to the same cell type identity (Methods). a, UMAPs of cells sampled at each time point, coloured by cluster identity. b, Same UMAPs as in a, with cells coloured by cell type annotation. c, Heatmap showing average expression profiles of canonical marker genes across the identified cell types (as in b, excluding dopaminergic neuronal markers; expression scaled between 0 and 1 for each gene). d,e, Heatmaps similar to c, showing average expression profiles of marker genes across the identified cell types for dopaminergic neurons, using literature-curated markers (d), and for cortical hem/Cajal–Retzius cells, using markers described in the Methods (e). Legend: Astro: Astrocyte-like, DA: Dopaminergic neurons, Epen1/2: Ependymal-like 1/2, FPP: Floor Plate Progenitors, NB: Neuroblasts, P_FPP: Proliferating Floor Plate Progenitors, P_Sert: Proliferating serotonergic-like neurons, Sert: Serotonergic-like neurons, U_Neur: Unknown Neurons.
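
The per-gene scaling used for the heatmaps in c–e can be illustrated with a minimal sketch. It assumes that expression values are already normalized and that a gene missing from a cell counts as zero; the function names, toy cell types and toy values are hypothetical and are not taken from the authors' code.

```python
from collections import defaultdict


def mean_expression_by_cell_type(cells):
    """Average expression per cell type.

    cells: iterable of (cell_type, {gene: expression}) pairs.
    Genes absent from a cell's dict are treated as zero for that cell.
    """
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(int)
    for cell_type, expression in cells:
        counts[cell_type] += 1
        for gene, value in expression.items():
            sums[cell_type][gene] += value
    return {
        cell_type: {gene: total / counts[cell_type] for gene, total in genes.items()}
        for cell_type, genes in sums.items()
    }


def scale_per_gene(means):
    """Min-max scale each gene to the range [0, 1] across cell types."""
    genes = {gene for profile in means.values() for gene in profile}
    scaled = {cell_type: {} for cell_type in means}
    for gene in genes:
        values = [means[cell_type].get(gene, 0.0) for cell_type in means]
        low, span = min(values), max(values) - min(values)
        for cell_type in means:
            value = means[cell_type].get(gene, 0.0)
            # A gene with identical means everywhere carries no contrast.
            scaled[cell_type][gene] = (value - low) / span if span else 0.0
    return scaled


# Toy input (hypothetical values): four cells from three cell types.
toy_cells = [
    ("DA", {"TH": 8.0, "SOX2": 0.0}),
    ("DA", {"TH": 6.0, "SOX2": 1.0}),
    ("FPP", {"TH": 1.0, "SOX2": 5.0}),
    ("Astro", {"TH": 0.0, "SOX2": 3.0}),
]
means = mean_expression_by_cell_type(toy_cells)
scaled = scale_per_gene(means)
for cell_type, profile in scaled.items():
    print(cell_type, {gene: round(value, 3) for gene, value in sorted(profile.items())})
```

Scaling each gene separately makes markers with very different absolute expression levels comparable within one heatmap, at the cost of hiding those absolute levels.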

Extended Data Fig. 2 Cell type proportions across time points and definition of neuronal differentiation efficiency at day 52.

a, Heatmap of the cell proportion matrix. The colour code in the first bar indicates the assignment of lines to pools. Rows (that is, cell line–pool combinations) were hierarchically clustered according to their Euclidean distance (as in Fig. 2b). Cell proportions were estimated for each cell type and time point for all combinations of cell lines, considering 10 pools with at least 10 cells at all time points (138 lines). b, Proportion of variance explained by each principal component calculated from the cell proportion matrix in a. c, Comparison of the first principal component (PC1) with the sum of the fractions of dopaminergic and serotonergic-like neurons present on day 52. d, UMAP of the scRNA-seq profiles used for the cell proportion matrix in a, with cells coloured by the loading of PC1 (left) and PC2 (right) of the cell proportion matrix. Legend: Astro: Astrocyte-like, DA: Dopaminergic neurons, Epen1: Ependymal-like 1, FPP: Floor Plate Progenitors, NB: Neuroblasts, P_FPP: Proliferating Floor Plate Progenitors, Sert: Serotonergic-like neurons, U_Neur1: Unknown Neurons 1.
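
As a rough illustration of the quantity compared with PC1 in c, the sketch below computes, for each cell line–pool combination, the fraction of day 52 cells annotated as dopaminergic or serotonergic-like neurons. The input layout, the function name and the toy counts are hypothetical; applying the minimum of 10 cells per combination is an assumption about how the filter in a operates.

```python
from collections import Counter, defaultdict

NEURONAL_TYPES = {"DA", "Sert"}  # labels taken from the figure legend
MIN_CELLS = 10  # assumed per-combination minimum, following the caption


def neuronal_efficiency(cells, min_cells=MIN_CELLS):
    """Fraction of DA plus Sert cells per (cell_line, pool) combination.

    cells: iterable of (cell_line, pool, cell_type) tuples for day 52 cells.
    Combinations with fewer than min_cells cells are dropped.
    """
    counts = defaultdict(Counter)
    for cell_line, pool, cell_type in cells:
        counts[(cell_line, pool)][cell_type] += 1

    efficiency = {}
    for key, by_type in counts.items():
        total = sum(by_type.values())
        if total < min_cells:
            continue
        neuronal = sum(by_type[cell_type] for cell_type in NEURONAL_TYPES)
        efficiency[key] = neuronal / total
    return efficiency


# Toy input (hypothetical): lineC has too few cells and is filtered out.
toy_cells = (
    [("lineA", "pool1", "DA")] * 6
    + [("lineA", "pool1", "Sert")] * 2
    + [("lineA", "pool1", "FPP")] * 2
    + [("lineB", "pool1", "Astro")] * 9
    + [("lineB", "pool1", "DA")] * 1
    + [("lineC", "pool2", "DA")] * 3
)
efficiency = neuronal_efficiency(toy_cells)
print(efficiency)
```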

Extended Data Fig. 3 Prediction of differentiation failure from iPSC gene expression.

a, Histogram of neuronal differentiation efficiencies across cell lines. The dashed line denotes the threshold used to define differentiation success or failure (that is, efficiency = 0.2). b, Effect of pooling on neuronal differentiation efficiency. Shown is a scatterplot of neuronal differentiation efficiency estimated from independent single-line differentiations (x-axis) versus differentiation efficiency estimated from pooled data for the corresponding lines (y-axis). For cell lines differentiated in multiple pools, cyan points show the efficiencies of the individual replicates, blue points show the average of those replicates, and bars connect the two. The Pearson R and p-value were computed from the average values (blue points) only. c, Precision-recall curve for a logistic regression model trained to predict differentiation failure from iPSC gene expression data (Methods). Shown is precision versus recall, as assessed using leave-one-out cross-validation. d, Area under the precision-recall curve (AUPR) for models as presented in c, when considering alternative threshold values to define differentiation failure. e, Histogram of the predicted differentiation efficiency based on iPSC gene expression for 812 HipSci cell lines. The threshold used to define potent differentiators corresponds to 35% recall and 100% precision, when using 0.2 as the threshold (as in a, b). f, Cross-validation of differentiation outcome prediction. The dataset was split in half (pools 1–8 and pools 9–17) to define independent training and test fractions. All processing steps (merging, clustering, batch correction, etc.) were performed separately for these two fractions, following identical steps and parameter settings to those used for the main analysis. We trained a predictive model using data from pools 1–8 and assessed its performance on pools 9–17. Only cell lines not contained in pools 1–8 were considered for performance assessment.
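
The relationship between the failure threshold, precision and recall described in a, c and e can be sketched as follows. The scores stand in for held-out predictions of failure (for example, from leave-one-out cross-validation); the function names, the toy efficiencies and the toy scores are hypothetical, and the logistic regression model itself is not reproduced here.

```python
FAILURE_THRESHOLD = 0.2  # efficiency below this value counts as failure (panel a)


def label_failures(efficiencies, threshold=FAILURE_THRESHOLD):
    """True where a line's observed efficiency marks it as a failed differentiation."""
    return [efficiency < threshold for efficiency in efficiencies]


def precision_recall(scores, labels, cutoff):
    """Precision and recall when predicting failure for scores >= cutoff."""
    predicted = [score >= cutoff for score in scores]
    true_pos = sum(p and y for p, y in zip(predicted, labels))
    false_pos = sum(p and not y for p, y in zip(predicted, labels))
    false_neg = sum((not p) and y for p, y in zip(predicted, labels))
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 1.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    return precision, recall


def best_recall_at_full_precision(scores, labels):
    """Return (cutoff, recall) with the highest recall that keeps precision at 100%."""
    best_cutoff, best_recall = None, 0.0
    for cutoff in sorted(set(scores), reverse=True):
        precision, recall = precision_recall(scores, labels, cutoff)
        if precision == 1.0 and recall > best_recall:
            best_cutoff, best_recall = cutoff, recall
    return best_cutoff, best_recall


# Toy data (hypothetical): observed efficiencies and held-out failure scores.
toy_efficiencies = [0.05, 0.10, 0.15, 0.30, 0.60, 0.80]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.2, 0.1]
toy_labels = label_failures(toy_efficiencies)
cutoff, recall = best_recall_at_full_precision(toy_scores, toy_labels)
print(cutoff, round(recall, 3))
```

Choosing the cutoff at 100% precision trades recall for certainty: every line flagged as a failure is a true failure in the evaluated data, but some failing lines go unflagged.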

Extended Data Fig. 4 Predicted neuronal differentiation capacity across replicate iPSC lines derived from the same donor.

a,b, Variance component analysis of neuronal differentiation efficiency. a, Variance component breakdown of a model that explains neuronal differentiation efficiency as a function of cell line, pool, sex, age and noise (n = 230; fitted using lme4). b, To assess the effect of XCI status, we fitted a variance component model analogous to that in a, but considering only female lines (n = 115) and explaining neuronal differentiation efficiency as a function of cell line, pool, XCI status, age and noise. c, Histogram of predicted neuronal differentiation efficiency based on iPSC gene expression for 812 HipSci cell lines. The vertical line indicates the prediction threshold that corresponds to 35% recall and 100% precision (cf. Extended Data Fig. 3). d, Scatter plot of predicted neuronal differentiation efficiency for two replicate lines from the same donor. Shown are data from n = 271 donors contained in HipSci with RNA-seq data from two independent reprogramming events. Replicate 1 is chosen as the line with the lower predicted score. Colours indicate three categories of donors, according to the concordance of predicted neuronal differentiation capacity: both lines predicted to fail (blue, n = 13); both lines predicted to be potent differentiators (green, n = 209); and discordant predictions, with one potent and one failing differentiator (yellow, n = 49). To assess whether these frequencies differed significantly from those expected for any two lines taken by chance, we performed a chi-square test comparing the expected frequencies for any two given lines (based on the overall results) with the observed frequencies for pairs of lines from the same donor, obtaining a non-significant result (p = 0.1991). e, Bulk RNA-seq expression of UTF1 and TAC3 for the two replicate lines from the same donor, stratified by the categories in d. In the box plots, the middle line is the median and the lower and upper edges of the box denote the first and third quartiles.

Extended Data Fig. 5 Analysis of iPSC scRNA-seq data reveals a subpopulation characterised by expression of predictive marker genes associated with lower differentiation efficiency.

a, UMAP overview of the dataset. iPSC scRNA-seq data from Cuomo et al. (2020) were analysed following batch adjustment and clustering steps analogous to those applied to the neuronal differentiation data, identifying 5 clusters. b, UMAP as in a, coloured by the squared correlation coefficient (R²) between two gene-level quantities: the correlation of bulk expression with differentiation efficiency (R values are indicated, as in Fig. 3b) and the log fold change between one cluster and all others. c, Violin plots of gene expression for selected pluripotency genes (NANOG, SOX2, POU5F1), as well as marker genes that are upregulated and downregulated, respectively, in cluster 2 (UTF1, TAC3, from Fig. 3). d, Scatter plot of the proportion of cluster 2 cells between replicate experiments (based on n = 23 lines differentiated in two separate pools in Cuomo et al. 2020). LOESS curve and 95% confidence interval are included. e, Scatter plot of neuronal differentiation efficiency (x-axis) against the proportion of cells assigned to cluster 2 (y-axis), analogous to Fig. 3f but using computational estimates of the proportion of cluster 2 cells for a larger set of HipSci lines (using Decon-cell, based on bulk RNA-seq, n = 182; Methods).

Extended Data Fig. 6 eQTL mapping strategies and sharing between eQTL maps.

a, Distribution of the number of cells per cell line with scRNA-seq available for eQTL mapping for each context (cell type-condition). Dots correspond to individual cell lines (number of cell lines per context ranging between 104 and 173). b, Number of genes with at least one eQTL (that is, eGenes) for each context (cell type-condition), detected using either a traditional linear model (coral) or a linear mixed model that accounts for heteroscedastic noise due to variation in the number of cells assayed for each line (seagreen; Methods). c, Sharing of eQTL signal between 14 eQTL maps across all contexts (cell type-conditions), as estimated using MASHR (Methods). d, Distribution of the number of contexts (cell type-conditions) in which a given eQTL is identified (from 1 to 14, lfsr < 0.05, quantified using MASHR). Legend: Astro: Astrocyte-like, DA: Dopaminergic neurons, Epen1: Ependymal-like 1, FPP: Floor Plate Progenitors, P_FPP: Proliferating Floor Plate Progenitors, Sert: Serotonergic-like neurons.
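
The count shown in d can be sketched as a simple tally, assuming that per-context lfsr values have already been estimated (for example, by MASHR, which is not run here). The function names, context labels other than the day 52 untreated DA context, and all lfsr values are hypothetical.

```python
from collections import Counter

LFSR_CUTOFF = 0.05  # significance cutoff stated in the caption


def contexts_per_eqtl(lfsr_by_eqtl, cutoff=LFSR_CUTOFF):
    """Number of contexts in which each eQTL has lfsr below the cutoff.

    lfsr_by_eqtl: {eqtl_id: {context: lfsr}}.
    """
    return {
        eqtl: sum(1 for lfsr in by_context.values() if lfsr < cutoff)
        for eqtl, by_context in lfsr_by_eqtl.items()
    }


def sharing_histogram(lfsr_by_eqtl, cutoff=LFSR_CUTOFF):
    """Histogram {n_contexts: n_eqtl}, ignoring eQTL significant in no context."""
    per_eqtl = contexts_per_eqtl(lfsr_by_eqtl, cutoff)
    return Counter(n for n in per_eqtl.values() if n > 0)


# Toy lfsr values (hypothetical) for three eQTL across three contexts.
toy_lfsr = {
    "eqtl_1": {"DA_day52_untreated": 0.01, "context_2": 0.02, "context_3": 0.30},
    "eqtl_2": {"DA_day52_untreated": 0.04, "context_2": 0.20, "context_3": 0.60},
    "eqtl_3": {"DA_day52_untreated": 0.50, "context_2": 0.70, "context_3": 0.90},
}
per_eqtl = contexts_per_eqtl(toy_lfsr)
histogram = sharing_histogram(toy_lfsr)
print(per_eqtl)
print(dict(histogram))
```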

Extended Data Fig. 7 eQTL mapping robustness across methodologies.

a, Scatter plot of the first two principal components of the kinship matrix, revealing no evidence for pronounced population structure or relatedness between lines. b, Genomic location of eQTL lead variants relative to normalized gene coordinates, considering 1,024 eQTL identified in DA day 52 untreated cells (using Model 0, see below). c, Scatter plot of effect size estimates (left) and negative log p-values (right) for eQTL lead variants (FDR < 5%), comparing the model considered in this study (Model 0, x-axis) with four alternative eQTL models (Models 1–4, y-axis). Pearson's R is indicated in each panel. Shown are results obtained on day 52 untreated DA cells, comparing the following models: Model 0: y = PC1:15 + SNP + 1/n + noise (1,024 eGenes); Model 1: y = pool + sex + SNP + 1/n + noise (1 pool per line selected; 608 eGenes, 574 of which shared with Model 0); Model 2: y = pool + sex + SNP + K + noise (320 eGenes, 312 of which shared with Model 0); Model 3: y = PCs + K + noise (471 eGenes, 457 of which shared with Model 0); Model 4: y = pool + SNP + K + 1/n + noise (856 eGenes, 734 of which shared with Model 0).
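
The eGene counts listed above can be turned into sharing fractions with a few lines of arithmetic. The counts are taken from the caption; the function name and the two summary quantities are hypothetical conveniences, not measures reported by the authors.

```python
MODEL0_EGENES = 1024

# (eGenes detected by the model, eGenes shared with Model 0), from the caption.
ALTERNATIVE_MODELS = {
    "Model 1": (608, 574),
    "Model 2": (320, 312),
    "Model 3": (471, 457),
    "Model 4": (856, 734),
}


def sharing_summary(models, n_reference):
    """For each model, the share of its eGenes also found by Model 0, and the
    share of Model 0 eGenes that the model recovers."""
    summary = {}
    for name, (n_egenes, n_shared) in models.items():
        summary[name] = {
            "shared_fraction_of_model": n_shared / n_egenes,
            "recovered_fraction_of_model0": n_shared / n_reference,
        }
    return summary


summary = sharing_summary(ALTERNATIVE_MODELS, MODEL0_EGENES)
for name, stats in summary.items():
    print(
        f"{name}: {stats['shared_fraction_of_model']:.1%} of its eGenes shared, "
        f"{stats['recovered_fraction_of_model0']:.1%} of Model 0 eGenes recovered"
    )
```

On these numbers, at least 85% of the eGenes found by each alternative model are also found by Model 0, while each alternative model recovers fewer than three-quarters of the Model 0 eGenes.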

Extended Data Fig. 8 eQTL and colocalisation in relation to GTEx.

a, Fraction of GTEx brain eGenes that could be assessed in each of the considered contexts (cell type-conditions; Methods). b, Fraction of GTEx brain eQTL that were replicated in this study (nominal p < 0.5; fraction relative to the set of assessed genes from a). c, Figure analogous to main text Fig. 4c, additionally including eQTL counts from a pseudobulk eQTL analysis (top red dot on the left, red square on the right; calculated by pooling all untreated day 52 cells). d, Figure analogous to main text Fig. 5a, additionally including colocalisation results from a pseudobulk eQTL analysis (calculated by pooling all untreated day 52 cells). In the box plots, the middle line is the median and the lower and upper edges of the box denote the first and third quartiles, while the violin plots show the distribution. Legend: Astro: Astrocyte-like, DA: Dopaminergic neurons, Epen1: Ependymal-like 1, FPP: Floor Plate Progenitors, P_FPP: Proliferating Floor Plate Progenitors, Sert: Serotonergic-like neurons.

Supplementary information

Supplementary Information

Supplementary Figs. 1–7 and Methods

Reporting Summary

Supplementary Tables

Supplementary Tables 1–11

Cite this article

Jerber, J., Seaton, D.D., Cuomo, A.S.E. et al. Population-scale single-cell RNA-seq profiling across dopaminergic neuron differentiation. Nat Genet 53, 304–312 (2021). https://doi.org/10.1038/s41588-021-00801-6

  • DOI: https://doi.org/10.1038/s41588-021-00801-6
