Abstract

   Parkinson’s disease (PD) is the second-most prevalent neurodegenerative
   disorder, characterized by the loss of dopaminergic neurons (mDA) in
   the midbrain. The underlying mechanisms are only partly understood and
   there is no treatment to reverse PD progression. Here, we investigated
   the disease mechanism using mDA neurons differentiated from human
   induced pluripotent stem cells (hiPSCs) carrying the ILE368ASN mutation
   within the PINK1 gene, which is strongly associated with PD.
   Single-cell RNA sequencing (RNAseq) and gene expression analysis of a
   PINK1-ILE368ASN and a control cell line identified genes differentially
   expressed during mDA neuron differentiation. Network analysis revealed
   that these genes form a core network, members of which interact with
   all known 19 protein-coding Parkinson’s disease-associated genes. This
   core network encompasses key PD-associated pathways, including
   ubiquitination, mitochondrial function, protein processing, RNA
   metabolism, and vesicular transport. Proteomics analysis showed a
   consistent alteration in proteins of dopamine metabolism, indicating a
   defect of dopaminergic metabolism in PINK1-ILE368ASN neurons. Our
   findings suggest the existence of a network onto which pathways
   associated with PD pathology converge, and offers an inclusive
   interpretation of the phenotypic heterogeneity of PD.

   Subject terms: Stem-cell differentiation, Parkinson's disease,
   Parkinson's disease, Differentiation, Cell death in the nervous system
     __________________________________________________________________

   Gabriela Novak et al. utilize scRNA-seq to investigate expression
   profiles in iPSC-derived midbrain dopaminergic neurons from Parkinson’s
   disease patients or healthy controls. Their results suggest a core
   molecular network associated with Parkinson’s disease pathology, and
   provide a future resource for investigation of this critical disorder.

Introduction

   Parkinson’s disease (PD) is one of the most prevalent neurological
   disorders, second only to Alzheimer’s disease, with a prevalence of
   1.8%, among persons over the age of 65 and 2.6% in the 85 to 89 age
   group^[38]1–[39]3. As the average age of the population increases, PD
   is expected to pose an increasing burden to society. PD is
   characterized by the death of the midbrain dopaminergic (mDA) neurons
   found in the substantia nigra region of the brain, which are
   selectively sensitive to Parkinson’s disease-associated neuronal cell
   death^[40]4–[41]7. This results in the development of motor deficits,
   including bradykinesia, rigidity, and tremor, but many patients also
   develop non-motor symptoms, such as depression or dementia^[42]8.
   Unfortunately, current treatments only temporarily ameliorate the motor
   symptoms and do not reverse or slow down the progression of PD^[43]4.

   Most of our understanding of PD pathology at the molecular level is
   based on mutations known to cause PD, although these account for only
   3–5% of PD cases, the remaining cases being idiopathic^[44]2. Despite
   the small fraction of cases they explain, these mutations provide an
   important window into the underlying molecular mechanisms of PD because
   they identify pathways which, when disrupted, are able to cause the
   disease. Many of these mutations converge on mitochondrial homeostasis,
   repair, and mitophagy. Hence, mitochondrial dysfunction likely plays a
   key role in the pathophysiology of PD^[45]9. An important group of
   these mutations lies within the PINK1 gene. The PINK1 protein is
   expressed ubiquitously throughout the brain, in all cell types, where
   it localizes to the mitochondrial membrane^[46]10. PINK1 is a
   mitochondrial ubiquitin kinase and, together with the cytosolic
   ubiquitin ligase PARKIN, it targets damaged mitochondria for
   degradation via mitophagy, performing a mitochondrial quality control
   function needed to prevent accumulation of damaged mitochondria, which
   otherwise results in neuronal cell death^[47]11–[48]13. The ILE368ASN
   mutation interferes with this process by reducing the interaction of
   PINK1 with its chaperone, HSP90, and destabilizing PINK1 at the
   mitochondrial membrane^[49]11. It also reduces its ubiquitin kinase
   activity through the deformation of its substrate-binding pocket and
   substrate misalignment^[50]11. Even though multiple publications have
   shown the involvement of PINK1 in mitophagy, its function is much
   broader. The targets of this kinase are involved in many cellular
   functions, including neuronal maturation^[51]14, neurite
   outgrowth^[52]15, suppression of mitophagy^[53]16, and cell cycle
   modulation^[54]17. The broader impact on these pathways of
   loss-of-function mutations in this important kinase has not yet been
   fully elucidated^[55]18.

   One of the main obstacles to the study of Parkinson’s disease is the
   death of mDA neurons and the resulting shortage of available postmortem
   samples. By the time of diagnosis, 60% of these neurons have
   disappeared and about 90% die by the late stage of the disease^[56]6.
   One approach is to study PD-associated mutations in animal
   models^[57]19, but human-like mutations in animals often do not lead to
   the development of comparable pathology due to species differences in
   expression of key genes^[58]20,[59]21. Thankfully, the development of
   cellular reprogramming allows nowadays for the conversion of somatic
   cells into induced pluripotent stem cells (iPSCs), which can
   subsequently be differentiated into neurons. This enables us to
   generate iPSCs from the skin cells of PD patients^[60]22 and
   differentiate them into mDA neurons carrying disease-associated
   mutations^[61]23–[62]25. Differentiating mDA neurons from iPSCs
   provides an almost unlimited source of neurons that allow for deep
   phenotyping and the elucidation of the cellular mechanisms underlying
   PD pathology.

   Here, we generated iPSCs from the fibroblasts of a patient homozygous
   for the PD-associated mutation ILE368ASN (p.I368N) in the PINK1
   gene^[63]2. We used an optimized differentiation protocol to
   specifically generate mDA neurons, as this cell type displays a unique
   susceptibility to cell death in PD;^[64]23,[65]25,[66]26 the effect of
   PD on other types of DA neurons is variable^[67]6,[68]27.

   The mDA neurons are unique and distinct from other DA neurons. Their
   development diverges from that of other DA neurons even before they
   commit to neural fate^[69]28. During early neural development, neural
   tube stem cells generate neurons and glia, the two basic building
   blocks of the brain. While other DA neurons follow this direct path,
   which is determined by the expression of the PAX6 transcription factor,
   mDA neurons develop from radial glial cells of the floor plate and
   their development is driven by early exposure to high levels of the SHH
   transcription factor^[70]29, which prevents expression of the PAX6
   transcription factor^[71]30 and sets these cells on an entirely
   different developmental path^[72]25,[73]28. As a result, mDA neuronal
   precursors follow a very unique signaling cascade, leading to the
   expression of a transcriptome that greatly differs from that of other
   DA neurons^[74]21,[75]25–[76]28,[77]31–[78]35. Their distinct identity
   is reflected in their function and current research indicates that this
   leads to their unique susceptibility to death in PD, which in turn has
   been directly associated with the classic movement symptoms of the
   disease^[79]6,[80]26–[81]28,[82]36. This is also supported by the
   observation that gene expression differences between murine and human
   mDA neurons, which translate into subtle functional differences, lead
   to incomplete PD phenotype in animal models^[83]19,[84]21.

   To investigate the disease mechanisms linked to the PINK1 mutation, we
   performed extensive single-cell RNA sequencing (SC-RNAseq) analysis
   using Drop-Seq^[85]37 at four different timepoints during mDA neuron
   differentiation^[86]23–[87]25. We generated four pairs of samples, each
   consisting of a PINK1-ILE368ASN and a control (17608/6^[88]38) cell
   line differentiated in parallel. The pairs were differentiated in
   succession so that they would be at a different stage of
   differentiation on the collection day (Fig. [89]1). This also means
   that they represent four independent biological replicates. Pairwise
   differential expression analysis was then performed between the
   PINK1-ILE368ASN and control cell line of each pair, with a constraint
   that genes must be strongly and consistently dysregulated in all pairs,
   hence at all timepoints, to be considered in our analysis. The reasons
   for this are listed in the discussion section. Using databases of known
   protein-protein interactions, we show that these genes form a network
   and that its members directly interact with all 19 protein-coding PARK
   genes associated with PD. This suggests that other PD-associated
   mutations may also be acting through this common network of genes.
   Furthermore, the pathway most strongly associated with the genes of
   this network is the Parkinson’s disease KEGG pathway. Subsequent
   proteomics analysis of differentiated cells confirmed the manifestation
   of the transcriptional modifications at the protein level. Our results
   point to the existence of a common disease mechanism that potentially
   underlies idiopathic PD and may represent a unifying perspective on PD
   progression that will guide future intervention strategies.

Fig. 1. Experimental design.

   [90]Fig. 1
   [91]Open in a new tab

   a Fibroblasts were used to generate human induced pluripotent stem
   cells (iPSCs), which were then used to generate mDA neurons.
   Differentiation was initiated concurrently in a PINK1-ILE368ASN and a
   control cell line, at three different timepoints, to obtain cells which
   reach different stages of differentiation on the same collection day
   (generating four independent pairs). The samples were collected and
   processed for SC-RNAseq at the same time to avoid batch effects.
   “P + 1” indicates that the iPSCs were passaged before new
   differentiation was initiated. Since D10 was not used in the pairwise
   analysis, we indicated “P + 2” between D15 and D6 differentiation
   initiation. b Heatmap illustrating the transitions in gene expression
   from iPSC markers (MYC and POU5F1-OCT3/4), to genes associated with mDA
   differentiation (PTCH1, FZDZ, HES1, OTX2, SLIT1, and LMX1A), and
   finally to an early expression of mature mDA markers (DCX and DDC).
   This is discussed in more detail in the text and in Figs. [92]3 and
   [93]4. The gene expression matrix used here consists of 4495 cells
   (39,194 genes). Colors correlate to normalized counts (z-score,
   centered, and scaled) of the indicated gene.

Results

   We performed a systematic differential expression analysis at a
   single-cell resolution between an iPSC line carrying the PD-associated
   ILE368ASN mutation in the PINK1 gene and age- and sex-matched control
   cell line (control 1–2 in ref. ^[94]38) during their parallel
   differentiation into mDA neurons (Fig. [95]1 and Supplementary
   Tables [96]1, [97]2). After preprocessing and quality-filtering, we
   used 4495 cells and 18,097 genes in our downstream analysis (Methods).
   For data integration, we performed a network analysis to identify the
   underlying key mechanisms of PD progression.

   Fibroblasts were isolated from a 64-year-old male with PD symptom onset
   at 33 years of age who was homozygous for the ILE368ASN
   (P.I368N/P.I368N) mutation in the PINK1 gene (Coriell Institute, Cat.
   No. ND40066). The fibroblasts were confirmed to have a normal karyotype
   (Supplementary Fig. [98]1). Reprogramming was done at Yale Human
   Embryonic Stem Cell Core (New Haven CT) using the Sendai virus. The
   normal karyotype of iPSCs was confirmed (Supplementary Fig. [99]2 and
   Supplementary Tables [100]9, [101]10). Their iPSC status was
   ascertained by staining for the POU5F1 (also known as
   OCT4)^[102]39–[103]42 and the TRA-1-80^[104]42,[105]43 iPSC markers
   (Fig. [106]2a), by expression of key iPSC status markers using
   sc-RNAseq (Fig. [107]2b), and by the TaqMan iPSC Scorecard
   Assay^[108]44,[109]45, which also confirmed the trilineage potential of
   the cell line^[110]44 (Fig. [111]2c).

Fig. 2. Classification of iPSC status.

   [112]Fig. 2
   [113]Open in a new tab

   a Immunocytochemistry (ICC). Staining for the iPSC markers POU5F1 (more
   commonly known as OCT3/4) and TRA-1-60 of iPSC colonies, prior to
   differentiation. DAPI was used to stain cell nuclei as a reference. b
   Expression of genes known to indicate iPSC status (MYC and POU5F1) and
   of genes identified by a differential expression analysis between iPSCs
   and differentiating cells (also see Supplementary Fig. [114]12). Colors
   correlate to normalized counts (z-score, centered, and scaled) of the
   indicated genes. TDGF-1 is expressed in iPS cells of high
   stemness;^[115]41 L1TD1, USP44, POLR3G, and TERF1 are essential for the
   maintenance of pluripotency in human stem cells;^[116]47–[117]50
   IFITM1, PRDX1, DNMT3B, DPPA4, and LIN28A and are associated with
   stemness^[118]51–[119]53,[120]137,[121]138. c Results of Scorecard
   analysis of iPSCs and embryonic bodies (EBs)^[122]44,[123]45. iPSCs are
   expected to show high expression of self-renewal genes and low
   expression of mesoderm, ectoderm, and endoderm markers. EBs are cells
   at an early stage of spontaneous differentiation. Scorecard analysis of
   EBs determines the iPSC line’s potential to differentiate into the
   three germ layers, hence, EBs are expected to express few or no
   self-renewal genes and to show expression of some mesoderm, ectoderm
   and endoderm markers: Ecto±, Meso±, Endo±.

Single-cell RNAseq (sc-RNAseq) analysis reveals gene expression panel for
direct classification of iPSCs’ stemness or pluripotency

   Staining for OCT/TRA proteins and Scorecard are common approaches for
   determining iPSC status (reviewed by Smith and Stein)^[124]42. Here we
   show that a panel of genes indicative of iPSC status is readily
   detectable by single-cell analysis and can be used to indicate iPSC
   status directly in the cells used in an sc-RNAseq experiment, rather
   than by staining or expression analysis of an independent sample, which
   in some cases may not reflect the iPSC status of the experimental
   sample. Furthermore, this may be useful in cases where the samples are
   no longer available, such as for data obtained from an sc-RNAseq data
   repository. In our dataset, we quantified the expression of genes
   commonly used to ascertain iPSC status (MYC^[125]46 and
   POU5F1^[126]39–[127]42) and showed that these can be readily detected
   by sc-RNAseq analysis (Fig. [128]2b). However, sc-RNAseq analysis comes
   with some limitations. In particular, it is not able to detect genes
   which are naturally expressed at low levels. We, therefore, created a
   list of genes associated with high stemness, i.e., expressed
   selectively in iPSCs exhibiting full stem cell phenotype, which are
   readily detectable in sc-RNAseq data, creating a link between an iPSC
   state characterized by standard techniques and a signature visible in
   sc-RNAseq data. The heatmap of top genes differentially expressed
   during the transition between iPSC and subsequent differentiation
   stages shows that the iPSCs express several genes associated with
   stemness (Fig. [129]2b). For instance, we detected the expression of
   TDGF-1, which was shown to be expressed by stem cells with the highest
   expression of stemness markers^[130]41. Additional genes expressed by
   the iPSCs were L1TD1, USP44, POLR3G, and TERF1 (essential for the
   maintenance of pluripotency in human stem cells^[131]47–[132]50), as
   well as IFITM1, DPPA4, and PRDX1 (associated with
   stemness^[133]51–[134]53).

   Based on our observations, we propose that the following panel of genes
   should provide a reliable indication of stemness in single-cell
   experiments: MYC (cMyc), POU5F1 (Oct4), LIN28A, TDGF-1, L1TD1, USP44,
   POLR3G, and TERF1 (Fig. [135]2b).

In vitro differentiation of iPSC-derived mDA neurons recapitulates the in
vivo process

   As stated by Bjorklund & Dunnett “expression of TH is not in itself
   sufficient to prove that a neuron is catecholaminergic, let alone
   dopaminergic”^[136]35. Hence, we made great effort to confirm that the
   neurons generated by our protocol display a true mDA phenotype.

   To confirm that our differentiation protocol (Supplementary
   Table [137]1) recapitulates the in vivo mDA differentiation path, we
   identified genes that are essential and specific to the in vivo mDA
   differentiation process (OTX2, EN1, LMX1B, LMX1A, FOXA2, MSX1, NR4A2,
   PITX3, and others) (Supplementary
   Table [138]3)^[139]25–[140]28,[141]33–[142]35 and analyzed their
   expression during the development of the control cell line at
   timepoints D0 (iPSCs), D6, D10, D15, D21 D26, D35, and D50, which
   represent the major developmental steps of the protocol (Fig. [143]3).

Fig. 3. In vitro differentiation of iPSC-derived mDA neurons recapitulates
the in vivo process.

   [144]Fig. 3
   [145]Open in a new tab

   a To illustrate the maturation of neuronal morphology and mDA status,
   differentiated neurons were stained at D25 and D35 for a neuronal
   marker MAP2 (red) and mDA markers (green): TH, PITX3, LMX1A, and DAT.
   While D25 neurons show short processes and low expression of mDA
   markers, D35 neurons show much longer axons and well-defined expression
   of mDA markers (green/red overlap resulting on orange/yellow). b
   Quantitation of mDA markers TH, ALDH1A1, and LMX1A, using absolute
   quantitation via qPCR. Each timepoint represents three independently
   differentiated biological replicates, amplified in duplicate. Standard
   error (SE) bars are the SE of biological replicates. The expression
   levels are standardized to total RNA and to the expression of the
   housekeeping gene GAPDH (see Methods). c Heatmap showing the expression
   of genes known from the literature to be involved and necessary for mDA
   neuron differentiation (Supplementary Table [146]3). Colors correlate
   to normalized counts (z-score, centered, and scaled) of the indicated
   genes. d The mDA differentiation gene expression profile recently
   published by Ásgrímsdóttir and Arenas (2020)^[147]21 was used to show
   the progression during differentiation, from iPSCs to radial glia
   (Rgl), to progenitors (Prog) and neuroprogenitors (NProg), and to early
   mDA neurons (DA). Genes used to determine the expression modules are
   listed in Supplementary Table [148]3. e Proportions of cells expressing
   the various phenotypes illustrated in (d). The gene expression matrix
   obtained by SC-RNAseq used here consists of 4495 cells (see Methods
   section).

   We first imaged cells at timepoints Day 25 and Day 35, as at this stage
   the cells should have developed mDA characteristics. Staining for key
   mDA protein markers TH, PITX3, LMX1A, and DAT, with MAP2 as a neuronal
   marker, confirmed the mDA phenotype (Fig. [149]3a). (The co-expression
   of these mDA markers with TH is shown in Fig. [150]4a.) At D25,
   neuronal cells possess only short processes and generally low mDA
   marker expression, but by D35 their axons are far longer and mDA marker
   expression is more defined and more robust. The mRNA expression of TH,
   LMX1A, and ALDHA1A was further validated by qPCR, and the trajectory of
   these genes’ expression indicated the development of mDA
   characteristics by Day 21 (Fig. [151]3b), in agreement with the imaging
   results at Day 25 (Fig. [152]3a).

Fig. 4. Classification of mDA status.

   [153]Fig. 4
   [154]Open in a new tab

   a TH positive neurons co-express mDA markers PITX3, LMX1A, and DAT in
   control (top) and PINK1 cell line (bottom), at D35. For images of
   individual targets see Supplementary Fig. [155]13 and for
   colorblind-friendly images see Supplementary Fig. [156]14. b Based on
   our SC-RNAseq data, cell lines cluster according to differentiation
   stage, indicating that gene expression is very homogenous between the
   control and the PINK1-ILE368ASN cell lines, which allows for the
   detection of even subtle alteration induced the presence of the PINK1
   mutation. c Trajectory of expression of TH and KCNJ6 (GIRK2), two mDA
   neuron markers. At D21 neurons begin to show TH expression, together
   with an expression of other mDA markers, which indicates that they are
   becoming early postmitotic mDA neurons^[157]25. Similar observations
   can also be made from an expression heatmap shown in Supplementary
   Fig. [158]12. The scale represents normalized counts.

   To characterize the differentiation process in more detail, we
   performed the sc-RNAseq analysis at eight timepoints of the
   differentiation process. Analysis of differentially expressed genes
   across timepoints revealed the expression of specific differentiation
   stage modules (Supplementary Fig. [159]12), in accordance with known in
   vivo stage-specific expression patterns (Fig. [160]3c and Supplementary
   Table [161]3). For example, in vivo, the development of mDA phenotype
   depends on the early high expression of Sonic Hedgehog (SHH), followed
   by the induction of Wnt signaling and the expression mDA-specific
   downstream pathways^[162]25,[163]26,[164]28 (Fig. [165]3c and
   Supplementary Table [166]3). Consistent with these in vivo
   differentiation steps, PTCH1, a receptor for SHH, and FZD7, a receptor
   for Wnt proteins (Fig. [167]3c) were among the highest-expressed genes
   on day 6 (D6) of the differentiation protocol. The presence of EN1 as a
   key entity was confirmed by qPCR (Supplementary Fig. [168]15), as its
   expression level was too low for detection by sc-RNAseq. The sc-RNAseq
   analysis again revealed that at Day 21 many factors that are specific
   to the mDA differentiation path, such as TCF12, ALCAM, PITX2, ASCL1,
   and DDC^[169]27,[170]54–[171]57, were among the most highly expressed
   genes (Fig. [172]3c and Supplementary Fig. [173]12). Overall, these
   observations confirm that our in vitro differentiation protocol does
   indeed recapitulate the in vivo differentiation of mDA neurons and
   produces genuine mDA neurons (PAX6-, ALDH1A1+, PITX3+, KCNH6/GIRK2+,
   NR4A2/NURR1+, and LMX1A+), rather than other types of DA neurons (PAX6+
   and ALDH1A3+) (Table [174]1 and Supplementary Fig. [175]3).

Table 1.

   DA neuron heterogeneity: mDA and non-mDA markers.
   Dopaminergic neuron type TH DDC AADC SLC6A3 DAT SLC18a2 VMAT PAX6 Other
   A8–10 midbrain dopaminergic neurons + + + + - ALDH1A1
   A11 periventricular nucleus (PVN) + + - + - ALDH1A3
   A12 arcuate nucleus (endocrine) + + + + - ALDH1A3 Dlx1
   A13 medial zona incerta + + - + +* ALDH1A3 Dlx1, SST
   A14 preoptic periventricular nucleus + + - + +* ALDH1A3
   A15 preoptic & endopeduncular area + - - +? + ALDH1A3
   A16 periglomerular cells, olfactory bulb + + + - + ALDH1A1
   A17 interplexiform cells in the retina + + + NKR
   [176]Open in a new tab

   When studying PD, it is important to ascertain that the DA neurons are
   in fact mDA neurons. In an in vitro differentiation system, simple
   marker combinations that normally distinguish mDA neurons, such as
   positional and anatomical information, are missing. We relied instead
   on molecular markers culled from the literature to monitor our
   differentiation protocols. (* A13 and A14 PAVH express PAX6 transiently
   during development. PAX6 is expressed early in development, whereas the
   remaining markers are expressed later and are markers of mature DA
   neurons. “?” indicates that the literature regarding the expression is
   not unanimous.)^[177]25,[178]34,[179]35,[180]139–[181]144.

   However, this assessment of the differentiation process of human mDAs
   was mostly based on the pattern of mDA differentiation gene expression
   in murine brains. We, therefore, compared our data with the recently
   outlined pattern of gene expression during human mDA^[182]21,[183]58.
   The pattern of gene expression during our in vitro differentiation of
   human iPSCs into mDA neurons matched the pattern of human mDA
   differentiation^[184]21 more closely than that of murine neurons,
   confirming the validity of our differentiation protocol.

   Using the gene expression groups associated with different stages of
   maturation, from radial glia (Rgl), progenitors (Prog), to neural
   progenitors (NProg), and finally to mDA neurons (DA)^[185]21
   (Supplementary Table [186]3), we could characterize the differentiation
   trajectory by the level of gene expression (Fig. [187]3d). We then used
   these gene groups to characterize individual cells with respect to
   their most likely cell type and determined the population dynamics by
   the percentage of cell types present at each timepoint (Fig. [188]3e).
   The analysis showed fast differentiation from iPSC state to a neuronal
   lineage by Day 6, and the subsequent maturation towards an mDA
   phenotype starting at Day 21, accompanied by the increasing prevalence
   of DA phenotype, from 20% at Day 21, to 28% at Day 26, and 61% at D35,
   after which it seemed to stabilize (Fig. [189]3d). This
   characterization further confirms that early mDA differentiation is
   achieved around Day 21.

The PINK1-ILE368ASN mutation is associated with persistently dysregulated
expression of nearly 300 loci

   To investigate the effect of the PINK1 mutation on mDA development, we
   differentiated the control and the PINK1-ILE368ASN cell lines in
   parallel (Figs. [190]1, [191]4) and focused on the early
   differentiation period, to increase our chances of finding the direct
   effects of PINK1-ILE368ASN, as described below. Co-staining of TH
   positive neurons with the midbrain dopaminergic markers PITX3, LMX1A,
   and DAT in both the control and PINK1 cell lines identified neurons at
   day D21 as early postmitotic mDA neurons^[192]25 with clearly neuronal
   morphology and no major differences between the cell lines
   (Fig. [193]4a).

   To investigate potential underlying mechanisms of the PINK1 mutation,
   differential expression between the two, in parallel differentiated,
   cell lines at each timepoint was determined and genes that were
   identified as differentially expressed at all four timepoints were
   identified. Each timepoint is an independent biological replicate,
   initiated at a different time and with cells of a different passage
   number. Control and PINK1- ILE368ASN cells co-clustered together based
   on their differentiation stage, from iPSCs, to day 6 (D6), D15, and D21
   (Fig. [194]4b), indicating that overall RNA expression was specific to
   differentiation stages, and rather uniform between cell lines, which
   was amenable to the identification of subtle gene expression
   differences due to the presence of a mutation in the PINK1- ILE368ASN
   cell line.

   The PINK1- ILE368ASN cells at D10 showed low viability, hence the D10
   timepoint was not included in the pairwise analysis. After
   preprocessing and quality-filtering (Methods and Supplementary
   Fig. [195]4), a total of 4495 cells (2518 control and 1977 PINK1 cells)
   and 18,097 genes were included in our analysis. UMAP analysis of the
   single-cell data revealed that gene expression was rather similar
   between the cell lines and mainly defined by differentiation stage,
   rather than by cell line origin (Fig. [196]4b). In accordance with the
   staining results (Figs. [197]3a, [198]4a), we observed the onset of
   expression of the mature mDA markers TH and KCNH6 (also known as GIRK2)
   on D21 (Fig. [199]4c).

   The analysis of pairwise differential expression at each timepoint
   (adjusted p values (p[adj]) <0.01 fold changes (FC) >0.1)
   (Fig. [200]5a) identified 14 genes that were upregulated and 13 genes
   that were significantly downregulated in the PINK1-ILE368ASN cell line
   compared to control (Fig. [201]5b and Table [202]2, indicated by “X”).
   Because iPSCs are very different from differentiating neuronal
   precursors, we next tested whether including iPSCs had
   disproportionately affected the results by excluding neuron-specific
   genes. Repeating the analysis on D6, D15 and D21 only identified 28
   genes that were upregulated and 27 genes that were downregulated at all
   three timepoints, including all genes previously identified
   (Table [203]2). As expected, excluding iPSCs resulted in the
   identification of a broader range of genes because genes that are
   differentially expressed only in the neuronal lineage were previously
   excluded due to the requirement that DEGs be dysregulated at all
   timepoints. However, both sets are equally valuable, as genes
   dysregulated even in iPSCs are likely to participate in systemic PD
   pathology, regardless of cell type, and may be relevant to a broader
   spectrum of PD pathology than the death of mDA neurons. Interestingly,
   most of the differentially expressed genes are already linked to PD,
   other PD mutations, or neurodegeneration (Table [204]2).

Fig. 5. Differentially expressed genes (DEGs) in a cell line homozygous for a
mutation in the PINK1 gene, compared to a control cell line, at three
timepoints during the differentiation of mDA neurons (D6, D15, and D21).

   [205]Fig. 5
   [206]Open in a new tab

   a Heatmap of the top DEGs. Each column corresponds to a timepoint for
   either control or PINK1 cells; each row shows the expression of one
   gene in individual cells. Colors correlate to normalized counts
   (z-score, centered, and scaled) of the indicated genes. b Top DEGs. The
   minimum fold change was increased to highlight the top differentially
   expressed genes. We identified the top 56 genes as our group A
   (Table [207]2); here we show the top five upregulated genes (left Venn
   diagram) and the top three downregulated genes (right Venn diagram). c
   Enrichment analysis performed using the STRING^[208]59 database. The
   top KEGG pathway associated with this dataset is Parkinson’s disease.
   The other three KEGG pathways identified were Spliceosome, Huntington’s
   disease, and Thermogenesis. Details are listed in Supplementary
   Data [209]5. The gene expression matrix used for the downstream
   analysis consisted of 4495 cells (39,194 genes) and differential
   expression analysis resulted in 292 DEGs, which were used to perform
   the enrichment analysis.

Table 2.

   The top genes dysregulated consistently in PINK1 vs. control cells
   across differentiation stages.
   Upregulated in PINK1 Downregulated in PINK1
   GENE incl. iPSCs excl. in STRING PD association Ref. GENE incl. iPSCs
   excl. in STRING PD association Ref.
   [210]AC009245.3 X Pseudogene ACTN1 PD ^[211]145
   ADGRG7 X rare var., mito ^[212]146 C6ORF48 X
   BBS2 CCDC144NL.AS1 X RNA
   CALM2P2 X Pseudogene CD59 X PD ^[213]147
   CALR PD ^[214]148 COMT PD ^[215]149
   CECR1 CRYZ X GWAS, PD Gene ^[216]63
   CMTM8 GWAS, PD ^[217]63 CYFIP1 X (via mTOR) ^[218]150
   EFCAB2 X microarray ^[219]151 DLK1 PD ^[220]152
   FOS rat, L-DOPA ^[221]153 ENAH GWAS, LRRK2 ^[222]154
   GOPC PARK7 (DJ-1) ^[223]63 EXOC5 Parkinson Dis.Map ^[224]155
   HNRNPC X binds Parkin ^[225]101 GPC3 reduced in DJ-1 mut. ^[226]62
   MALAT1 PD ^[227]156 HSPA8 X PD, LRRK2 ^[228]157
   MINOS1P3 X Pseudogene LGI1 PD ^[229]119
   MLF1 via HTRA2 ^[230]158 LMAN1 Parkin transloc. ^[231]104
   MORF4L1P1 X Pseudogene MYL12A binds Parkin ^[232]101
   MT-CYB mito ^[233]159 MYL6 X interacts with LRRK2 ^[234]160
   MTRNR2L1 X binds Parkin ^[235]101 NIPA2 X tremor ^[236]161
   NAP1L5 OSBPL8 via ZNF746, Biogrid ^[237]162
   NLRP2 X inflammasome ^[238]163 PALLD PD ^[239]164
   PTGR1 PGK1 X PD ^[240]165
   RP11.692N5.2 X Pseudogene RANBP1
   RP4.765C7.2 X Pseudogene SLC25A4 binds Parkin ^[241]101
   RSRP1 SNHG5 X RNA via miR-205, LRRK2 ^[242]166
   S100A6 X PD ^[243]167 SNHG8 RNA
   TCEAL5 X TYW3 X
   TSPYL5 Ubiquit. ^[244]168 ZNF37A X
   ZNF280D X GWAS* ^[245]169 ZNF880 X
   ZNF528
   ZNF528.AS1 RNA Gene
   [246]Open in a new tab

   Pairwise differential expression analysis of each timepoint (iPSCs, D6,
   D15, and D21), resulted in 14 genes that were upregulated and 13 genes
   that were downregulated in the PINK1-ILE368ASN cell line, compared to
   control (p_val_adj <0.01 and abs(avg_logFC) >0.1); these genes are
   marked with “x” in column “Incl. iPSCs”. Twenty-nine additional genes
   were identified in an analysis that included D6, D15, and D21, but not
   iPSCs. The “Excluded” column explains why a gene was not included in
   the protein–protein interaction network. These 56 top DEGs are later
   referred to as Group A. The gene expression matrix used for the
   downstream analysis consisted of 4495 cells (39,194 genes). *
   rs11060180.

   For an alternative definition of differentially expressed genes (DEGs),
   we used the maximum adjusted p value in a pairwise combination as an
   adjusted p value, and the average fold change that occurred in the
   pairwise comparison as a fold change threshold. With this approach, we
   retained only genes dysregulated in the same direction at all
   timepoints. This analysis led to 151 DEGs (named Group B, Supplementary
   Table [247]4), which included the previously identified genes of Group
   A, and of which 65 were upregulated and 86 downregulated compared with
   controls (p[adj] < 0.01 and FC > 0.1). Taking the mean of FC of the
   different timepoints enhanced the identification of DEGs because it
   reduced the effect of the variability between pairs due to their
   different differentiation states. Repeating the same analysis for the
   four timepoints (iPSCs, D6, D15, and D21), but taking into account only
   the absolute degree of change in iPSCs, yielded 172 genes (Group C,
   Supplementary Table [248]5). Repeating the analysis using only
   timepoints D6, D15, and D21 identified a total of 286 DEGs (Group D)
   (also see Fig. [249]6a and Supplementary Data [250]1). Together, when
   all analyses were pooled, we obtained 292 DEGs (six genes in Group C
   depended on the inclusion of iPSCs and did not appear in Group D, see
   Supplementary Data [251]1).

Fig. 6. Network analysis.

   [252]Fig. 6
   [253]Open in a new tab

   a Protein–protein interaction network based on known interactions
   available through the STRING^[254]59 and GeneMANIA^[255]61 databases.
   Only strong interactions were retained, predicted interactions or text
   associations were omitted (see Methods). Betweenness centrality was
   used to illustrate the relative importance of each node within the
   network through the level of its connectedness to other proteins. The
   larger the circle, the more partners the node is connected to. The
   colors represent the four DEG sets, with the top 56 DEGs (group A) in
   light blue, group B in purple, group C in dark green, and group D in
   light green. Each set consists of genes of the previous set plus
   additional genes identified by the new parameters. CHCHD2 (pink, part
   of group B) is a DEG, which has recently been identified as a PARK
   gene. Random selection of genes from genes detected by sc-RNAseq did
   not lead to a network formation (Supplementary Fig. [256]5). b DEGs
   which play a role in ubiquitination. Additional functional pathways are
   listed in Supplementary Fig. [257]7 and Supplementary Data [258]3.
   Specific connections to ubiquitination proteins are shown in
   Supplementary Fig. [259]8. c Based on the literature, 68% of the DEGs
   of this network are already known to be associated with PD (for
   references see Supplementary Data [260]4). Supplementary Fig. [261]9