Highlights
• CGD gene therapy efficiency depends on the level of chronic inflammation
• Loss of gene-corrected HSCs correlates with the level of interferon score in HSCs
• Chronic inflammation drives HSC exhaustion and increased myeloid progenitors
• Machine learning identified 51 predictive markers of engraftment failure

Sobrino et al. report heterogeneous efficacy of a gene therapy approach for CGD patients. Transcriptomic exploration at the single-cell level shows lower HSC frequency and elevated myeloid progenitor frequency in the most severe patients. This study unravels 51 markers in HSCs predictive of the engraftment failure of gene-corrected cells.

Introduction

Chronic granulomatous disease (CGD) is a recessive inborn error of immunity[89]^1^,[90]^2 caused by loss-of-function (LOF) mutations in the X-linked or autosomal genes that encode the five components of the NADPH oxidase complex.[91]^3 The complex’s membrane-bound catalytic core is a heterodimer of gp91^phox and p22^phox, encoded respectively by the X-linked CYBB gene and the autosomal CYBA gene. The regulatory part is a cytosolic heterotrimer composed of p40^phox, p47^phox, and p67^phox, encoded respectively by NCF4, NCF1, and NCF2.[92]^4 Following phagocyte activation, the NADPH oxidase complex assembles on the phagosomal membrane and produces reactive oxygen species (ROS). Patients with CGD suffer from specific, recurrent, invasive, life-threatening bacterial and fungal infections.[93]^5^,[94]^6 Prominent inflammatory manifestations (particularly affecting the respiratory and gastrointestinal tracts) are also common, especially in patients with the X-linked form of the disease.[95]^7^,[96]^8 In some patients, CGD is revealed by these inflammatory manifestations. 
Others present initially with unexplained granulomatosis, which is associated with a poor prognosis.[97]^8^,[98]^9 Patients routinely receive antimicrobial prophylaxis and, eventually, anti-inflammatory treatments according to the clinical manifestations. The only widely available, curative treatment is allogeneic hematopoietic stem cell transplantation (HSCT).[99]^10^,[100]^11^,[101]^12^,[102]^13 After conditioning, CD34^+ hematopoietic stem and progenitor cells (HSPCs) are transplanted; engraftment of the most immature hematopoietic stem cells (HSCs) in the bone marrow (BM) then enables full immune reconstitution. In the absence of a compatible donor for HSCT, gene therapy (GT) is a treatment option. Although several research groups have developed GT protocols for CGD, the previous clinical trials have been compromised by the absence of stable engraftment of the gene-corrected cells.[103]^14 In the first trials (without a conditioning regimen), the lack of engraftment was probably due to the absence of a selective advantage for the transduced cells.[104]^14^,[105]^15 The use of a low-intensity conditioning regimen in the subsequent trials with gammaretroviral vectors resulted in temporary engraftment, although insertional mutagenesis favored the development of myelodysplastic syndromes in a few patients.[106]^16^,[107]^17 More recently, significant improvements were achieved after the gammaretroviral vectors were replaced by self-inactivating lentiviral vectors in which a chimeric internal promoter drove gp91^phox expression specifically in myeloid cells.[108]^18 The combination of this new-generation vector with a full busulfan-based myeloablative conditioning regimen resulted in significantly better clinical and biological outcomes and a better safety profile (i.e., the absence of GT-related adverse events) in two trials (a trial in London sponsored by Généthon [[109]ClinicalTrials.gov identifier: [110]NCT01855685] and Kohn et al.'s investigator-led trial in 
the United States [[111]NCT02234934][112]^19). In 9 of the 13 treated patients, the stably engrafted cells cured the underlying X-linked CGD (X-CGD). Several recent studies have reported that chronic inflammation harms HSPCs in patients with CGD and patients with other conditions. Mice and humans with X-CGD have low HSC counts in the BM.[113]^20 Furthermore, human HSCs from patients with CGD showed rapid exhaustion after in vitro culture. In the presence of high levels of pro-inflammatory cytokines (such as IL-1β), mouse HSCs showed increased cycling and a lower long-term engraftment potential.[114]^20 Elevated levels of IL-18 and interferon γ (IFN-γ) have been observed in inflamed tissue from patients with CGD.[115]^21 Here, we report the results of a phase I/II clinical trial of GT (based on a G1XCGD lentiviral vector and gene-modified HSPCs) in four patients with X-CGD lacking a human leukocyte antigen (HLA)-compatible donor for HSCT. A fifth patient was included in the clinical trial but was not treated because the investigational medicinal product (IMP) did not meet the release criteria. The degrees of cell engraftment and clinical efficacy varied markedly from one patient to another. To fully understand the molecular alterations in HSCs associated with the success or failure of GT for CGD, we profiled the transcriptome of HSPCs at the single-cell level. We found that the gene-corrected cell engraftment defect observed in two patients was correlated with the upregulation of the IFN pathway. Last, we identified a set of biomarkers (including IFI44L and CEBPB) that were predictive of GT failure. Results Clinical presentation of patients We performed a nonrandomized, open-label, phase I/II clinical study ([116]ClinicalTrials.gov identifier: [117]NCT02757911) including five patients with X-CGD (referred to hereafter as P1 to P5). The patients had a severe deficiency in gp91^phox protein, due to a mutation in the CYBB gene and the absence of NADPH oxidase activity. 
Four patients received autologous CD34^+ cells transduced with a lentiviral GT vector after cryopreservation of the IMP. One patient (P3) was not treated because the level of CD34^+ cell transduction did not meet the release specification. The patients’ age at the time of GT ranged from 8 to 28 years. All four treated patients had severe X-CGD-related infections, some of which were active at the time of GT. Three of the patients (P1, P2, and P5) also had severe inflammatory manifestations ([118]Table 1). Prior to GT, all four patients received standard antimicrobial/antifungal prophylaxis and (for those with inflammatory features) long-term anti-inflammatory treatments.

Table 1. Clinical features of the patients with X-linked CGD before and after GT, and characteristics of the infused, gene-corrected, autologous cell product

Feature | P1 | P2 | P3 | P4 | P5
Age at GT (years) | 8 | 19 | 6 | 23 | 28
X-CGD (CYBB) mutation | c.779C>G; (p.Pro260Arg); exon 7 | c.1083G>A; (p.Trp361X); exon 9 | c.736C>T; (p.Gln246X); exon 7 | c.469C>T; (p.Arg157X); exon 5 | c.736C>T; (p.Gln246X); exon 7
Follow-up post-GT (months) | 60 | 36 | not infused | 48 | 24
HSPC source | BM + MPB | MPB | MPB | MPB | MPB
HSPCs infused (10^6/kg) | BM 3; MPB 5.86 | 3.24 | not infused | 15.67 | 14.27
VCN drug product | BM 0.6; MPB 1.2 | 1.26 | 0.35 | 1.73 | 1.57
Transduction adjuvant: PGE2 | − | + | − | + | +
Busulfan conditioning, AUC (ng × h/mL) | 85,478 | 77,330 | not infused | 71,973 | 73,890
DHR^+ neutrophils (%) at 12 months | 14 | 1 | not infused | 33 | <1
DHR^+ neutrophils (%) at last follow-up | 27 | <1 | not infused | 47 | 1
Treatments ongoing at GT | antimicrobial prophylaxis, steroids, enteral nutrition, sleeping O2 treatment | antimicrobial prophylaxis, steroids | antimicrobial prophylaxis, steroids | antimicrobial prophylaxis, antifungal treatment | antimicrobial prophylaxis, hydroxychloroquine
Infectious history (before GT) | multiple deep abscesses (S. marcescens), recurrent pneumonitis | tibial osteomyelitis, liver abscess, Actinomyces hepatic abscess with portal hypertension | mucormycosis, aspergillosis, Clostridium | severe invasive pulmonary aspergillosis, pneumonitis, salmonellosis, folliculitis, cervical adenitis | aspergillosis, pneumonitis, salmonellosis, genitourinary Campylobacter, osteitis (S. marcescens)
Inflammatory history (before GT) | corticodependent inflammatory colitis, pulmonary granuloma | corticodependent, long-lasting granulomatous cystitis resistant to multiple treatments | early colitis, granulomatous gastritis | folliculitis | long-lasting severe colitis resistant to multiple treatments
Clinical follow-up after GT | lung scan improvement, progression/relapse of gut inflammation requiring anti-inflammatory treatment | decreased corticosteroids; at month 11, pulmonary aspergillosis infection, gastric hemorrhage; at month 24, lymphadenitis; at 3.5 years, MMUD HSCT, septic shock, deceased | not infused | marked improvement in thoracic fungal lesions; disappearance of folliculitis; off treatment | clinically stable; antimicrobial prophylaxis; at month 7, submandibular lymphadenopathy

AUC, area under the curve; BM, bone marrow; MPB, mobilized peripheral blood; GT, gene therapy; PGE2, prostaglandin E2; HSCT, hematopoietic stem cell transplantation; VCN, vector copy number; MMUD, mismatched unrelated donor; MRI, magnetic resonance imaging.

In particular, P1 (8 years of age at the time of GT) presented with many deep abscesses, gut inflammation, and severe lung disease with infectious and inflammatory components; hence, P1 was receiving oxygen therapy and enteral nutrition in addition to steroids and antimicrobial prophylaxis. P2 and P5 (respectively 19 and 28 years of age at the time of GT) had similar clinical profiles, with very severe, long-lasting, corticoresistant inflammation and typical CGD-associated infections. Since infancy, P2 had presented with treatment-resistant granulomatous cystitis. 
He also had a history of tibial osteomyelitis and actinomycotic abscesses of the liver with portal hypertension, requiring surgery. P5 presented with long-lasting severe colitis that was refractory to various anti-inflammatory treatments, together with pulmonary aspergillosis, osteitis, and Campylobacter and Salmonella infections. In contrast to the other patients, P4 (23 years of age at the time of GT) did not have a history of inflammation but presented with life-threatening, invasive, treatment-resistant pulmonary aspergillosis, Salmonella infections, cervical adenitis, and folliculitis. In view of P4’s critical condition and the absence of other treatment options, compassionate-use GT was authorized; at that time, the gene correction process was being optimized by the addition of prostaglandin E2 (PGE2) (see below).

Manufacturing and characteristics of the IMP

IMPs were manufactured from HSPCs harvested from the BM (P1) or via leukapheresis (P1 to P5) after granulocyte colony-stimulating factor (G-CSF)/plerixafor cell mobilization. The gene-corrected cells were infused after targeted myeloablative conditioning (median [range] area under the curve for total exposure to busulfan: 75,610 [71,973–85,478] ng × h/mL) ([120]Table 1 and [121]STAR Methods). The infused CD34^+ cell doses ranged from 3.0 to 15.67 × 10^6/kg. The infused IMPs are described in detail in [122]Table 1, and the transduction procedure is summarized in [123]Figure 1A (see also [124]STAR Methods).

Figure 1. Patient follow-up after gene therapy, and clinical improvements (A) Clinical trial scheme. MPB, mobilized peripheral blood; D, day; IMPs, investigational medicinal products; TD, transduction. ∗Corticoid treatment only for P1; ∗∗PGE2, prostaglandin E2 only for P2, P4, and P5. (B) The VCN per cell, measured as a guide to changes over time in the level of gene marking in the treated patients’ CD15^+ neutrophils. 
(C) Percentage of neutrophils positive for dihydrorhodamine (DHR) 123 after phorbol myristate acetate stimulation, for each patient and at different time points post-GT. (D) gp91^phox membrane expression on neutrophils pre-GT and post-GT (last follow-up) for each patient, as measured with flow cytometry. (E) An axial chest CT scan (lung window) of P1 pre-GT and 48 months afterward (post-GT). Before GT, P1 presented with large ground-glass opacities, corresponding to pulmonary granulomas. Thoracic MRI (with a T2 fat-saturation sequence) of P4 before GT (repetition time [RT] 14,118 ms; echo time [ET] 95.48 ms) and 3 months after GT (RT 10,610 ms; ET 86.72 ms). For P4, the arrow indicates lesions caused by Aspergillus infection. P1 received a final IMP containing genetically modified CD34^+ HSPCs sourced from BM and mobilized peripheral blood (MPB) (G-CSF + plerixafor-mobilized leukapheresis), as specified in the initial protocol design. For P2, a low yield of CD34^+ cells after BM harvest prevented gene correction, and so the unmodified cell product was cryopreserved. He underwent two subsequent aphereses leading to below-specification levels of gene transduction. As a consequence, the transduction protocol was modified by the addition of PGE2 (which reportedly favors HSC transduction and repopulation).[127]^22 The addition of this transduction adjuvant considerably improved the level of lentiviral transduction (see [128]STAR Methods and [129]Figure S1). The following three procedures (for P4, P2, and P5) were therefore performed with the optimized protocol, starting from G-CSF + plerixafor (P4, P5)- or plerixafor-only (P2)-mobilized leukapheresis. 
Using classical HSPC phenotyping, we did not detect differences in the HSC frequency (defined as CD34^+Lin^−CD38^−CD133^+CD90^+CD45RA^− cells), either in the IMP ([130]Figures S2A and S2C) or in the apheresis product before engineering ([131]Figures S2A and S2B), and we showed that the patients had received similar doses of HSCs per kilogram ([132]Figure S2D). Indeed, the addition of PGE2 during the transduction step was associated with a significantly greater vector copy number (VCN) both in pre-clinical tests and in the IMPs (p = 0.0022 and p = 0.0357, respectively) ([133]Figures S1A and S1B). Moreover, transcriptomic analysis of PGE2-treated and control HSPCs from P2 and P4 showed that the addition of this adjuvant was associated with a less inflammatory expression profile ([134]Figure S1C). For the four treated patients, the median (range) VCN in the IMP was 1.42 (0.99–1.73).

Clinical outcomes

After a myeloablative conditioning regimen, patients were infused with the IMP containing genetically modified HSPCs. The infusion was well tolerated in all cases. The only adverse events were related to the conditioning (e.g., mucositis), rather than the IMP. P1 presented with Staphylococcus epidermidis sepsis 6 days after infusion of the IMP, and P2 presented with cytolysis and cholestasis 14 days after infusion. These adverse events resolved after engraftment, and hematopoietic reconstitution was satisfactory for all patients; the median (range) time to neutrophil and platelet engraftment was 18.5 days (15–21 days) ([135]Figure S3A). As of January 2022, the median (range) follow-up period was 42 months (24–60). For all treated patients, the VCN in neutrophils ranged from 0.17 to 0.96 in the first month post-GT. P1 showed an initial decrease in the level of gene marking, which stabilized at around 10%–15% after a few months ([136]Figure 1B). 
Although this level was not optimal, it provided P1 with clinical benefit, particularly with regard to the regression of infectious manifestations and as shown by the post-GT lung scan results ([137]Figure 1E); this enabled P1 to discontinue nocturnal oxygen therapy, enteral nutrition, steroids, and antimicrobial prophylaxis. However, the inflammatory manifestations continue to progress (particularly in the gut and lung), requiring the recent introduction of Janus kinase (JAK) 1 and 2 inhibitors. In P2 and P5, a progressive decrease in the engraftment of gene-corrected cells was observed 2 to 3 months after GT, and the patients regressed to their pre-GT condition ([138]Figure 1B). Similar results were observed for monocytes, B cells, natural killer (NK) cells, and T cells. The level of gene marking was lower in T cells, given the absence of T cell depletion during the conditioning ([139]Figure S3B). Due to the recurrence of inflammation and infections, P2 underwent HSCT with an unrelated, partially matched donor (1 of 10 HLA alleles was mismatched) 3.5 years after GT. Twenty-seven days after the HSCT, P2 developed ultimately fatal septic shock. Seven months after infusion of the IMP, P5 presented with submandibular lymphadenopathy that resolved progressively with oral antibiotic treatment. The patient continued to receive antimicrobial prophylaxis, and his clinical condition is currently stable. Shortly after GT, P4’s life-threatening lung aspergillosis resolved completely, with ad integrum healing of the lytic costal erosions as early as 4 months post-GT ([140]Figure 1E). This stable, clinical benefit was associated with the presence of functional circulating neutrophils (50% of the normal proportion) ([141]Figure 1C). P4 resumed his education and is now working full time. He discontinued all treatments 2 months post-GT. 
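The median (range) values quoted for the four treated patients can be checked against the per-patient entries in Table 1. The short Python sketch below recomputes two of them, busulfan exposure and follow-up duration; the dictionary and function names are illustrative only and are not part of the study's analysis code.

```python
from statistics import median

# Per-patient values transcribed from Table 1 (treated patients only: P1, P2, P4, P5).
busulfan_auc = {"P1": 85478, "P2": 77330, "P4": 71973, "P5": 73890}  # ng x h/mL
follow_up_months = {"P1": 60, "P2": 36, "P4": 48, "P5": 24}


def median_and_range(values):
    """Return (median, minimum, maximum) for a collection of numbers."""
    ordered = sorted(values)
    return median(ordered), ordered[0], ordered[-1]


auc_summary = median_and_range(busulfan_auc.values())
follow_up_summary = median_and_range(follow_up_months.values())
print("Busulfan AUC (median, min, max):", auc_summary)
print("Follow-up in months (median, min, max):", follow_up_summary)
```

With an even number of patients, the median is the mean of the two middle values, which is why the reported busulfan median does not coincide with any single patient's exposure.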
The gene marking results were correlated with the results of the functional oxidative burst (dihydrorhodamine 123 [DHR]) assay; the proportion of positive neutrophils stabilized at 27% and 47% of neutrophils, for P1 and P4, respectively ([142]Figure 1C). At last follow-up, the expression of gp91^phox protein in CD15^+ neutrophils was significantly enhanced for P1 and P4 ([143]Figure 1D). The analysis of vector integration over time in the patients highlighted the polyclonal reconstitution of both peripheral blood mononuclear cells (PBMCs) and neutrophils ([144]Figure S4). The mean (range) number of unique integration sites at last follow-up was 3,252 (119–10,650) in PBMCs and 4,865 (158–15,344) in neutrophils. Lower values were observed for P2 and P5, due to the progressive loss of gene-corrected cells. Integrations close to oncogenes (such as MECOM [MDS/EVI1]) previously targeted by gammaretroviral vectors were present at a low frequency (below 2%) in all patients and did not increase over time.

A low frequency of HSCs and a high frequency of myeloid progenitors

To understand the interindividual differences in engraftment, we analyzed transcriptomic differences in PBMCs and HSPCs from patients versus healthy donors (HDs). This analysis highlighted the upregulation of the type 1 and 2 IFN response pathway in PBMCs and in HSPCs ([145]Figures S5A and S5B, respectively). We also used the ROMA method[146]^23 to quantify the activity of sets of genes in individual samples. This analysis did not reveal any interindividual differences among the patients’ PBMCs. However, P2’s and P5’s HSPCs had a higher IFNα score, and a more intense IFNγ score, relative to the other patients. In contrast, P4 (the patient with the best engraftment of gene-corrected cells) displayed only a slightly higher IFNα score ([147]Figures S5C and S5D). 
To further explore these interindividual differences in HSPC subpopulations, we performed single-cell transcriptomic analyses to determine the transcriptional profiles of 53,412 MPB HSPCs from the four treated patients with CGD and four HDs ([148]Figure 2A). Due to the continuum between the various stem and progenitor cells, the use of classical clustering analysis to distinguish the different cell types remains challenging.[149]^24 For this reason, and to preserve the strength of the single-cell resolution, we used an automated cell annotation method, Cell-ID,[150]^25 and 16 different BM HSPC reference signatures ([151]Table S1).[152]^26 We developed a single-cell RNA-sequencing (RNA-seq) pipeline for HSPC data processing and analysis as depicted in [153]Figure S6.

Figure 2. Low HSC frequency and elevated myeloid progenitor frequency, revealed by single-cell HSPC transcriptional mapping (A) Unsupervised analysis of 53,412 cells from merged CD34^+ MPB from four patients with CGD and four HDs, represented as two-dimensional UMAP plots. Each individual cell in our dataset was annotated using the Cell-ID method and reference BM HSPC signatures ([156]Table S1). HSC, hematopoietic stem cell; HSC-enriched, hematopoietic stem cell enriched; MPP, multipotent progenitors; MLP, multipotent lymphoid progenitors; ImP1 and ImP2, immature myeloid progenitors; NeutroP0, NeutroP1, NeutroP2, and NeutroP3, neutrophil progenitors; MonoDCP, monocyte and dendritic cell progenitors; BcellP, B cell progenitors; MEP1 and MEP2, megakaryocyte and erythrocyte progenitors; EryP, erythroid progenitors; MkP, megakaryocyte progenitors; EoBasMastP, eosinophil, basophil, and mast cell progenitors; NA, not annotated. (B) UMAP plots of 30,225 HSPCs from four HDs (left) and of 23,187 HSPCs from four patients with CGD (right), showing the HSC, BcellP, NeutroP0, and MonoDCP subpopulations. 
(C) Bar plots showing the percentages of HSC, BcellP, NeutroP0, and MonoDCP subpopulations in each patient and each HD. The HSC frequency was significantly lower in CGD patients than in the HDs (odds ratio [OR] = 0.53, 95% confidence interval [CI] = 0.33–0.46, p = 2.2 × 10^−16, using logistic regression). We also observed lower HSC frequencies in P2 and P5 than in P1 and P4 (OR = 0.58, 95% CI = 0.48–0.70, p = 3.78 × 10^−8, using logistic regression). The frequencies of the other cell types are shown in [157]Figure S7D.

To define the most immature HSC subpopulation, we used a diffusion mapping approach to determine the origin in the trajectory map ([158]Figures S7A–S7C). We then compared the frequency of each subpopulation in the patients with that in the HDs ([159]Figures 2B and 2C). We showed that CGD patients were likely to have around half the number of HSCs found in HDs (odds ratio [OR] = 0.53; p = 2.2 × 10^−16, using logistic regression). Importantly, patients with engraftment failure (P2 and P5) had an even lower proportion of HSCs than patients with successful engraftment (P1 and P4) (OR = 0.58, p = 3.78 × 10^−8, using logistic regression). Moreover, P2 presented a high frequency of B cell progenitors and monocyte/dendritic cell progenitors ([160]Figure 2C). P5 had a high proportion of neutrophil progenitors (NeutroP0; [161]Figure 2C) and a low proportion of immature progenitors (ImP1; similar to common myeloid progenitors, [162]Figure S7D). No differences in frequencies of the various HSPC cell types were observed for the other two patients (P1 and P4), relative to HDs ([163]Figure S7D).

An aberrant HSC profile, with a mixture of HSC/neutrophil signatures

To further explore the abnormally large NeutroP0 subpopulation in P5, we used the Cell-ID method to identify cells that simultaneously matched multiple cell-type signatures. 
Thus, by testing each individual cell against the 16 reference signatures, each individual cell could be identified as simultaneously displaying signatures from several cell types ([164]Figure S8A). In fact, the majority of cells (ranging from 61% to 90%) matched two or more signatures, while cells matching a single signature accounted for between 6% and 26% of the total population ([165]Figure S8B). For example, the NeutroP0^Match population ([166]Figure S8C) encompassed not only the NeutroP0^ID population ([167]Figure S8D) but also cells that displayed other top signatures and yet showed significant enrichment for the NeutroP0 gene signature. An UpSet plot of the various mixed signatures showed that there were 19 distinct combinations of the NeutroP0 signature with other cell types in P5 but only three distinct combinations in HDs (black arrow, [168]Figure S8E). We therefore looked further at the most frequent combinations in P5 that comprised NeutroP0 signatures ([169]Figure 3). This analysis revealed that 438 cells matched the NeutroP0, MPP, and All HSC signatures. This mixed signature (depicted in black in the uniform manifold approximation and projection [UMAP] plot, [170]Figure 4A) was found in P5 but not in the other patients or the HDs. In this pathological subpopulation, the top genes (defined using Cell-ID) included the CEBPβ transcription factor ([171]Figure 4B), which is typically expressed more in committed myeloid progenitors. In HD cells, CEBPβ was expressed from the NeutroP3 stage onward. In P5’s cells, CEBPβ was expressed in the most immature HSC subpopulation and in early HSC progenitors and the MPP population ([172]Figure 4C). These results strongly suggest that P5 not only presents a large NeutroP0 population but also has a strong alteration in the most immature HSC state, with aberrant expression of the CEBPβ myeloid factor.

Figure 3. 
Identification of HSPC subpopulations displaying mixed signatures UpSet plots showing the number of cells significantly matching one or more cell types (see [175]Figure S8A): All HSC, MPP, MonoDCP, BcellP, and NeutroP0 signatures and Others (i.e., the 11 other HSPC cell-type signatures) are shown. The top 12 lineage combinations are shown for each patient and for two HDs. A full UpSet plot covering the 16 cell types is shown in [176]Figure S8D. The arrows indicate cells identified as simultaneously displaying signatures for NeutroP0 and other cell type(s). P5 presented a larger number of NeutroP0 cells with mixed signatures compared with other patients and HDs. Dotted lines highlight the predominant cell types in P5 (NeutroP0) and P2 (MonoDCP and BcellP).

Figure 4. HSCs with an altered state and aberrant CEBPB expression (A) UMAP plots of cells significantly matching the All HSC and NeutroP0 signatures for each HD (n = 4) and CGD patient (n = 4). Cells matching All HSC and NeutroP0 cell types are shown in black, and cells matching with only one cell type are shown in red (All HSC) and green (NeutroP0). (B) UMAP visualization of CEBPB mRNA expression in each HD (n = 4) and CGD patient (n = 4). Normalized expression is represented by a color-coded gradient. (C) Boxplots of CEBPB mRNA expression in the HSC, HSC-enriched, MPP, NeutroP0, NeutroP1, NeutroP2, and NeutroP3 populations, in each HD (n = 4) and CGD patient (n = 4).

In P2, the Cell-ID analysis highlighted a large number of cells with mixed BcellP-MonoDCP signatures ([179]Figure 3). This mixed signature was also detected (albeit to a lesser extent) in HDs and in the other patients ([180]Figure S9A). 
The gene signature detected with Cell-ID in this mixed BcellP-MonoDCP population showed expression of genes that are known to have a role in B cell and dendritic cell lineages (e.g., IRF8, SPIB, TLR7, and TNFRSF17) and that were not detected in the equivalent population of HDs (in red in [181]Figures S9B and S9C). These results emphasized the closeness of the relationship between the BcellP and the MonoDCP lineages, both of which were unusually prominent in P2.

The interferon pathway score highlights HSC alterations correlated with poor engraftment

To better understand the molecular changes in the most immature HSCs, we used a model-based analysis of single-cell transcriptomics (MAST) to define differentially expressed genes (DEGs) in patients versus HDs.[182]^27 We identified 369 DEGs in the most immature HSC subpopulation and then tested for functional enrichment using the Molecular Signature Database.[183]^28 The IFNα, IFNγ, and TNFα pathways were more prominent in patients with CGD than in HDs ([184]Figure 5A).

Figure 5. An elastic-net model identified the inflammatory genes in HSCs that are predictive of engraftment failure after GT (A) MAST identified 369 DEGs in the HSCs in CGD patients (n = 4) versus HDs (n = 4). The top 10 pathways (in terms of p value, identified using a hypergeometric test and MSigDB and Hallmark gene sets) among the 369 DEGs in the HSCs are shown. In each pathway, genes that are upregulated in CGD (relative to HDs) are shown in red, and those that are downregulated in CGD are shown in blue. The false discovery rate (−log10 adjusted p value) is shown for each pathway. The numbers of upregulated and downregulated genes in each pathway are also shown. (B) UMAP plot of the interferon γ response pathway enrichment score for each HD (n = 4) and CGD patient (n = 4), determined with Cell-ID for each cell (see [187]STAR Methods). Red arrow indicates the HSC at the apex of the UMAP. 
(C) Percentage of HSCs presenting a significant interferon γ enrichment score (p < 0.01). (D) MAST identified 1,136 DEGs in the HSCs in each individual CGD patient versus HDs. Heatmap showing the 61 DEGs in the Hallmark interferon γ response signature. Genes identified as predictors in the following model are shown in bold (see [188]Figure S10). The color code shows the logFC for each gene in a patient’s HSCs (versus HDs). (E) Boxplot showing predicted engraftment scores per patient (elastic-net model, see [189]Figure S10), corresponding to the distribution of the 50 mean probabilities per patient for HSCs (test datasets). A probability <50% corresponds to a prediction of engraftment failure, and a probability >50% corresponds to a prediction of engraftment success. (F) The network of predicted interactions between the 51 ISGs and transcription factors in HSCs selected by the elastic-net model as being predictive of the engraftment failure observed in P2 and P5 (see also [190]Figures S10E and S10F). The network was generated using StringDB and clustered with k-means (k = 3); dotted lines show the edges between clusters. (G) Boxplot of the expression by HSCs of 10 representative ISGs (IFI44L, LGALS3BP, LY6E, EIF2AK2, STAT2, ISG15, IRF9, MX1, MX2, and SAMD9L) identified by the elastic-net model as being predictive of engraftment failure in P2 and P5 and shown for each HD (n = 4) and CGD patient (n = 4). Cell-ID allowed us to compare individual cell signatures with well-defined gene sets (such as the Hallmarks collection) associated with particular biological states or processes.[191]^28 The highest IFNγ response scores were found for P2 and P5, especially in most immature HSCs at the apex of the UMAP (red arrow, [192]Figure 5B). In contrast, HSCs in P1 and P4 (the patients with the best correction and engraftment) did not present with significant IFNγ response scores, as for the HDs ([193]Figure 5C). 
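Testing a gene list against a reference gene set, as done here for the Hallmark collection, is commonly framed as a hypergeometric (over-representation) test. The Python sketch below illustrates that test only; it assumes, as a simplification, that each cell is summarized by a fixed-size list of its most specific genes, and every number and name in it is hypothetical rather than taken from the study.

```python
from math import comb


def hypergeometric_enrichment_p(n_background, n_gene_set, n_signature, n_overlap):
    """Upper-tail hypergeometric p value, P(overlap >= n_overlap).

    n_background : number of genes that could have been drawn (assumed universe)
    n_gene_set   : size of the reference gene set (e.g. a Hallmark pathway)
    n_signature  : size of the per-cell gene signature
    n_overlap    : observed number of genes shared by the two lists
    """
    upper = min(n_gene_set, n_signature)
    favorable = sum(
        comb(n_gene_set, i) * comb(n_background - n_gene_set, n_signature - i)
        for i in range(n_overlap, upper + 1)
    )
    return favorable / comb(n_background, n_signature)


# Hypothetical example: a 200-gene cell signature sharing 12 genes with a
# 200-gene reference set, drawn from an assumed 10,000-gene background.
p_value = hypergeometric_enrichment_p(10_000, 200, 200, 12)
print(f"enrichment p value: {p_value:.3g}")
```

A cell would then be called significant for the pathway when this p value (after any multiple-testing correction the analyst chooses) falls below the chosen threshold, such as the p < 0.01 cutoff used in Figure 5C.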
Patient P4 showed a significant IFNγ score but in more committed cells. To further understand these interindividual differences and DEGs, we also performed a MAST analysis for each individual patient’s HSCs. Sixty-one of the DEGs were part of the IFNγ pathway ([194]Figure 5D). The degree of deregulation was higher in P2 and P5 than in P1 and P4, in accordance with the IFN pathway enrichment score, which did not reach the significance level in the HSC population of P1 and P4 ([195]Figure 5C). We hypothesized that a higher level of IFNγ activation could lead to HSC exhaustion. To test this hypothesis and to determine which IFNγ pathway genes and transcription factor genes might contribute significantly to engraftment failure, we took advantage of the large number of samples provided by the single-cell RNA-seq transcriptomic profiling (469 individual patient HSCs). Furthermore, we used a machine learning approach (elastic-net logistic regression) to predict the engraftment ability of each individual cell[196]^29 ([197]STAR Methods and [198]Figure S10). We used Monte Carlo cross-validation with 50 iterations, which demonstrated good predictive power, as shown by the scores for each patient ([199]Figures 5E and [200]S10A–S10D, median accuracy 0.97, median area under the curve 0.99, for 50 cross-validation models). This approach identified a set of 51 IFN genes and transcription factors as being predictive of the engraftment defect in P2 and P5 ([201]Figures 5F, [202]S10E, and S10F). The IFN-stimulated genes (ISGs) included IFI44L, MX1, STAT2, IRF9, and SAMD9L, all of which were significantly upregulated in P2’s and P5’s HSC subpopulation ([203]Figure 5G). In P1 and P4 (patients with successful engraftment), these genes were expressed to the same extent as in HDs or only slightly more. 
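To make the modeling strategy concrete, the following Python sketch fits an elastic-net logistic regression by proximal gradient descent and evaluates it with Monte Carlo cross-validation (50 random train/test splits). It is a minimal illustration on synthetic data, not the authors' pipeline: the simulated "genes", the penalty settings (`lam`, `alpha`), the learning rate, and the 75/25 split are all assumptions chosen for the example.

```python
import math
import random
from statistics import median


def simulate_cells(n_cells=120, n_genes=5, n_informative=2, shift=1.5, seed=1):
    """Synthetic data: label 1 marks a cell from a hypothetical 'failure' patient."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_cells):
        label = i % 2
        center = shift if label else -shift
        X.append([rng.gauss(center if j < n_informative else 0.0, 1.0)
                  for j in range(n_genes)])
        y.append(label)
    return X, y


def soft_threshold(value, threshold):
    """Proximal operator of the L1 penalty (shrinks small weights to zero)."""
    return math.copysign(max(abs(value) - threshold, 0.0), value)


def linear_score(w, b, row):
    return b + sum(wj * xj for wj, xj in zip(w, row))


def fit_elastic_net_logistic(X, y, lam=0.05, alpha=0.5, lr=0.5, n_iter=100):
    """Minimize logistic loss + lam * (alpha * L1 + (1 - alpha) / 2 * L2^2)."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * p, 0.0
        for row, label in zip(X, y):
            z = max(-30.0, min(30.0, linear_score(w, b, row)))  # avoid overflow
            err = 1.0 / (1.0 + math.exp(-z)) - label
            grad_b += err / n
            for j in range(p):
                grad_w[j] += err * row[j] / n
        b -= lr * grad_b  # the intercept is not penalized
        for j in range(p):
            smooth_step = w[j] - lr * (grad_w[j] + lam * (1.0 - alpha) * w[j])
            w[j] = soft_threshold(smooth_step, lr * lam * alpha)
    return w, b


def monte_carlo_cv(X, y, n_splits=50, test_fraction=0.25, seed=0):
    """Repeated random train/test splits; returns one test accuracy per split."""
    rng = random.Random(seed)
    indices = list(range(len(X)))
    n_test = int(len(X) * test_fraction)
    accuracies = []
    for _ in range(n_splits):
        rng.shuffle(indices)
        test, train = indices[:n_test], indices[n_test:]
        w, b = fit_elastic_net_logistic([X[i] for i in train], [y[i] for i in train])
        correct = sum((linear_score(w, b, X[i]) > 0) == (y[i] == 1) for i in test)
        accuracies.append(correct / n_test)
    return accuracies


X, y = simulate_cells()
accuracies = monte_carlo_cv(X, y)
print(f"median accuracy over {len(accuracies)} random splits: {median(accuracies):.2f}")
```

The L1 part of the penalty is what drives uninformative coefficients toward zero, which is why an elastic-net model can double as a feature selector; the genes that keep nonzero weights across the resampled fits are the candidate predictors.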
The model also selected predictive transcription factors, which interacted in a functional protein association network ([204]Figure 5F) linking CEBPB (already identified in P5) with other factors, such as JUND, SREBF1, and MAFG. Taken as a whole, these transcriptomic data identified specific biomarkers in CGD HSCs. Elevated inflammatory pathway activity was predictive of poor engraftment. HSC exhaustion revealed by impaired xenotransplantation of HSPCs from patients with severe CGD To further understand the changes in HSCs associated with defective engraftment in patients with severe CGD, we evaluated xenotransplantation in a humanized NOD-SCID-γc^−/− (NSG) mouse model. The transplanted HSPCs came from P4 and P5, whose IMPs were similar. Using an aliquot of the patient’s IMP, we infused engineered HSPCs into NSG mice (P4, 4.3 × 10^5 cells; P5, 3.5 × 10^5 cells; n = 4 mice per patient). As controls, we infused nontransduced cord blood (CB) (2.7 × 10^5 cells, n = 3) and a sample of MPB transduced with the same protocol as for the IMP (3.7 × 10^5 cells, n = 5). We then analyzed engraftment in the BM and spleen after 16 weeks. The mean level of BM chimerism was 35% in P5 recipients and 60% in P4 recipients; this difference was statistically significant (p = 0.0286; [205]Figure 6A). P5 recipients had a lower absolute human CD45^+ (hCD45) cell count than P4 recipients, although the difference did not reach statistical significance. Human BM HSPCs (defined as CD45^+CD34^+ cells) were also significantly less frequent in P5 recipients than in P4 recipients ([206]Figure 6B). These results were confirmed by a chimerism analysis of the spleen ([207]Figure 6C). Although P4 and P5 recipients had similar levels of gene correction in the IMP (1.73 and 1.58, respectively), the VCN in hCD45 cells from BM and spleen was significantly lower after transplantation, especially for P5 recipients (a mean value of 0.25 versus 1.05 for P4; p = 0.0286) ([208]Figures 6D and 6E). 
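A note on the p values reported for the xenotransplantation comparisons: with four mice per group, p = 0.0286 is the smallest two-sided value that an exact Mann-Whitney test can return (2 of the 70 possible rank assignments), so it indicates complete separation of the two groups rather than a large sample effect. The sketch below enumerates the rank assignments to show this. It assumes the exact, tie-free, two-sided form of the test, which the reported values are consistent with; the function name and the input values are hypothetical.

```python
from itertools import combinations


def exact_mann_whitney_p(x, y):
    """Two-sided exact Mann-Whitney p value by full enumeration (no ties)."""
    pooled = sorted(list(x) + list(y))
    if len(set(pooled)) != len(pooled):
        raise ValueError("this sketch does not handle tied values")
    n, m = len(x), len(y)
    rank = {value: i + 1 for i, value in enumerate(pooled)}
    observed = sum(rank[value] for value in x)
    expected = n * (n + m + 1) / 2  # mean rank sum of group x under H0
    deviation = abs(observed - expected)
    extreme = total = 0
    # Every choice of n ranks out of n + m is equally likely under H0.
    for ranks in combinations(range(1, n + m + 1), n):
        total += 1
        if abs(sum(ranks) - expected) >= deviation:
            extreme += 1
    return extreme / total


if __name__ == "__main__":
    # Made-up, fully separated values: 4 vs 4 and 4 vs 5 animals.
    print(round(exact_mann_whitney_p([1, 2, 3, 4], [5, 6, 7, 8]), 4))
    print(round(exact_mann_whitney_p([1, 2, 3, 4], [5, 6, 7, 8, 9]), 4))
```

The same enumeration with four versus five animals gives 2/126, which matches the p = 0.0159 reported for the comparisons with the MPB control group.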
Similarly, the level of correction (estimated from the gp91^phox protein expression by the hCD45 cells) was significantly lower in P5 recipients than in P4 recipients (p = 0.0286, [209]Figure 6F). Figure 6. Xenotransplantation of the patients’ corrected HSPCs in a humanized mouse model HSPCs from P4’s and P5’s IMPs were infused into NOD-SCID-γc^−/− (NSG) mice (n = 4 per group, VCN[P4] = 1.7, VCN[P5] = 1.6). Nontransduced CB samples (n = 3) and transduced MPB samples (n = 5, VCN[MPB] = 3.47) were used as controls, with the same (clinical) ex vivo cell engineering protocol. Mean ± SD. Engraftments in the BM and spleen were analyzed 16 weeks after transplantation. Mann-Whitney test was performed for the statistical analysis; ns, not significant; ∗p < 0.05. (A) Left, human chimerism (% hCD45^+/[hCD45^+ + mCD45^+]) in the BM 16 weeks after transplantation (P4, n = 4; P5, n = 4; MPB, n = 5; CB, n = 3). The level of chimerism was significantly lower in P5 than in P4 (p = 0.0286), also in comparison with P5/MPB (p = 0.0159) and P4/MPB (p = 0.0159). Right, the number of human CD45^+ cells in the BM was significantly lower in P5 than in MPB (p = 0.0317). (B) Frequency of human CD45^+CD34^+ cells in the BM. The frequency was significantly lower in P5 than in P4 (p = 0.0286) and also in comparison with P5/MPB (p = 0.0159). (C) Left, human chimerism (% hCD45^+/[hCD45^+ + mCD45^+]) in the spleen 16 weeks after transplantation, in the same mice. The level of chimerism was significantly lower in P5 than in MPB (p = 0.0159). Right, the number of human CD45^+ cells in the spleen. The level of chimerism was significantly lower in P5 than in P4 (p = 0.0286). (D) The VCN per cell was measured to assess the level of gene marking in total BM, using a droplet digital PCR (ddPCR) technique. The VCN was significantly lower in P5 than in P4 (p = 0.0286). 
(E) The VCN per cell was measured to assess the level of gene marking in the total spleen, using a ddPCR technique. The VCN was significantly lower in P5 than in P4 (p = 0.0286). (F) The frequency of gp91^phox^+ cells in human CD45^+ BM. The frequency was significantly lower in P5 than in P4 (p = 0.0286) and also in comparison with P5/MPB (p = 0.0159). These in vivo experiments demonstrated that P5’s HSPCs had a lower engraftment ability and gp91^phox expression than their counterparts from P4; this finding was in line with the corresponding clinical outcomes in the GT trial. Taken as a whole, our results showed that P5’s HSPCs presented with a chronic inflammatory profile and molecular alterations that strongly impaired their functional capacity. Discussion Our present results revealed that a severe inflammation score can profoundly alter the HSCs and compromise the effectiveness of GT in patients with CGD. GT remains a potentially curative treatment option for patients with CGD who lack an HLA-compatible donor for HSCT and do not present exacerbated inflammatory markers. We observed the engraftment of gene-corrected cells in two patients (P1 and P4), leading to a complete remission in P4 and to an intermediate level in P1 that was sufficient for clinical benefit. The significant, stable correction of HSPCs has been maintained for more than 4 years now and is correlated with neutrophilic NADPH oxidase activity. In contrast, P2 and P5 progressively lost the corrected cells. This was also observed for four child patients in two other GT clinical trials.[212]^19 The results of an in-depth single-cell transcriptomic analysis in our patients suggested that this defect might be linked to (1) a high inflammation score in the most immature HSCs and (2) the upregulation of specific biomarkers not currently detectable by classical immunophenotyping. 
The fact that HLA-identical HSCT gives excellent outcomes in patients with CGD (i.e., low graft failure and mortality rates) suggests that the HSC niche in the BM microenvironment is not significantly altered. In a multicenter study of allo-HSCT in 712 patients with CGD, the estimated overall survival and event-free survival rates at 3 years were 85.7% and 75.8%, respectively.[213]^10 These results suggest that the HSC niche is mostly normal and that exposure to chronic inflammation caused an intrinsic HSC alteration. Using single-cell transcriptome profiling, we identified specific inflammatory signatures (including IFNγ and IFNα responses) in CGD HSCs and myeloid progenitors (monocyte, dendritic cell, and neutrophil progenitors). Moreover, the two patients with the highest inflammation scores presented a high frequency of myeloid progenitors and a low frequency of immature HSCs. Our results are in line with the increased myeloid differentiation observed in response to various inflammatory emergency signals, such as IL-1,[214]^30 IFN, and lipopolysaccharide.[215]^31^,[216]^32^,[217]^33 Indeed, microbial infections and other stimuli (e.g., metabolic stress) can drive HSCs out of dormancy and favor proliferation and myeloid differentiation (facilitating the host’s defense). This emergency granulopoiesis is initiated by the key transcription factor CEBPβ.[218]^34 Signaling through G-CSF and STAT3 can induce a switch from CEBPα-dependent steady-state granulopoiesis to CEBPβ-dependent emergency granulopoiesis.[219]^35 The impact of persistent inflammation has also been reported in a mouse model of X-CGD, with greater HSC proliferation and differentiation toward myeloid lineages.[220]^20 P2 displayed an expansion of both MonoDC progenitors and B cell progenitors, which share a number of markers. 
These findings are reminiscent of the pro-B cell progenitor expansion that occurs after IFN stimulation[221]^36 and the reprogramming of myeloid lineages in a context of inflammation.[222]^37^,[223]^38 Even though acute inflammation can be beneficial, it is known that long-lasting, chronic inflammation strongly impairs HSC function. Whereas acute treatment with IFNα can promote the proliferation of murine HSCs, chronic IFNα activation compromises the HSCs’ repopulating activity.[224]^39 Several studies have shown that elevated IFN signaling in chronic infection is the primary cause of HSC exhaustion and depletion.[225]^39^,[226]^40^,[227]^41^,[228]^42^,[229]^43 A pathway analysis of the most immature HSCs indicated that the higher IFN score in P2 and P5 might be responsible for the loss of gene-corrected cells and for HSC exhaustion. In contrast, the intermediate level of IFN pathway activity in P1 and P4 might have helped to maintain a beneficial response and the HSCs’ repopulating ability. By taking advantage of regularized logistic regression and the large number of cells provided by single-cell analyses, we identified a set of 51 IFN genes and transcription factors that were upregulated specifically in P2 and P5 (including IFI44L, STAT2, IRF9, MX1, SAMD9L, and CEBPB) and that appeared to be predictive of defective gene-modified HSC engraftment. These transcriptomic alterations and biomarkers appear to be specific to the HSC compartment, as no significant differences in the inflammation score were observed in PBMCs, while they were detected in the global HSPC population. In view of clinical applicability, the targeted gene expression analysis of the predictive genes on the HSPC population should provide a reliable test to assess HSC fitness before patient enrollment. 
It has been shown that IFN pathway activation in HSCs involves STAT1 and IRF9 signaling pathways[230]^44 by forming the DNA-binding STAT1-STAT2-IRF9 ternary complex ISGF3, which then activates ISGs.[231]^45 The strong activation of the IFN pathway observed in patients with CGD resulted in marked overexpression of STAT1, STAT2, and IRF9 genes, especially in P2. This patient displayed a high frequency of monocyte/dendritic cell progenitors with a strong inflammatory profile but also the upregulation of several stress-induced factors (such as JUND or SREBF1) in HSCs, which might have been responsible for the functional defects.[232]^46^,[233]^47 This situation was reminiscent of HSC exhaustion through chronic IFN pathway activation.[234]^39 Based on the strong activation of the Jun/Fos pathway, we cannot exclude the synergistic contribution of additional inflammatory pathways participating in HSC exhaustion, such as the TLR4/TRIF pathway.[235]^33 P5 had a large neutrophil progenitor population and aberrant expression of CEBPB very early in the HSC differentiation process. G-CSF mobilization is known to induce CEBPβ in myeloid progenitors[236]^34^,[237]^35 and therefore could contribute to the exacerbated myeloid skewing in patient P5. However, G-CSF did not trigger similar skewing in P1 and P4, and the use of plerixafor alone as the mobilizing agent in P2 was not sufficient to avoid HSC exhaustion, suggesting that the main cause lies in the HSC-intrinsic alteration driven by chronic inflammation. The epigenetically inscribed infection history is known to make HSCs more responsive to secondary stimulation.[238]^48 However, chronic lipopolysaccharide stimulation drives HSC exhaustion and dysfunction.[239]^49 We hypothesize that in P2 and P5, chronic IFN stimulation epigenetically blocked HSCs in an aberrant state and thus drove exhaustion. 
One of the downstream markers observed in both patients was sterile α motif domain-containing protein 9-like encoded by SAMD9L, an ISG-induced gene in which mutations are known to predispose to pancytopenia and myeloid malignancies. Indeed, gain-of-function (GOF) mutations in SAMD9L have been reported in people with ataxia pancytopenia syndrome.[240]^50 The antiproliferative effect of these GOF mutations led to greater DNA damage and apoptosis (responsible for BM hypocellularity). A secondary mutation (monosomy 7) would favor the development of myelodysplastic syndromes.[241]^51 Enhanced expression of SAMD9L (correlating with the higher IFN scores in P2 and P5) might therefore contribute to HSC exhaustion in a context of chronic IFN activation. Kohn et al. reported a higher frequency of stable correction and engraftment, for which there are no obvious clues to explain the difference with the present study.[242]^19 Engraftment failure does not seem to correlate with patient age, since in our case it was mainly observed in adult patients. To be noted is their higher frequency of the use of fresh cells compared with cryopreserved cell products that could constitute an additional stress factor for HSCs, already compromised by the chronic inflammation. The sometimes poor transduction ability in CGD HSPCs also prompted us to optimize the transduction procedure by adding PGE2; this adjuvant is known to favor HSC homing, survival, proliferation, and repopulation ability.[243]^22^,[244]^52^,[245]^53 PGE2’s pro-inflammatory role during vasodilatation, vascular leakiness, and pain has long been known,[246]^54 but this compound can also mediate anti-inflammatory effects,[247]^55 as also shown in our transcriptomic analysis. Despite its indubitable benefit, this short course of PGE2 was not enough to counter the HSC alteration induced by chronic inflammation in P2 and P5. 
Moreover, PGE2 does not completely restore transduction efficiency, which is lower than in HDs, probably due to the upregulation of genes encoding restriction factors like MX1, MX2, and IFITM3.[248]^56^,[249]^57 The expression of these factors by HSCs during an innate immune response inhibits lentiviral entry but can be overcome by exposure to cyclosporine H[250]^58 or other transduction enhancers that are currently being investigated. This aspect might be important in the further development of GT in the context of inflammatory diseases. Chronic inflammation in CGD might eventually favor the emergence of mutated clones with a proliferative advantage; in turn, this might lead to tumor events[251]^59 and so further highlights the need to control hyperinflammation. The impaired repopulating ability of CGD HSCs has been previously reported in a mouse model of X-CGD exposed to a high IL-1 concentration. Pre-treatment of X-CGD mice with anakinra (an IL-1R antagonist) improves HSC engraftment.[252]^20 More recently, p38MAPK (a downstream target of IL-1β) was identified in a CRISPR-Cas9 screening step as a druggable target for increasing HSC engraftment. Ex vivo culture of CGD HSPCs in the presence of a p38MAPK inhibitor increased chimerism significantly (1.5-fold).[253]^60 Inhibition of the JAK/STAT pathway would be another way to target the hyperactivated IFN pathway. Given that several studies have described encouraging results for JAK1 inhibition in type I interferonopathies,[254]^61^,[255]^62^,[256]^63 this approach could also be considered for controlling inflammation before HSPC harvesting in patients with CGD and thus avoiding HSC exhaustion. Whether ex vivo treatments are sufficient to improve the HSC engraftment rate, or whether inflammation must be controlled in vivo before HSPC harvesting, is currently under investigation. 
Together with the results of a recently published study on GT for CGD, our present findings show that GT is a potentially curative treatment option in patients with CGD lacking an HLA-compatible donor. However, the specific clinical and cellular characteristics of good candidates for GT (notably with regard to the inflammatory background) need to be taken into account. Our present study identified an IFN-pathway-related transcriptional signature that was specific to HSCs from patients with poor engraftment. The present results might open the way to (1) specific anti-inflammatory treatments for patients prior to HSPC harvesting, (2) the optimization of ex vivo HSPC engineering, and (3) identification of predictive biomarkers for validating the GT product prior to infusion. Limitations of the study Statistical power to understand the loss of gene-corrected cells has been obtained thanks to single-cell RNA-seq and machine learning approaches on hundreds of individual cells. Still, exploring the efficiency of new therapeutics in rare immune deficiencies remains highly challenging due to low patient sample size. The 51 identified predictive markers will be useful for tracking patient HSPC status. Further HSPC transcriptomic profiling on additional CGD patients along with immunosuppressive treatments (such as JAK inhibitors) will provide confirmation of the most relevant biomarkers. 
STAR★Methods Key resources table REAGENT or RESOURCE SOURCE IDENTIFIER Antibodies __________________________________________________________________ CD3 antibody (clone BW264/56) FITC Miltenyi Biotec Cat#130-080-401; RRID: AB_244231 CD56 antibody (clone B159) APC BD Biosciences Cat# 555518; RRID: AB_398601 CD19 antibody (clone LT19) PE Miltenyi Biotec Cat# 130-091-247; RRID: AB_244223 StraightFrom® Whole Blood CD15 MicroBeads, human Miltenyi Biotec Cat# 130-091-058 CD14 antibody (clone M5E2) PercPCy5.5 BD Biosciences Cat# 550787; RRID: AB_393884 CD2 antibody FITC BD Biosciences Cat# 347404; RRID: AB_2868849 CD3 antibody FITC BD Biosciences Cat# 345763; RRID: AB_2811220 CD4 antibody FITC BD Biosciences Cat# 345768; RRID: AB_2868797 CD8 antibody FITC BD Biosciences Cat# 345772; RRID: AB_2868800 CD14 antibody FITC BD Biosciences Cat# 345784; RRID: AB_2868810 CD15 antibody FITC BD Biosciences Cat# 332778; RRID: AB_2868627 CD16 antibody FITC BD Biosciences Cat# 335035; RRID: AB_2868680 CD19 antibody FITC BD Biosciences Cat# 345776; RRID: AB_2868804 CD20 antibody FITC BD Biosciences Cat# 345792; RRID: AB_2868818 CD33 antibody FITC BD Biosciences Cat# 345798; RRID: AB_2868822 CD56 antibody FITC BD Biosciences Cat# 345811; RRID: AB_2868832 CD235a antibody FITC BD Biosciences Cat# 559943; RRID: AB_397386 CD133 antibody (clone 293C3) PE Miltenyi Biotec Cat# 130-090-853; RRID: AB_244346 CD34 antibody (clone 581) PECY7 Beckman Cat# A21691 CD38 antibody (clone HIT2) APC BD Biosciences Cat# 555462; RRID: AB_398599 CD45RA antibody (clone T6D11) APCVIO770 Miltenyi Biotec Cat# 130-096-604; RRID: AB_2660986 CD90 antibody (clone 5E10) BV421 BD Biosciences Cat# 562556; RRID: AB_2737651 CD45 human antibody (clone HI30) BV421 Sony Biotechnology Cat# 2120160 CD34 antibody (clone 581) APC-Cy7 Sony Biotechnology Cat# 2317570 gp91^phox antibody (clone 7D5) FITC MBL Bio Cat# D162-4; RRID: AB_591390 CD45 murin antibody (clone 30-F11) APC BD Biosciences Cat# 559864; RRID: AB_398672 
__________________________________________________________________ Biological samples __________________________________________________________________ Gene Therapy patient peripheral blood sample Necker’s Hospital Biotherapy Clinical Investigation Center Mobilized peripheral blood HemaCare Cord blood Saint Louis’s Hospital __________________________________________________________________ Chemicals, peptides, and recombinant proteins __________________________________________________________________ SCF CellGenix 1018-050 FLT3-L CellGenix 1015-050 TPO CellGenix 1017-050 IL-3 CellGenix 1002-050 PGE2 Prostin E2 10mg/ml Pfizer X-Vivo 20 medium Lonza BESP1058F Sulfate de protamine, Protamine Choay 10 000UAH- 10ml Sanofi Busulfan Sigma B1170000 __________________________________________________________________ Critical commercial assays __________________________________________________________________ DNeasy kit Qiagen RNeasy micro kit Qiagen Chromium Single Cell 3’ GEM kit v3 10X Genomics __________________________________________________________________ Experimental models: Organisms/strains __________________________________________________________________ Mouse: NSG NOD.Cg-Prkdcscid Il2rgtm1WjI/SzJ The Jackson Laboratories Strain code 614 __________________________________________________________________ Oligonucleotides __________________________________________________________________ Primer Alb Forward GCTGTCATCTCTTGTGGGCTGT Primer Alb Reverse ACTCATGGGAGCTGCTGGTTC Probe Alb VIC- CCTGTCATGCCCACACAAATCTCTCC -TAMRA Primer HIV Forward CAGGACTCGGCTTGCTGAAG Primer HIV Reverse TCCCCCGCTTAATACTGACG Probe HIV FAM- CGCACGGCAAGAGGCGAGG -TAMRA Primer Alb Forward GCTGTCATCTCTTGTGGGCTGT Primer Alb Reverse ACTCATGGGAGCTGCTGGTTC Probe Alb VIC-CCTGTCATGCCCACACAAATCTCTCC-QSY Primer HIV Forward TCCCCCGCTTAATACTGACG Primer HIV Reverse CAGGACTCGGCTTGCTGAAG Probe HIV FAM-CGCACGGCAAGAGGCGAGG-IowaBlackFQ __________________________________________________________________ Deposited 
data __________________________________________________________________ Bulk RNAseq data Biostudies, EMBL-EBI [257]https://www.ebi.ac.uk/biostudies/studies/S-BSST958 Single cell RNAseq data Biostudies, EMBL-EBI [258]https://www.ebi.ac.uk/biostudies/studies/S-BSST959 Full code and post-processed single cell data Zenodo [259]https://doi.org/10.5281/zenodo.6580036 __________________________________________________________________ Software and algorithms __________________________________________________________________ FlowJo software (version 10.8) FlowJo [260]https://www.flowjo.com Prism software (version 9) GraphPad [261]https://www.graphpad.com R Studio Software (version 4.0.4) R Core Team [262]https://www.R-project.org/ Cell Ranger (version 3.0.2) 10X Genomics [263]https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/latest/installation Cell-ID method Github [264]https://github.com/RausellLab/CelliD __________________________________________________________________ Others __________________________________________________________________ CliniMACS system Miltenyi Biotec FACSCanto BD Biosciences Viia 7 Applied Biosystems Resource availability Lead contact Further information and requests for resources and reagents should be directed to and will be fulfilled by the lead contact, Marina Cavazzana ([266]m.cavazzana@aphp.fr). Materials availability This study did not generate new unique reagents. Experimental model and subject details Study design and investigational therapy The study, sponsored by Genethon and registered under [267]NCT02757911, is a phase I/II non-randomized monocentric open-label study conducted at Hôpital Necker-Enfants Malades in Paris under the responsibility of Pr. S. Blanche. Inclusion and exclusion criteria are detailed in the protocol synopsis (see [268]Data S1, [269]supplemental information). 
In brief, eligible participants were male patients with X-CGD, aged 23 months and older, with molecular diagnosis including DNA sequencing and absent or substantially reduced (>70%) biochemical activity of NADPH-oxidase, and absence of a 10/10 human leukocyte antigen-matched donor (sibling or unrelated). The primary objectives included the evaluation of safety and of efficacy by the biochemical and functional reconstitution of the progeny of engrafted cells and the stability of these effects at 12 months. The secondary objectives included clinical efficacy and longitudinal evaluation of augmented immunity against bacterial and fungal infections; assessment of hematopoietic stem cell transduction by the G1XCGD lentiviral vector and engraftment of the gene-modified cells. The study aimed to include up to 5 patients and the data analysis was planned to be mostly descriptive (see Consort diagram, [270]Data S2, [271]supplemental information). During the study, a Data Safety Monitoring Board (DSMB) reviewed the data on a regular basis. The study, still ongoing, includes a follow-up of 2 years to evaluate safety and immune reconstitution parameters, followed by a 3-year period to assess the long-term safety and efficacy. The protocol and informed consent documents were reviewed and approved by the French institutional ethical review board « Ile-de-France V » and by the French National Agency for the Safety of Medicines and Health Products (ANSM) following an agreement from the French Ministry of Research and Innovation for the use of genetically-modified organisms. The drug product consists of autologous CD34^+ cells transduced with the G1XCGD lentiviral vector. Following informed consent and eligibility confirmation, CD34^+ cells were recovered from bone marrow or by apheresis, were transduced ex vivo by the G1XCGD lentiviral vector and were frozen until administration to the patient. Patients received busulfan myeloablative conditioning with pharmacokinetic monitoring. 
After a wash-out period of 24-48 hours the drug products were infused intravenously (i.v.) through a central venous line. Patient vital signs and clinical condition were monitored closely during and after the infusion for adverse reactions. Patients All the patients were treated in the Pediatric Immunohematology Department or the Adult Hematology Department at Necker Children's Hospital (Paris, France). The IMP was manufactured in the hospital's Cell and Gene Therapy Laboratory at the Biotherapy Department. The follow-up included regular patient visits in the Center of Clinical Investigations at Necker Enfants Malades Hospital and laboratory tests. This included clinical status assessment, adverse event recording, immune cell hematological reconstitution, gene marking in cell subpopulations (VCN analyses), gp91^phox expression in specific cell subsets, and the DHR oxidative burst assay used to assess the activity of NADPH oxidase. Additional cell characterization assays were performed on an ad hoc basis. Healthy donors Mobilized peripheral blood (MPB) samples from healthy donors were provided by HemaCare (Northridge, CA, USA). CD34^+ cells were mobilized with G-CSF and plerixafor (for HD1-2) or with plerixafor only (for HD3-4). HD5-7 were mobilized with G-CSF, and the CD34^+ cells were harvested and immunoselected in the Department of Biotherapy at Necker Children's Hospital. The HDs provided written, informed consent to the use of their samples for research purposes, and their data were anonymized. No nominative data concerning the donor were sent to the investigators. Cord blood was obtained from a biological resource center (Centre Ressources Biologiques (CRB)) – Banque de Sang de Cordon) at Saint-Louis Hospital (Paris, France). HSPCs were isolated using standard Ficoll density gradient centrifugation and then magnetic selection with anti-CD34^+ antibody. 
Blood samples from HD8-12 were obtained from the French Blood Establishment (Etablissement Français du Sang, Paris, France; reference: C CPSL UNT-N°18/EFS/032). Again, the HDs provided written, informed consent to the anonymous use of their samples for research purposes. PBMCs were isolated using standard Ficoll density gradient centrifugation. Mouse experiments and xenotransplantation assays All animal procedures were approved by the animal care and use committee at the University of Paris (Paris, France; February 16^th, 2021) and the French Ministry of Agriculture (APAFIS#29592-2020120216106476). The procedures were performed in accordance with European Union (EU) Directive 2010/63/EU. NOD-SCID-γc^-/- strain (NSG) mice were obtained from Charles River Laboratories. 3.5 to 4.2x10^5 engineered HSPCs from a patient’s IMP were injected into 16 NSG mice previously conditioned with one dose per day of busulfan at 15 mg/kg (45 mg/kg in total). Engraftment in BM, spleen and thymus was analyzed after 16 weeks, using flow cytometry. The antibodies used are described in the [272]key resources table. Culture conditions for the MPB control were the same as in the clinical trial: 18 h of pre-activation in a cytokine cocktail (SCF: 300ng/ml, FLT3L: 300ng/ml, TPO: 100ng/ml, IL3: 20ng/ml), addition of 10 μM PGE2 2h before transduction, and then 2 rounds of transduction with the gp91^phox clinical vector. The VCN in the MPB control sample (3.47) was measured in a ddPCR assay (see below). The cells from the CB control were not transduced or cultured. The VCN in the mice was assessed by ddPCR assay on a Bio-Rad QX200 ddPCR System. Quantitative PCR on droplets was performed using TaqMan PCR Master Mix for probes in an Applied Biosystems SimpliAmp thermocycler, using a standard protocol. 80 ng of total gDNA, 900 nM primers and 250 nM probes were used in a total volume of 17 μL for absolute quantification with the droplet reader. 
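In droplet digital PCR, target concentrations are usually derived from the fraction of positive droplets through a Poisson correction, and a vector copy number per diploid cell is then obtained as twice the ratio of vector (PSI) to albumin (ALB) copies, because each cell carries two ALB alleles. The following minimal sketch illustrates that arithmetic. It assumes a duplex assay in which both probes are read from the same droplets; the function names and droplet counts are hypothetical and are not taken from the study.

```python
import math


def copies_per_droplet(positive, total):
    """Poisson-corrected mean number of target copies per droplet.

    With a fraction p of positive droplets, lambda = -ln(1 - p).
    """
    if total <= 0 or not 0 <= positive < total:
        raise ValueError("need 0 <= positive < total droplets")
    return -math.log(1.0 - positive / total)


def vcn_per_cell(psi_positive, alb_positive, total_droplets):
    """VCN = (PSI x 2) / ALB, with both targets counted in the same droplets."""
    psi = copies_per_droplet(psi_positive, total_droplets)
    alb = copies_per_droplet(alb_positive, total_droplets)
    if alb == 0:
        raise ValueError("no ALB-positive droplets: VCN is undefined")
    return 2.0 * psi / alb


if __name__ == "__main__":
    # Made-up droplet counts for illustration only.
    print(round(vcn_per_cell(psi_positive=1500, alb_positive=3000,
                             total_droplets=20000), 2))
```

The Poisson step matters most when many droplets are positive, since a single droplet can then contain more than one template copy.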
The primer and probe sequences used are listed in the key resources table. Fluorescence in PCR-positive droplets was quantified according to a Poisson distribution, and VCNs were calculated as (PSI × 2)/ALB. Method details G1XCGD lentiviral vector production The lot of clinical-grade G1XCGD lentiviral vector was manufactured at Yposkesi (Evry, France) under good manufacturing practices using the same processes as for lots produced for other studies.[273]^19 Briefly, the G1XCGD vector was produced by transient transfection of 293T cells with plasmids encoding the G1XCGD transfer cassette, HIV-1 gag/pol, HIV-1 rev and the VSV-G glycoprotein. The viral particles were purified from culture supernatants following clarification, ion exchange chromatography, tangential flow filtration and gel filtration steps, formulated in X-Vivo 20 medium (Lonza), aliquoted and cryopreserved at < -70°C. The lot of vector used to treat patients had a titer of 3.7x10^9 infectious genomes (IG) ml^-1 measured at Genethon by qPCR using HCT116 colon carcinoma cells, and 2.1x10^4 HIV1 P24 core antigen (P24) ml^-1 as measured by ELISA assay. The lot of vector tested negative for replication-competent lentivirus and conformed to all release specifications. G1XCGD lentiviral vector-modified CD34^+ cell product manufacture Cells were manufactured onsite at the Cell and Gene Therapy Laboratory in the Biotherapy department of Necker-Enfants Malades Hospital in Paris. Patients’ cells were obtained from apheresis procedures after a mobilization regimen based on G-CSF and plerixafor, and when appropriate from bone marrow (BM) harvest under general anaesthesia. A back-up harvest of at least 3x10^6 unmanipulated CD34^+ cells per kg was retained in case of failure of hematopoietic reconstitution following gene therapy. 
As repetitive G-CSF mobilizations could have a detrimental effect on HSC fitness,[274]^64^,[275]^65 after two unsuccessful collections and engineering of P2 HSPC, a Plerixafor-only mobilization was performed for the third apheresis to avoid further detrimental effects of G-CSF treatment. The transduction procedure is summarized in [276]Figure 1A. Briefly, CD34^+ cells were immunoselected using the CliniMACS system (Miltenyi Biotec). The selected CD34^+ cells were pre-activated into cell culture in serum-free medium (X-vivo20, Lonza) with recombinant human cytokines (stem cell factor at 300 ng/ml, flt-3 ligand at 300 ng/ml, thrombopoietin at 100 ng/ml and interleukin-3 at 20 ng/ml, CellGenix). On the next two successive days, the G1XCGD lentiviral vector was added to the cell culture at a final concentration of 1x10^8 IG.ml^-1. PGE2 was added as transduction adjuvant for the manufacturing of P2 (third manipulation), P4 and P5 IMP. The following day the cells were removed from culture, washed and cryopreserved, to allow the release quality controls. The IMP was infused after targeted myeloablative conditioning with a minimum CD34^+ cell dose to be administered of 3x10^6 cells/kg body weight. Determination of the VCN Genomic DNA was extracted from HSPCs in the IMP (14 days after transduction) and during the follow-up from sorted neutrophils, monocytes, T cells, B cells and NK cells and on total PBMCs using a DNeasy Kit (Qiagen). The VCN was determined in a quantitative PCR assay (Viia 7, Applied Biosystems) and the PSI and ALB human probes (see [277]key resources table). The DHR assay Neutrophils were stimulated with phorbol myristate acetate to induce superoxide anion production, according to the manufacturer’s protocol. The non-fluorescent dye DHR is reduced by H[2]O[2] and then converted into fluorescent rhodamine, which is quantified using flow cytometry. 
Isolation of mononuclear cells In line with the trial protocol, BM and/or MPB was collected for IMP manufacturing and peripheral blood was sampled regularly during the follow-up period. Mononuclear cells were isolated from PB or MPB using standard Ficoll density gradient separation. The absolute lymphocyte count was determined using Trucount Tubes (BD Biosciences). Flow cytometry The neutrophil subpopulation was purified from PB on a column with magnetic beads and fluorochrome-coupled anti-CD15 antibodies. Monocytes, T cells, B cells and NK cells were sorted on a cell sorter (FACSAria II, BD Biosciences), using fluorochrome-coupled antibodies against CD14, CD3, CD19, and CD56. Total PBMCs were surface-stained for gp91^phox, using an anti-flavocytochrome b558 7D5 clone (human) mAb-FITC (MBL Bio) and gating for neutrophils. The patients' HSPCs were characterized using a multilabeled panel with the antibodies listed in the [278]key resources table: lineage cocktail, CD34, CD133, CD38, CD90, and CD45RA. Staining was analyzed with a FACSCanto II cell analyzer. Analysis of vector integration sites Integration sites were identified in PBMC and neutrophil samples, using the S-EPTS/LM-PCR protocol, an advanced version of EPTS/LM-PCR,[279]^66 and thereafter analyzed using the GENE-IS tool suite.[280]^67 Bulk RNA-seq RNA was isolated using the RNeasy Micro Kit (Qiagen) with a DNase step. RNA integrity and concentration were assessed using capillary electrophoresis and the Fragment Analyzer (Agilent). RNA-seq libraries were prepared from 100 ng of total RNA, using the Universal Plus mRNA (Nugen-Tecan). The amplified cDNA produced was sequenced on a NovaSeq6000 system (Illumina). There were ∼50 million reads per library. The raw read counts were normalized with the DESeq2 package, based on the library size, and differential expression between conditions was tested.[281]^68 Coding genes were extracted from gencodeV30. 
Next, before the pathway enrichment analysis, a noise filter was applied to retain only genes with an expression value greater than 20 in at least one sample. Normalized enrichment scores were calculated for all deregulated coding genes, using GSEA software.[282]^69 Gene set enrichment was investigated with MSigDB, using a hypergeometric test on a pre-filtered dataset (p<0.05 and fold-change (FC) >1.2 or <1/-1.2). The output false discovery rate had to be below 0.05. Representation and quantification of module activity (ROMA) was applied to DEGs in PBMCs and HSPCs. ROMA calculates a module score for a set of samples. It is based on the simplest single-factor linear model of gene regulation, in which the first principal component approximates the expression data of the gene set.[283]^23

Single-HSPC RNA-seq

Library preparation

Frozen HSPCs from each individual were thawed and resuspended in PBS + 1% BSA. The cell preparation was loaded onto a Chromium Single-Cell Chip (10x Genomics) for co-encapsulation with barcoded Gel Beads at a target capture rate of ∼7000 individual cells per sample. Captured mRNAs were barcoded during cDNA synthesis, using the Chromium Single-Cell 3' reagents v3 (10x Genomics) according to the manufacturer’s instructions. All samples were processed simultaneously with the Chromium Controller (10x Genomics), and the resulting libraries were prepared in parallel in a single batch. All the libraries were pooled for sequencing in a single SP Illumina flow cell. Libraries were sequenced on an Illumina NovaSeq 6000 (Illumina) with 28 read 1 cycles (containing the cell-identifying barcodes and unique molecular identifiers (UMIs)), 8 i7 index cycles, and 91 read 2 cycles (containing the transcript sequences). Sequencing reads were demultiplexed and aligned to the human reference genome (GRCh38), using the CellRanger pipeline v3.1. The pipeline for processing and analyzing the scRNAseq data is shown in [284]Figure S6.
Integration and data pre-processing

Empty droplets were excluded with the DropletUtils package, with an FDR threshold of 0.01. Cells with more than 15% mitochondrial genes and cells with fewer than 3000 UMIs were removed. As HSPCs differ in their maturity (which translates into differences in expression abundance from one cell to another), the expression matrix for each sample was normalized using deconvolution rather than standard library size methods.[285]^70 The expression matrix was then restricted to protein-coding genes. The 6000 most highly variable genes were identified with the Seurat FindVariableFeatures function and its default parameters.

Quantification and statistical analysis

Single-HSPC RNA-seq

Cell-ID annotation of individual cells, using HSPC reference signatures

Cell-ID is a robust statistical method for gene signature extraction and cell identity recognition on the basis of single-cell RNA-seq data.[286]^25 It incorporates a multiple correspondence analysis and simultaneously represents cells and genes in a low-dimensional space. The genes are then ranked by their Euclidean distance from each individual cell, which provides unbiased per-cell gene signatures. Using published data,[287]^26 Cell-ID, and the 200 most specific genes, we extracted 16 reference signatures: HSC, MPP, MLP, ImP1, ImP2 (corresponding to common myeloid progenitors), NeutroP0, NeutroP1, NeutroP2, NeutroP3, MonoDCP (corresponding to granulocyte-monocyte progenitors), BcellP, MEP1, MEP2, EryP, MkP, and EoBasMastP. The Cell-ID method defines the gene ranking in each cell in the dataset (53,412 cells in total), evaluates whether a cell accurately matches a particular reference signature, and determines the cell's identity (Cell ID) on the basis of the most significant p-value (p<0.01) ([288]Figure S8A). The enrichment score is defined as -log10(p-value).
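The identity assignment rule described above (most significant p-value, accepted only if p<0.01, with a score of -log10(p-value)) can be illustrated with a minimal Python sketch. This is not the Cell-ID implementation used in the study: the function name, the "unassigned" label, and the example p-values are hypothetical and serve only to make the rule concrete.

```python
import math

def assign_identity(pvalues, alpha=0.01):
    """Assign one cell to the reference signature with the smallest p-value.

    pvalues: dict mapping a signature name to its enrichment p-value
             (hypothetical input format).
    Returns (label, score), where score = -log10(best p-value) and label is
    "unassigned" when the best p-value does not pass the alpha threshold.
    """
    label, p = min(pvalues.items(), key=lambda kv: kv[1])
    # Guard against log10(0) for p-values that underflow to zero.
    score = -math.log10(max(p, 1e-300))
    if p >= alpha:
        return "unassigned", score
    return label, score

# Toy p-values for two cells (illustrative numbers, not study data).
cell_a = {"HSC": 1e-6, "MPP": 0.003, "MEP1": 0.4}
cell_b = {"HSC": 0.2, "MPP": 0.05, "MEP1": 0.6}

label_a, score_a = assign_identity(cell_a)
label_b, score_b = assign_identity(cell_b)
```

In this toy example, the first cell is labeled as an HSC, whereas the second cell remains unassigned because none of its p-values is below 0.01.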
HSC identification and the diffusion map

Since the annotated population was enriched in HSCs (30% of all HSPCs), we combined a diffusion map analysis, to determine the differentiation trajectory ([289]Figure S7A), with an analysis of the enrichment strength for Velten et al.'s HSC signature, to determine the origin of the diffusion map ([290]Figure S7B) and isolate the most immature HSC subpopulation, which corresponded to 3% of all HSPCs ([291]Figure S7C). The other annotated HSCs are referred to as "HSC-enriched". The "All HSC" subpopulation includes the most immature HSCs and the HSC-enriched subpopulations. Following cell-type annotation, samples were integrated with the Harmony package, using the first 30 principal components as input and the default parameters.

The Cell-ID score for signaling pathway enrichment

The Cell-ID method was used to assess the statistical enrichment of individual-cell gene signatures vs. signaling pathway gene sets (such as the Hallmark gene sets, MSigDB collections, v7.5.1), based on hypergeometric test p-values with Benjamini–Hochberg correction for the number of tested gene signatures. Enrichment scores were calculated as the -log10(p-value) in the test. A cell was considered to be enriched in a given pathway when the score was >2 (p<0.01).

Cell-ID identification of mixed signatures, and UpSet plot representation

To further understand the heterogeneity and diversity of cell states, we took advantage of the Cell-ID enrichment system to identify cells that were significantly enriched (p<0.01) for several reference signatures. These cells were then represented on an UpSet plot with the UpSetR package, either for all labels ([292]Figure S8D) or for selected labels such as NeutroP0, BcellP, MonoDCP, MPP, All HSC, and others ([293]Figure 3).
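The enrichment scoring and the counting of mixed signatures described above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the Cell-ID or UpSetR code: the function names and toy gene sets are hypothetical, and the Benjamini–Hochberg correction is assumed to be applied within each cell across the tested gene sets.

```python
import math
from collections import Counter

def hypergeom_sf(k, N, K, n):
    """P(X >= k) when n genes are drawn from N, of which K belong to the set."""
    total = math.comb(N, n)
    hits = sum(math.comb(K, i) * math.comb(N - K, n - i)
               for i in range(k, min(K, n) + 1))
    return hits / total

def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def enriched_labels(cell_signature, gene_sets, n_background, threshold=2.0):
    """Gene sets whose score -log10(adjusted p) exceeds the threshold."""
    names = sorted(gene_sets)
    pvals = [hypergeom_sf(len(cell_signature & gene_sets[name]), n_background,
                          len(gene_sets[name]), len(cell_signature))
             for name in names]
    adjusted = benjamini_hochberg(pvals)
    return frozenset(name for name, p in zip(names, adjusted)
                     if -math.log10(max(p, 1e-300)) > threshold)

def count_combinations(label_sets):
    """Count cells per combination of labels (the input of an UpSet plot)."""
    return Counter(label_sets)

# Toy data: 100 background genes, two gene sets, one per-cell signature.
gene_sets = {"A": {f"g{i}" for i in range(10)},
             "B": {f"g{i}" for i in range(50, 60)}}
signature = {f"g{i}" for i in range(8)} | {"g90", "g91"}
labels = enriched_labels(signature, gene_sets, n_background=100)
combos = count_combinations([labels, frozenset({"A", "B"}), labels])
```

Each distinct combination of labels returned by `count_combinations` corresponds to one intersection bar in an UpSet plot.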
Identification of deregulated genes with MAST

To identify DEGs in HSC subpopulations, we used the MAST approach[294]^27 ([295]https://github.com/RGLab/MAST), which is based on statistical models tailored to single-cell data and allows inference for genes with sparse expression. These models can handle a more complex variance structure, such as the expected correlations between cells derived from the same individual. DEGs were identified with a hurdle model (implemented with MAST v1.16.0) by testing the 6000 highly variable genes in the dataset and adjusting for a cellular detection rate parameter that corresponds to the number of genes detected in a cell. For the pathway enrichment analysis, we applied a hypergeometric test using the Hallmark gene sets from the MSigDB database.

Elastic-net logistic regression for the identification of predictive markers

We used an elastic-net logistic regression model[296]^29^,[297]^71^,[298]^72 with the glmnet package to predict whether or not HSCs would engraft. We constructed the model at the cell level and defined each cell's engraftment capacity according to the patient from whom it came (success for the 304 HSCs of P1 and P4, and failure for the 165 HSCs of P2 and P5). The genes of interest used as variables were the 239 pre-selected IFNa genes, IFNg genes, and transcription factors differentially expressed in at least one patient ([299]Figure S10). We performed cross-validation using the caret package to determine the optimal lambda (6.87e-05) and alpha (0.7) parameters on a training dataset composed of 75% of the patients’ HSCs. To check the stability of the results, we then performed a Monte Carlo cross-validation with the tuned parameters by randomly splitting the HSCs into a training set (75% of the cells) and a test set (the remaining 25%) 50 times. We obtained AUC values between 0.98 and 1, and accuracy values between 0.95 and 1 ([300]Figures S10A–S10C).
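The Monte Carlo cross-validation described above can be sketched in plain Python as follows. This is a minimal illustration, not the glmnet/caret workflow used in the study: the proximal gradient optimizer, the function names, the lambda value, and the synthetic three-gene dataset are all assumptions. The sketch also records how often each gene receives a non-zero coefficient across the random splits.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_elastic_net_logistic(X, y, lam, alpha, lr=0.1, n_iter=150):
    """Proximal gradient descent on the mean log-loss plus the penalty
    lam * ((1 - alpha) / 2 * ||w||^2 + alpha * ||w||_1)."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        errs = [sigmoid(b + sum(wj * xj for wj, xj in zip(w, x))) - yi
                for x, yi in zip(X, y)]
        b -= lr * sum(errs) / n
        for j in range(p):
            grad = sum(e * x[j] for e, x in zip(errs, X)) / n
            step = w[j] - lr * (grad + lam * (1 - alpha) * w[j])
            # Soft-thresholding is the proximal operator of the L1 penalty.
            w[j] = math.copysign(max(abs(step) - lr * lam * alpha, 0.0), step)
    return w, b

def auc(scores, labels):
    """Probability that a positive cell outranks a negative one (ties: 0.5)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def monte_carlo_cv(X, y, lam, alpha, n_splits=50, train_frac=0.75, seed=0):
    """Repeated random train/test splits; returns the test AUCs and the
    fraction of splits in which each gene has a non-zero coefficient."""
    rng = random.Random(seed)
    idx = list(range(len(X)))
    n_train = int(train_frac * len(X))
    aucs, selected = [], [0] * len(X[0])
    for _ in range(n_splits):
        rng.shuffle(idx)
        train, test = idx[:n_train], idx[n_train:]
        w, b = fit_elastic_net_logistic([X[i] for i in train],
                                        [y[i] for i in train], lam, alpha)
        scores = [b + sum(wj * xj for wj, xj in zip(w, X[i])) for i in test]
        aucs.append(auc(scores, [y[i] for i in test]))
        for j, wj in enumerate(w):
            selected[j] += wj != 0.0
    return aucs, [s / n_splits for s in selected]

# Synthetic cells (not study data): y = 1 for engraftment success; gene 0 is
# high in failing cells, and genes 1 and 2 are pure noise.
data_rng = random.Random(1)
y = [0] * 40 + [1] * 40
X = [[data_rng.gauss(3.0 * (1 - yi), 0.5), data_rng.gauss(0, 1),
      data_rng.gauss(0, 1)] for yi in y]
aucs, freq = monte_carlo_cv(X, y, lam=0.05, alpha=0.7)
w_all, b_all = fit_elastic_net_logistic(X, y, lam=0.05, alpha=0.7)
```

With success coded as the positive class, a gene that is upregulated in non-engrafting cells (gene 0 in this toy dataset) receives a negative coefficient, which matches the sign convention used for detrimental factors.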
Using those models, we determined each gene's selection frequency and set a threshold of 70% to select the most informative genes ([301]Figure S10D). This method selected 78 significantly contributing factors as engraftment predictors. We then concentrated on the 51 factors with a negative estimate (i.e., factors that were detrimental to engraftment and were upregulated in P2 and P5) ([302]Figures S10E and S10F). The network of genes selected by the elastic-net model as being detrimental to engraftment was visualized using StringDB ([303]Figure 5F).

Other statistical analysis

Statistical analyses in [304]Figures 6 and [305]S1 were performed using GraphPad Prism 9 software (GraphPad, La Jolla, CA, USA), as indicated in the figure legends.

Additional resources

Here, we report the results of a Phase I/II clinical trial of GT ([306]NCT02757911), sponsored by Genethon, in four patients with X-CGD who lacked a human leukocyte antigen (HLA)-compatible donor for HSCT. A fifth patient was included in the clinical trial but was not treated because the investigational medicinal product (IMP) did not meet the release criteria.

Acknowledgments