Abstract

   Relapsed pediatric B-cell acute lymphoblastic leukemia (B-ALL) remains
   one of the leading causes of cancer mortality in children. We performed
   Hi-C, ATAC-seq, and RNA-seq on 12 matched diagnosis/relapse pediatric
   leukemia specimens to uncover dynamic structural variants (SVs) and 3D
   chromatin rewiring that may contribute to relapse. While translocations
   are assumed to occur early in leukemogenesis and be maintained
   throughout progression, we discovered novel, dynamic translocations and
   confirmed several fusion transcripts, suggesting functional and
   therapeutic relevance. Genome-wide chromatin remodeling was observed at
   all organizational levels: A/B compartments, TAD interactivity, and
   chromatin loops, including some loci shared by 25% of patients. Shared
   changes were found to drive the expression of genes/pathways previously
   implicated in resistance as well as novel therapeutic candidates, two
   of which (ATXN1 and MN1) we functionally validated. Overall, these
   results demonstrate chromatin reorganization under the selective
   pressure of therapy and offer the potential for discovery of novel
   therapeutic interventions.

   Subject terms: Acute lymphocytic leukaemia, Computational biology and
   bioinformatics
     __________________________________________________________________

   In this study the authors utilise Hi-C, ATACseq, and RNA-seq data from
   12 diagnosis/relapse matched leukemia specimens to reveal 3D genome
   organization as a critical factor in reshaping the transcriptional
   output of leukemia cells under the selective pressures of therapy.

Introduction

   Relapsed pediatric B-cell acute lymphoblastic leukemia (B-ALL) remains
   one of the leading causes of cancer mortality in children^[48]1,[49]2.
   While five-year survival rates for newly diagnosed pediatric B-ALL have
   improved and now approach 90%, up to 20% of patients will suffer
   relapse and face a dismal prognosis^[50]3. Although recent therapeutic
   approaches to treating relapsed B-ALL seem encouraging, leukemia
   sub-clones continue to emerge through the selective pressures of
   therapy^[51]4. Therefore, investigating the mechanisms responsible for
   cancer recurrence and therapy resistance is crucial for preventing and
   treating B-ALL relapse.

   Previous studies have focused on the discovery of relapse-enriched
   somatic mutations and copy number changes, which proved to be fruitful
   in the identification of mutations involved in resistance to one or
   more drugs used in treatment^[52]5–[53]8. Our recent work characterized
   the evolution of the epigenetic landscape of B-ALL from diagnosis to
   relapse. While changes in the epigenome varied widely across patients,
   we were able to identify shared relapse-specific superenhancers along
   with concordant gene expression changes. We also found convergent
   evolution of histone modifications on pathways known to be associated
   with resistance, such as p53, JAK/STAT, and Ras, thus highlighting the
   importance of chromatin changes in disease progression^[54]9. Although
   studies of the linear genome have been enormously productive in
   yielding fundamental insights into B-ALL biology and therapeutic
   options, recent studies suggest that an additional critical level of
   control of the epigenome is connected to the 3D structure of
   chromosomes.

   Thus far, no one has explored B-ALL progression through the lens of 3D
   chromatin organization. The advent of chromosome conformation capture
   (3 C) techniques, such as Hi-C, revealed that the folding of large
   complex chromosomes is nonrandom and involves a hierarchy of
   biologically relevant structures including chromosome territories, A
   and B compartments, topologically associating domains (TADs), and
   enhancer-promoter chromatin loops^[55]10–[56]14. These organizational
   layers have been shown to be tightly linked to epigenetic and
   transcriptional regulation. Additionally, a growing number of cancers
   have been shown to harbor disruptions in various layers of chromosome
   topology including gliomas, non-Hodgkin lymphoma, and T-cell
   ALL^[57]15–[58]17.

   In B-cell lymphomas, a recurrent histone H1 ((HIST1H1B-E) linker
   protein mutation was shown to cause large scale chromatin decompaction
   coupled with a gain of H3K36me2 and/or loss of H3K27me3, which created
   a permissive landscape for the expression of stem cell genes normally
   silenced during development^[59]18. Loss of H1c and H1e in mice
   enhanced fitness and self-renewal properties of GC B-cells, causing
   aggressive lymphomas. Similarly, recurrent alterations in TAD
   boundaries and TADs in T-ALL were shown to be linked to changes in gene
   expression and affect downstream oncogenic effectors including NOTCH1
   signaling^[60]17. In B-ALL, we showed that the relapse-specific NSD2
   E1099K mutation drives relapse by disrupting 3D chromatin organization
   at the compartment level^[61]19. More recently, recurrent and
   subtype-specific alterations in A/B compartments, TAD activity, and
   chromatin loops were identified in AML^[62]20. Additionally, inhibition
   of DNA methylation with a hypomethylating agent restored 3D genome
   organization and gene regulation in AML cells. We now aim to capitalize
   on chromatin conformation capture techniques to discover how each layer
   of 3D chromatin organization evolves over the course of B-ALL
   progression and contributes to therapy resistance and relapse.

   To uncover 3D chromatin architecture-related mechanisms underlying
   disease progression, we performed Hi-C, ATAC-seq, and RNA-seq on 12
   matched primary pediatric leukemia specimens at diagnosis and relapse.
   Mapping of structural variations using Hi-C data revealed previously
   unidentified stable, diagnosis-specific, and relapse-specific SVs as
   well as relapse-specific fusion transcripts indicating a possible role
   in clonal evolution (e.g. MBTD1::WDR89). Moreover, Hi-C analysis
   revealed genome wide chromatin remodeling specifically in terms of A/B
   compartments, TAD interactivity, and chromatin loops. We showed that
   12.17% of changes were shared by at least 25% of the patients,
   including at the JAK2 locus. Integration with ATAC-seq and RNA-seq
   datasets revealed strong correlation with both gene expression and
   chromatin accessibility. Importantly, integration of looping data with
   ATAC-seq and Hi-ChIP for H3K27ac demonstrated rewiring events can occur
   at stable chromatin accessibility sites. Additionally, we identified
   recurrent A/B compartments and TAD interactivity changes across the
   patient cohort supporting a crucial role in the clonal evolution of
   B-ALL. Recurrent TAD interactivity changes were identified in genes
   previously found dysregulated in cancer, such as PCDH9, as well as
   pathways previously implicated in drug resistance such as Wnt-beta
   catenin signaling, highlighting the activation of these vital pathways
   through 3D chromatin architecture alterations. Lastly, we discovered
   novel genes dysregulated by recurrent 3D changes, such as MN1 and
   ATXN1, which were found to play a role in drug response validating the
   functional significance of our findings. These results significantly
   extend our knowledge of shifts in the genetic and epigenetic landscape
   that drive therapy resistance in childhood B-ALL and identify novel
   therapeutic targets.

Results

Mapping structural variations in pediatric ALL using Hi-C

   To broadly assess global chromatin architecture throughout the
   progression of pediatric B-ALL, we performed Hi-C, RNA-seq, and
   ATAC-seq on sorted blasts (CD45dim, CD19 + ) from 12 matched primary
   bone marrow B-ALL samples at diagnosis and at relapse from patients
   enrolled in Children’s Oncology Group (COG) protocols or St. Jude
   Children’s Research Hospital Total XV protocol^[63]21 (Supplementary
   Fig. [64]1a; Supplementary Table [65]1). Recurrent cytogenetic
   abnormalities in pediatric B-ALL serve to identify subgroups important
   for risk stratification of therapy^[66]22,[67]23, therefore we first
   examined the Hi-C data for structural variants. Patient Hi-C data was
   processed by our in-house HiC-bench platform^[68]24. Alignment rates
   showed a high percentage of usable long-range read-pairs across our
   patient cohort ( ~ 100 million; Supplementary Fig. [69]1b, c).

   Visual inspection of entire patient Hi-C contact matrices revealed
   inter-chromosomal contacts (Fig. [70]1a). Closer inspection of the Hi-C
   contact matrices of patient PATIKJ demonstrated a canonical ALL
   rearrangement, the BCR-ABL1 translocation between chromosomes 9 and 22
   at both diagnosis and relapse (Fig. [71]1b). Reconstruction of Hi-C
   contact matrices revealed the presence of inter-chromosomal contacts
   across the breakpoint and the formation of a new topologically
   associating domain, referred to as a neoTAD^[72]25 (Fig. [73]1c).
   Fusion transcript analysis^[74]26 with RNA-seq data confirmed the
   presence of the BCR::ABL1 fusion transcript in this patient
   (Supplementary Data [75]1). This example of a well-known B-ALL
   translocation screened for at diagnosis by conventional methods
   provided evidence that our methods in detecting these chromosomal
   abnormalities using Hi-C was successful.

Fig. 1. Mapping of structural variations at Diagnosis and Relapse using Hi-C.

   [76]Fig. 1
   [77]Open in a new tab

   a Entire genome Hi-C contact matrix for patient PATIKJ. b Zoomed in
   Hi-C contact matrix demonstrating BCR-ABL1 translocation between
   chromosome 9 and chromosome 22 in patient PATIKJ at diagnosis and at
   relapse (left and right respectively). c Reconstructed Hi-C contact
   matrix presenting an example of a shared BCR-ABL1 translocation in
   patient PATIKJ between chromosomes 9 and 22 at diagnosis and relapse
   (left and right respectively). Green dashed triangle highlights a
   NeoTAD. d Translocations and their log-odds score identified in
   diagnosis and relapse per patient. Those also found by cytogenetics or
   fusion transcripts are marked.

   To further our investigation, we utilized Hi-C-Breakfinder^[78]27 and
   HiNT^[79]28 to call inter-chromosomal translocations from each Hi-C
   dataset in our cohort. We overlapped translocations predicted by the
   methods with cytogenetics provided for each patient, and found
   Hi-C-Breakfinder performed better and revealed previously identified as
   well as novel translocations (Supplementary Fig. [80]2a, b;
   Supplementary Data [81]2). Additionally, EagleC was used to call all
   SVs and was found to produce approximately 50% overlapping
   translocation calls with Breakfinder. While 50 stable translocations
   were called, for the first time, we report diagnosis-specific (5) and
   relapse-specific (21) translocations providing further evidence for
   clonal evolution as a mechanism for drug resistance (Fig. [82]1d). To
   further investigate these 26 dynamic translocations, we visually
   inspected each of them using Juicebox web application^[83]29 and
   confirmed one of the diagnosis-specific and 11 of the relapse-specific
   translocations (Supplementary Data [84]3). Interestingly, two of the
   relapse-specific translocations had faint signal at diagnosis upon
   visual inspection but were below the level of detection for
   Breakfinder, supporting clonal selection of a minor subclone at
   diagnosis. For example, while a translocation was called between
   chromosomes 16 and 22 in SJETV043 at relapse but not at diagnosis, upon
   visual inspection we were able to detect faint signal at the breakpoint
   at diagnosis (Fig. [85]2a). Another example in the same patient
   revealed a faint signal at diagnosis at the breakpoint of the
   t(chr1,12) called at relapse (Supplementary Fig. [86]2c).

Fig. 2. Mapping of structural variations using Hi-C reveals Diagnosis and
Relapse-specific translocations.

   [87]Fig. 2
   [88]Open in a new tab

   a Hi-C contact matrix demonstrating relapse-specific translocation
   between chromosomes 16 and 22 in patient SJETV043 with faint signal
   visually detected at diagnosis (arrow). b Reconstructed Hi-C contact
   matrix presenting a diagnosis-specific translocation in patient PATJJX
   between chromosomes 5 and 9 at diagnosis and relapse (left and right
   respectively). c Reconstructed Hi-C contact matrix presenting a
   relapse-specific translocation in patient SJETV043 between chromosomes
   14 and 17 at diagnosis and relapse (left and right respectively). Green
   dashed triangle highlights a NeoTAD. d Reconstructed Hi-C contact
   matrix demonstrating a relapse-specific NeoTAD (dashed green triangle).
   New contacts overlap region of ATAC peaks (purple tracks) and increased
   transcription at relapse (red tracks) overlapping the E2F4 locus
   (dashed blue triangle). e Bar plot showing upregulation of E2F4
   transcripts at relapse in PASFIF.

   While the presence of inter-chromosomal contacts across breakpoints
   were evidenced by the reconstruction of Hi-C contact matrices^[89]25,
   we wanted to determine whether these contacts resulted in any fusions
   at the RNA level. Overall the fusion transcript analysis using
   STAR-fusion called a variable number of fusion transcripts across
   patients (Supplementary Fig. [90]2d; Supplementary Data [91]1). Of the
   50 translocations shared from diagnosis to relapse, nine were found to
   have fusion transcripts. The one diagnosis-specific and three of the 11
   relapse-specific translocations also had fusion transcripts detected,
   none of which were shared across patients. An example of a
   diagnosis-specific translocation in patient PATJJX showed the presence
   of a fusion between chromosomes 5 and 9 at diagnosis only
   (Fig. [92]2b). Fusion transcript analysis revealed the existence of a
   previously unreported fusion transcript in the diagnosis sample,
   ZCCHC7::PDGFRB (Supplementary Data [93]1 and [94]3). PDGFRB alterations
   are commonly found in Ph-like B-ALL cases, which has clinical
   implications as these patients can be responsive to tyrosine kinase
   inhibition^[95]30,[96]31. Lastly, an example of a relapse-specific
   translocation in patient SJETV043 showed the presence of
   inter-chromosomal contacts between chromosomes 14 and 17 only in the
   relapse sample (Fig. [97]2c). Fusion transcript analysis confirmed the
   presence of the known ETV6::RUNX1 at both diagnosis and relapse, and
   demonstrated the existence of a previously unreported fusion transcript
   MBTD1::WDR89 in the relapse sample (Supplementary Data [98]1 and
   [99]3). MBTD1 is found as a recurrent translocation partner with
   ZMYND11 in acute myeloid leukemia and has been demonstrated to play a
   role in transcriptional activation of oncogenic genes such as
   myc^[100]32.

   For the remaining relapse-specific, visually validated translocations
   that did not have fusion transcripts detectable by STAR-fusion, we
   sought to determine if there were examples of dysregulated gene
   expression surrounding breakpoints. One example was found for the
   relapse-specific translocation between chromosomes 5 and 16 in patient
   PASFIF (Fig. [101]2d; Supplementary Fig. [102]2e). Upon reconstruction
   around the breakpoint, new contacts were observed across the two
   chromosomes in the relapse sample compared to the diagnosis sample.
   Interestingly, several genes within the neoTAD show differential gene
   expression. Two ATAC peaks near a transcriptionally active region are
   evident on chromosome 5 brought into contact with genes near the
   breakpoint on chromosome 16 (Fig. [103]2d), suggestive of potential
   promoter/enhancer hijacking as Xu et al. previously
   demonstrated^[104]20. One of the genes, E2F transcription factor 4
   (E2F4), that is upregulated at relapse (Fig. [105]2e) has been
   implicated in acute myeloid leukemia (AML). AML patients with high
   expression of E2F4 had significantly lower overall survival and in in
   vitro models its overexpression lead to increased proliferation and
   differentiation blockage^[106]33. These results extend the landscape of
   cytogenetic abnormalities in pediatric B-ALL that could play a
   significant role in the identification of subgroups that have
   significant clinical implications for risk stratification and
   therapeutic strategy.

B-ALL relapse is associated with recurrent changes in A/B compartments

   Next, we sought to determine changes in the various layers of 3D
   chromatin architecture throughout the progression of pediatric B-ALL.
   We first assessed compartmentalization of the genome between B-ALL
   samples at diagnosis and relapse. A/B compartment status was analyzed
   in individual patient pairs using Cscore^[107]34. Principal Component
   Analysis (PCA) of Cscore compartment scores revealed two distinct
   groups of patient samples in which the matched diagnosis and relapse
   pairs cluster together and the patient with a KMT2A rearrangement,
   SJETV043, clusters separately (Fig. [108]3a). Overall, the number of A
   and B compartment calls was comparable between diagnosis and relapse
   samples (Fig. [109]3b). We next assessed A/B compartment switching in
   individual diagnosis-relapse patient pairs. We observed 4.88%
   compartment switching between diagnosis and relapse samples across the
   12 patient pairs (Fig. [110]3c; Supplementary Fig. [111]3a). In order
   to confirm that 3D changes were not driven by copy number variations
   (CNVs), we called CNVs using EagleC. Less than 1.3% of compartment bins
   that underwent a switch were found to overlap with a CNV (Supplementary
   Data [112]4). B-ALL is a heterogeneous disease classified by various
   chromosomal abnormalities, such as hyperdiploidy, hypodiploidy,
   translocations, and intra-chromosomal amplification. Given this
   heterogeneity, B-ALL relapse expectedly has various mechanisms of
   clonal selection. While the small size of our cohort is limiting, we
   were particularly interested in identifying recurrent compartment
   switches across the patient cohort and/or switches that were concordant
   with gene expression and chromatin accessibility as they could be
   indictive of a role in clonal evolution. Overall, we identified 12.17%
   recurrent switches shared by at least 3 out of 12 patients (25% of
   patients). Specifically, 5.73% (420/7335) of A to B switches and 5.99%
   (483/8059) of B to A switches were shared by at least 3 patient samples
   (Fig. [113]3d).

Fig. 3. Compartments undergo discrete changes upon B-ALL relapse.

   [114]Fig. 3
   [115]Open in a new tab

   a PCA of A/B compartment calls with Cscore. b Violin plot showing A and
   B compartment calls for diagnosis and relapse samples. Significant
   differences were calculated with a paired one-sided t test comparing
   number of A or B eigenvector-1 bins between diagnosis and relapse. c
   Bar plot showing number of compartment switches for each patient pair
   upon relapse. Bar plot shows mean, bars represent standard deviation,
   and each point represents a patient (n = 12). d Heatmap representation
   of compartment switches shared by at least 3 patients upon relapse. e
   Correlation boxplots between gene expression and compartment changes
   (left) and chromatin accessibility and compartment changes (right) from
   relapse to diagnosis of 12 individual patient pairs. Significant
   differences were calculated by unpaired one-sided t test comparing
   genes or peaks from compartment changes AB or BA to genes or peaks from
   stable compartments (*****p < .00001). Boxplots show medians
   (horizontal line in each box), interquartile ranges (boxes), 1.5
   interquartile (whiskers) and each point represents a gene or peak. f
   IGV tracks demonstrating concordant B to A compartment switches and
   increased gene expression in 3 patient pairs at the JAK2 gene locus.

   We then integrated the Hi-C data with RNA-seq and ATAC-seq data to
   investigate whether compartment switches are associated with changes in
   gene expression and chromatin accessibility (Supplementary
   Fig. [116]3b–e). We performed differential expression analysis of all
   expressed genes (FPKM > 1) and differential chromatin accessibility
   analysis of all ATAC-seq peaks (abs(log2 (fold change))>1.0,
   false-discovery rate (FDR) < 0.01) within compartment switches at the
   individual patient pair level and found that B to A compartment
   switches upon relapse correlated with an increase in gene expression,
   whereas A to B compartment switches upon relapse were associated with a
   decrease in gene expression relative to expression changes within
   stable compartments (Fig. [117]3e, left panel). We witnessed a similar
   trend with chromatin accessibility changes (Fig. [118]3e, right panel).
   Interestingly, we found a recurrent B to A compartment switch at the
   JAK2 locus in 3 of the 12 patient pairs. This B to A compartment switch
   was also concordant with increased gene expression (Fig. [119]3f). The
   JAK2 gene is a member of the non-receptor tyrosine kinase family and
   mediates intracellular signaling upon activation of cytokine
   receptors^[120]35. Several studies have identified gain of function
   mutations and translocations affecting JAK2 in pediatric
   B-ALL^[121]36,[122]37.

B-ALL relapse drives recurrent changes in TAD interactivity

   Following compartment analysis, we investigated TAD interactivity
   changes (i.e. changes in intra-TAD chromatin interactions) between
   B-ALL diagnosis and relapse patient pairs. We first called TADs at
   diagnosis and relapse and identified ~3000 TADs in each patient
   (Supplementary Fig. [123]4a). Comparison of TAD interactivity between
   B-ALL samples at diagnosis and relapse identified a range of
   significantly differential TADs across patients, with 8 out of 12
   patients having more gains in TAD activity at relapse (Fig. [124]4a, b;
   (FDR) < 0.1 and abs(log2(fold change))>0.25; Supplementary
   Fig. [125]4b). Less than 2.4% of differential TADs overlapped with CNVs
   (Supplementary Data [126]5). Next, to investigate whether TAD
   interactivity associated with changes in gene expression and chromatin
   accessibility, we performed differential expression analysis of all
   expressed genes (FPKM > 1) and differential chromatin accessibility
   analysis of all ATAC-seq peaks (abs(log2 (fold change))>1.0,
   FDR < 0.01) within differentially active TADs at the individual patient
   pair level. Increased TAD interactivity at relapse significantly
   associated with positive fold changes in gene expression, whereas
   decreased TAD interactivity at relapse associated with negative fold
   changes in gene expression relative to expression changes within stable
   TADs (Fig. [127]4c, left). We saw a similar trend with chromatin
   accessibility changes (Fig. [128]4c, right).

Fig. 4. B-ALL patients share concordant changes in TAD activity.

   [129]Fig. 4
   [130]Open in a new tab

   a Bar plot presenting number of intra-TAD activity changes per patient
   pair upon relapse (abs(L2FC) > 0.25, FDR < 0.01). Bar plot shows means,
   bars represent standard deviation, and each point represents a patient
   (n = 12). b Volcano plot presenting increased, decreased, and stable
   TADs between diagnosis and relapse samples (red, blue, and gray
   respectively). c Correlation boxplots showing gene expression and
   chromatin accessibility changes within differential TADs (paired t
   test, p < 0.05) from relapse to diagnosis of 12 individual patient
   pairs. Significant differences were calculated by unpaired one-sided t
   test comparing gene expression or ATAC peaks within TADs with
   interactivity changes, losses, or gains, to gene expression or ATAC
   peaks within stable TADs(*****p < .00001). Boxplots show medians
   (horizontal line in each box), interquartile ranges (boxes), 1.5
   interquartile (whiskers) and each point represents a gene or peak. d
   Hi-C contact matrices presenting a TAD decreased in activity at relapse
   in 4 of 12 patient pairs at the PCDH9 locus (Diagnosis, Relapse, and
   L2FC (Relapse/Diagnosis) from top to bottom). e Box plots demonstrating
   decreased PCDH9 mean TAD activity in cpm (top left), decreased gene
   expression in tpm (top right), decreased PCDH9 compartment Cscore
   (bottom left), and decreased chromatin accessibility in cpm (bottom
   right) in the 4 patient pairs at diagnosis and relapse. Significant
   differences were calculated with a paired one-sided t test comparing
   mean TAD activity, TPM, Cscore, and cpm between diagnosis and relapse.
   Boxplots show medians (horizontal line in each box), interquartile
   ranges (boxes), 1.5 interquartile (whiskers) and each point represents
   a patient with pairs connected by lines. f IGV genome browser tracks
   demonstrating A to B compartment switch, decreased chromatin
   accessibility, and decreased gene expression in the 4 patients with
   decreased TAD interactivity at the PCDH9 gene locus.

   Of note, patient SJPHALL005 had more than 800 changes in TAD
   interactivity, when applying the same thresholds, with the majority
   being loss in activity (Supplementary Fig. [131]4c). We observed a
   similar trend with patient SJPHALL005 when integrating the TAD
   interactivity data with either gene expression or chromatin
   accessibility data (Supplementary Fig. [132]4d). This patient also
   showed the greatest amount of compartment changes at 10.85%, with the
   majority being A to B (Supplementary Fig. [133]3a). Interestingly, we
   saw increased CEBPA expression and decreased PAX5 at relapse
   (Supplementary Fig. [134]4e), suggesting a more myeloid-like profile
   upon relapse. These observations are consistent with patient SJPHALL005
   having a therapy related or lineage switch secondary malignancy as
   previously described^[135]38. Furthermore, we also observed myeloid
   lineage architectural changes, including a gain of TAD activity, a
   switch to A compartment, and increased expression of DACH1 at relapse
   (Supplementary Fig. [136]4f, g). DACH1 has been shown to regulate cell
   cycle progression in myeloid cells^[137]39.

   As with the compartment analysis, we next aimed to identify TAD
   interactivity changes that were shared amongst our patient cohort
   possibly indicating major modes of tumor escape from therapy. Notably,
   we found a recurrent decrease in TAD interactivity at the PCDH9 locus
   in 4 of the 12 patient pairs (Fig. [138]4d). All 4 patients
   demonstrated decreased mean TAD interactivity, gene expression, Cscore,
   and accessibility (Fig. [139]4e). Closer examination of the patient
   with the most dramatic change in TAD interactivity, patient PASPNY,
   revealed concordant changes in compartments, chromatin accessibility,
   and gene expression (Fig. [140]4f). The PCDH9 (Protocadherin 9) gene
   encodes a transmembrane protein involved in cell adhesion and calcium
   ion binding and has been identified as one of the top downregulated
   genes in pediatric B-ALL^[141]40–[142]44.

   Lastly, we performed pathway enrichment analysis^[143]45–[144]47 to
   identify pathways associated with changes in 3D architecture. Pathway
   enrichment analysis on genes that were concordant in expression with 3D
   changes such as A/B compartment switches or TAD interactivity changes
   revealed enrichment of previously identified key pathways including,
   mTORC1, Wnt-beta catenin, and p53 signaling (Supplementary
   Fig. [145]4h)^[146]9,[147]48. These data demonstrate that the
   activation of these important pathways goes beyond genetic alterations
   and validate these pathways as novel targets to prevent and treat
   relapse.

Chromatin loop data reveals enriched factors at relapse

   We next wanted to investigate the rewiring of chromatin loops and the
   factors responsible for mediating such changes during selection from
   therapy. We performed a looping analysis with FitHiC2^[148]49,[149]50.
   Looping calls varied based on sequencing depth across the patient
   cohort (Fig. [150]5a). We performed differential looping analysis to
   identify loops that increased, decreased, or remained the same upon
   relapse (Fig. [151]5b) and visually confirmed them using Aggregate Peak
   Analysis (APA) (Supplementary Fig. [152]5a). We observed a strong
   correlation of loop changes with both gene expression changes and
   chromatin accessibility changes (Fig. [153]5c). Additionally, we
   identified a positive association between looping changes and both
   compartment and TAD interactivity changes (Supplementary Fig. [154]5b)

Fig. 5. Chromatin loop data reveals enriched factors at relapse.

   [155]Fig. 5
   [156]Open in a new tab

   a Bar plot presenting number of loops called with Fit-Hi-C2 at
   diagnosis and relapse (q < 0.01). b Bar plot presenting number of loops
   increased, decreased or stable upon relapse (blue, red, gray
   respectively). c Fit-Hi-C loop correlation with gene expression and
   chromatin accessibility (left and right respectively). Significant
   differences were calculated by unpaired one-sided t test comparing
   genes or peaks from loops that increased or decreased to genes/peaks
   from stable loops (*****p < 0.00001) from relapse to diagnosis of 12
   individual patient pairs. Boxplots show medians (horizontal line in
   each box), interquartile ranges (boxes), 1.5 interquartile (whiskers)
   and each point represents a gene or peak. d Enrichment analysis with
   LOLA of decreased and increased ATAC-seq peaks (left and right)
   concordant with Fit-Hi-C loops and gene expression.

   To identify key regulators of transcription attributed to loop changes,
   we performed motif enrichment analysis with Locus Overlap Analysis
   (LOLA)^[157]51 with ATAC-seq peaks overlapping concordant chromatin
   loops. We identified several chromatin binding factors enriched in
   regions with decreased chromatin accessibility and decreased chromatin
   looping including RXR and EZH2 (Fig. [158]5d, left panel). We also
   identified several enriched factors in regions with increased chromatin
   accessibility and increased chromatin looping including FOS and NF-E2
   (Fig. [159]5d, right panel).

H3K27ac HiChIP data reveals exclusive looping at stable chromatin
accessibility sites

   We were also interested in the behavior of enhancer anchored loops upon
   B-ALL relapse based on our previous work demonstrating the importance
   of enhancers and superenhancers in relapse^[160]9. We performed H3K27ac
   HiChIP with one patient pair, PASFIF, to evaluate H3K27ac anchored
   loops. Looping analysis was performed with FitHiC2^[161]49,[162]50. To
   examine the relationship between alterations in the H3K27ac chromatin
   loops and chromatin accessibility, we categorized the H3K27ac chromatin
   loops into those that were exclusively increased (exclusive to
   relapse), exclusively decreased (exclusive to diagnosis), dynamically
   increased (increased), dynamically decreased (decreased), or stable and
   assessed chromatin accessibility changes. Interestingly, integration of
   H3K27ac HiChIP with ATAC-seq data revealed significantly increased and
   decreased exclusive loops at regions with stable chromatin
   accessibility sites (Fig. [163]6a). This data suggests that significant
   3D rewiring events can occur at stable chromatin accessibility sites
   that would otherwise be entirely ignored if analysis is focused only on
   sites with changes in chromatin accessibility using ATAC-seq.

Fig. 6. PASFIF H3K27ac HiChIP data reveals exclusive loops at stable ATAC-seq
peaks.

   [164]Fig. 6
   [165]Open in a new tab

   a ATAC-seq and Fit-Hi-C loop association bar plot. b Enrichment
   analysis with LOLA of increased, decreased, or stable ATAC-seq peaks
   with exclusively decreased, dynamic decreased, stable, exclusively
   increased, or dynamic increased loops. c IGV snapshot of the Fms
   Related Receptor Tyrosine Kinase 1 (FLT1) gene locus presenting
   ATAC-seq, RNA-seq, and H3K27ac Hi-ChIP and Hi-C virtual 4C tracks at
   the TSS as a viewpoint. Gray boxes highlight regions of differential
   contact frequency.

   To expand on this finding, we performed enrichment analysis using
   LOLA^[166]51 with specifically stable ATAC-seq peaks overlapping
   concordant and exclusive H2K27ac anchored loops. We identified several
   interesting chromatin binding factors enriched in regions with stable
   chromatin accessibility and decreased enhancer-promoter looping
   including NCOR1, CREBBP, and MEIS (Fig. [167]6b, left panel). We also
   identified several factors enriched in regions with stable chromatin
   accessibility and increased enhancer-promoter looping including PLU1,
   THAP1, and PHF8 (Fig. [168]6b, right panel).

   Lastly, we present an example of stable chromatin accessibility and
   exclusively increased H3K27ac chromatin looping at the vascular
   endothelial growth factor receptor 1 (FLT1) gene locus^[169]52 in
   patient PASFIF (Fig. [170]6c). Genome browser tracks show stable
   chromatin accessibility and increased gene expression. H3K27ac HiChIP
   and Hi-C virtual 4 C tracks show differential contact frequencies
   between diagnosis and relapse samples. FLT1 activation in ALL has been
   shown to increase survival of blasts by influencing their localization
   into the bone marrow and exit back into circulation^[171]52.

Functional validation of candidate genes suggests a role in drug resistance

   In an effort to discover potential gene targets involved in drug
   resistance or relapse that are downstream of 3D chromatin changes, we
   identified genes that demonstrated 3D change from diagnosis to relapse
   (A/B compartment, TAD interactivity, or chromatin loop), that were
   concordant with gene expression, and were shared by at least 3
   patients, indicating a possible major role in clonal evolution. This
   resulted in a list of 361 upregulated and 387 downregulated genes. To
   validate the expression changes, we first overlapped our 3D gene-based
   lists with a list of the most significantly differentially expressed
   genes from a meta-analysis of a large cohort of three microarray
   datasets generated from pediatric B-ALL diagnosis/relapse
   pairs^[172]53. Thirty-two percent (116 of 387) of the 3D mediated
   downregulated and 17% (61 of 361) of the upregulated genes were also
   differentially regulated in this much larger cohort of samples
   (Fig. [173]7a; Supplementary Table [174]2) indicating a prominent role
   of 3D reorganization in clonal evolution. Furthermore, using published
   data from a genome wide CRISPR screen performed with the B-ALL cell
   line REH and various chemotherapies used in pediatric B-ALL treatment
   performed by Oshima et al.^[175]54, we found that 42% (153 of 361) of
   the upregulated and 24% (93 of 387) of the downregulated genes were
   within the top 10% of guides negatively (for up) or positively (for
   down) selected from at least one of the drugs tested, suggesting a cell
   intrinsic role in drug resistance (Supplementary Data [176]6).

Fig. 7. Functional validation of MN1 and ATXN1 suggests a role in drug
resistance.

   [177]Fig. 7
   [178]Open in a new tab

   a Venn diagrams presenting overlap of 3D gene-based lists with a list
   of the most significantly differential genes from a meta-analysis of
   three microarray datasets. b Example shown for PASYFN of an A to B
   compartment switch shared by 4 patient pairs at the MN1 gene locus
   including TAD activity, V4C signal, A/B compartments, RNA-seq, and
   ATAC-seq tracks. c Box plot demonstrating a decrease in MN1 gene
   expression in tpm (left) and decrease in Cscore (right) in the 4
   patient pairs with the A to B compartment switch. Significant
   differences were calculated using paired t test (*p < 0.05). Boxplots
   show medians (horizontal line in each box), interquartile ranges
   (boxes), 1.5 interquartile (whiskers) and each point represents a
   patient with pairs connected by lines. d Linear regression model
   demonstrating positive correlation between compartment and gene
   expression changes at the MN1 locus for patients with differential
   expression absL2FC > 1.0 (n = 7). e Hi-C contact matrices presenting a
   TAD increased at relapse in 5 of 12 patient pairs at the ATXN1 locus
   (Diagnosis, Relapse, and L2FC (Relapse/Diagnosis) from top to bottom).
   f Box plots demonstrating increased ATXN1 gene expression (top) and
   increased intra-TAD activity (bottom) in the 5 patient pairs upon
   relapse. Significant differences were calculated using paired t test
   (*p < 0.05). Boxplots show medians (horizontal line in each box),
   interquartile ranges (boxes), 1.5 interquartile (whiskers) and each
   point represents a patient with pairs connected by lines. g Cell
   viability (left) and apoptosis assays (right) demonstrating increased
   sensitivity to treatment with Prednisolone in SEM cells overexpressing
   MN1 compared to empty vector control (EV). h Cell viability (left) and
   apoptosis assays (right) demonstrating increased resistance to
   treatment with Prednisolone in 697 cells overexpressing ATXN1 compared
   to EV. Cell viability graphs shown (g, h) are one experiment
   representative of three independent experiments, each with technical
   triplicates. The dots represent the mean of technical triplicates for
   that experiment with standard deviation shown. Curve represents the
   nonlinear fit of transformed X. The null hypothesis of one curve fits
   all datasets was rejected with p < 0.001 (g, h). Apoptosis graphs shown
   represents the mean and standard deviation of 4 (g) and 3 (h)
   individual experiments. Statistical significance shown based on
   unpaired t test with Welch correction with p value indicated.

   Interestingly, at the MN1 locus we identified a recurrent A to B
   compartment switch in 4 of the 12 patient pairs that was concordant
   with decreased gene expression, chromatin accessibility, and virtual
   4 C signal (Fig. [179]7b–d). Linear regression model revealed a
   positive relationship between changes in compartment Cscore and gene
   expression (R^2 = 0.7, p < 0.019, Fig. [180]7d). MN1 encodes a
   transcriptional co-regulator involved in the co-activation of various
   transcription factors^[181]55,[182]56. Notably, we also found a
   recurrent increase in TAD interactivity with a concomitant increase in
   gene expression at the ATXN1 locus in 5 of the 12 patient pairs
   (Fig. [183]7e, f). Closer examination of the patient with the most
   dramatic change in TAD interactivity, PASPNY, revealed concordant
   changes in TAD interactivity, compartments, gene expression, chromatin
   accessibility, and virtual 4 C signal (Supplementary Fig. [184]6a).
   This was further validated by increased H3K27ac peak changes in many
   patients from our previous study^[185]9. The Ataxin-1 (ATXN1) gene is a
   chromatin binding factor that represses Notch
   signaling^[186]57,[187]58.

   We validated a functional role for both MN1 and ATXN1 in B-ALL cell
   lines. The B-ALL cell line SEM was engineered to overexpress MN1 or EV
   as control and verified via western blot (Supplementary Fig. [188]6b).
   We observed a significant increase in sensitivity to prednisolone upon
   overexpression of MN1 (Fig. [189]7g) but no difference was observed
   with thiopurines, suggesting that loss of MN1 would provide an
   advantage to cells undergoing steroid treatment. The B-ALL cell line
   697 was engineered to overexpress ATXN1 or EV as control and verified
   via western blot (Supplementary Fig. [190]6c). Overexpression of ATXN1
   resulted in increased resistance to prednisolone compared to EV
   (Fig. [191]7h), suggesting upregulation of ATXN1 leads to drug
   resistance. There was no difference in IC50s for other chemotherapies,
   including thiopurines or doxorubicin (Supplementary Fig. [192]6d). Note
   when ATXN1 or MN1 expression was modified in other cell lines (REH,
   RS4;11, SEM, 697), we did not observe changes in chemosensitivity
   indicating cell context plays an important role, a finding consistent
   with the fact that these changes were shared in a subset of patients.

Discussion

   Relapsed pediatric B-ALL remains one of the leading causes of cancer
   mortality in children. Previous studies have characterized the genetic,
   epigenetic, and transcriptional landscapes of B-ALL. However, no one
   has investigated B-ALL progression through the lens of 3D chromatin
   organization. In this work, we investigate the dynamics of 3D chromatin
   architecture in B-ALL progression by assessing 12 matched primary
   pediatric leukemia specimens at diagnosis and relapse. We demonstrate
   that 3D genome organization plays an additional critical level of
   control in B-ALL progression.

   The current clinical standard for the detection of genetic
   abnormalities in B-ALL is fluorescence in-situ hybridization (FISH).
   Despite its sensitivity, FISH can only detect known genetic aberrations
   with the provision of specific probes. Hi-C in contrast is a powerful
   tool that gives insight into previously unknown structural variants. We
   sought to leverage Hi-C and investigate the stability of structural
   variants between diagnosis and relapsed B-ALL. Robust SV analysis with
   Hi-C data from diagnosis and relapse samples confirmed translocations
   provided by conventional cytogenetics but also some previously
   unidentified stable, diagnosis-specific, and relapse-specific
   translocations. This data suggests that Hi-C data can be used as a
   computationally less rigorous alternative to WGS methods to discover
   cytogenetic abnormalities in ALL and other disease models. We confirmed
   the presence of fusion transcripts such as BCR::ABL1 and ETV6::RUNX1 as
   well as identified novel translocation-driven fusion transcripts
   offering possible new avenues for therapeutic intervention especially
   those that appear to be enriched at relapse.

   In general, such translocations are thought to be an early event in
   tumorigenesis and are maintained at diagnosis and relapse. We show a
   striking number of additional translocations indicating a fundamental
   role in tumorigenesis. The loss of translocations at relapse indicates
   a dispensable role in clonal evolution while relapse-specific
   translocations may provide additional fitness to the leukemic clone.
   Interestingly, both PDGFRB and ZCCHC7 have been previously observed in
   rearrangements with different loci in B-ALL. The detection of
   ZCCHC7::PDGFRB at diagnosis would support the use of a tyrosine kinase
   inhibitor, although in this case the clone appeared to be extinguished
   with chemotherapy. MBTD1, malignant brain tumor domain containing 1,
   has been identified as a part of a chimeric protein with ZMYND11 (zinc
   finger MYND-type containing 11) that leads to active chromatin states
   at stemness related genes in AML^[193]59. Herein, we show a novel
   fusion transcript with MBTD1 and WDR89, a member of the WD40-repeat
   proteins whose function is largely unknown but may play a role in
   immunotherapy response^[194]60,[195]61. While the functional relevance
   is beyond the scope of the present study, the fact that this novel
   transcript was relapse-specific suggests a role in disease progression.
   Performing Hi-C on a larger cohort of samples could enable the rapid
   benchmarking of cytogenetic abnormalities associated with B-ALL
   progression and would allow for an unbiased view of the genome to
   identify unknown fusions that would otherwise be missed using current
   methods. Lastly, while our studies show a number of diagnosis or
   relapse specific translocations, all pairs had shared translocations
   indicating that the origin of the relapsed clone was from a common
   ancestral precursor including case SJPHALL005 where the initial
   diagnostic clone was lymphoid and the patient relapsed with AML. Thus,
   the subsequent relapse likely originated from a common ancestral
   stem-like clone, a feature known to be associated with Ph+ ALL^[196]62.
   Furthermore, there were several relapse-specific translocations that
   were visually detected (but not called by Breakfinder) at diagnosis,
   supporting the idea of clonal selection of a minor subclone at
   diagnosis^[197]5,[198]7.

   In addition to structural changes, we demonstrate significant A/B
   compartment, TAD interactivity, and looping changes between diagnosis
   and relapse that strongly correlated with expression and accessibility.
   These findings support a prominent role of 3D architecture in
   modulating transcription. We saw that the magnitude of these
   architectural changes varied among pairs. This is consistent with our
   previous findings showing variable degrees of epigenetic changes from
   diagnosis to relapse, highlighting the plasticity of the chromatin
   landscape under the selective pressures of therapy^[199]9. Analysis of
   3D-regulated genes revealed previously discovered pathways associated
   with drivers of clonal evolution. For example, activation of JAK/STAT
   signaling has been implicated in leukemia progression and we observed a
   recurrent B to A compartment switch at the JAK2 locus. Previous studies
   have shown mutations as well as translocations lead to activated JAK2
   in pediatric B-ALL, which could be targeted through JAK
   inhibitors^[200]53,[201]54. Our Hi-C analysis demonstrates an
   additional mechanism by which blasts may increase JAK2 thus extending
   the use of JAK inhibitors at relapse.

   Importantly our work identified additional novel pathways involved in
   drug resistance specifically to prednisolone, a cardinal feature of ALL
   at relapse. A meta-analysis of gene expression in B-ALL diagnosis and
   relapse pairs confirms that MN1 expression is downregulated while ATXN1
   is upregulated at relapse consistent with our findings^[202]53. MN1 is
   a transcriptional co-activator and indirectly activates retinoic acid
   receptor and vitamin D receptor (VDR) target genes^[203]63. Many
   studies have confirmed an association between polymorphisms of VDR with
   osteonecrosis, a side effect of glucocorticoids in ALL
   therapy^[204]64,[205]65. Furthermore, retinoids have been found to
   promote glucocorticoid-induced apoptosis of various T-cell lines. Thus,
   downregulation of MN1 may impair apoptosis induced by retinoid
   signaling under steroid treatment^[206]66. Conversely, overexpression
   of ATXN1 resulted in increased resistance to prednisolone. ATXN1,
   thought to function in extracellular matrix remodeling, has been shown
   to regulate Notch and Capicua (CIC) controlled developmental
   processes^[207]67. We were unable to replicate phenotypic changes in
   multiple cell lines; however, we have previously demonstrated cell
   context specific changes in drug response, which can be explained in
   part due to 3D chromatin differences^[208]56.

   By integrating looping changes with gene expression and chromatin
   accessibility, we were able to identify potential regulators of these
   changes. We found enrichment of sites for chromatin binding factors
   shown to be key players in cancer. One such example is EZH2, which is
   part of the polycomb repressive complex (PRC2) that is crucial for
   proliferation and differentiation and plays a complex role in
   cancer^[209]68. We previously demonstrated DNA hypermethylation at PRC2
   target genes at relapse^[210]9, which would be consistent with
   decreased expression, accessibility, and 3D looping interactions where
   there are EZH2 motifs. Studies related to EZH2 and NSD2 mutations in
   leukemia and myeloma have also revealed a complex interplay between
   these two epigenetic modifiers^[211]69,[212]70 and perhaps
   understanding the 3D chromatin changes associated with EZH2 could
   provide more insight for the best method to target epigenetic
   modifiers. The importance of epigenetic modifiers is strengthened
   further by the enrichment of H3K27me3 and the role this mark has been
   demonstrated to play in repressor hijacking events^[213]20. The
   highest-ranked factor for decreased looping and expression was retinoid
   X receptor (RXR), which is a nuclear receptor that interacts with
   retinoic acid receptor alpha and recruits corepressors to chromatin to
   regulate target genes, including genes involved in differentiation and
   apoptosis^[214]71. Targeting these receptors in IKZF1 mutated BCR-ABL1
   ALL has also been shown to reverse stem cell like phenotype^[215]72.
   The highest-ranking factor motif for activated regions was nuclear
   factor erythroid-2 (NF-E2). Overexpression in mouse models has been
   shown to lead to increased stem and progenitor cells and spontaneous
   development of AML^[216]73.

   Interestingly, the H3K27ac HiChIP with patient PASFIF demonstrated a
   significant amount of de novo loops appearing and disappearing at
   regions of stable chromatin upon relapse. This finding indicates that
   sites of stable chromatin accessibility provide invaluable information
   on a significant number of 3D rewiring events that would otherwise be
   entirely ignored if analyzes are directed to changes in chromatin
   accessibility exclusively (e.g., ATAC-seq). For example, CREBBP motifs
   were enriched in regions of decreased looping at relapse. CREBBP loss
   of function mutations have previously been observed at relapse in a
   significant number of patients and is associated with resistance to
   glucorticoids^[217]74. Likewise, PHF8 motifs were enriched in regions
   of increased looping and PHF8 expression is associated with ALL
   progression through activation of the MEK/ERK pathway^[218]75.

   Overall, this study demonstrates 3D genome organization as a critical
   factor in reshaping the transcriptional output of leukemia cells under
   the selective pressures of therapy. These processes appear to drive
   pathways shown to be involved in drug resistance but also previously
   undiscovered genes/pathways that may play a role in clonal evolution.
   Our findings extend indications for inhibition of kinase and cytokine
   receptor signaling in B-ALL using available agents, as well as,
   highlight the potential of novel approaches to modulate MN1, ATXN1 and
   FLT1 at relapse. The identification of common downstream targets among
   patients indicates the convergence of drug resistance mechanisms and
   has important clinical implications for possible future therapeutic
   interventions.

Methods

   Cell culture, drug preparation, viral preparation, immunoblotting, and
   phenotypic assays were performed according to methods published
   previously^[219]5–[220]7.

Cells and reagents

   The B-lineage leukemia cell lines Reh (ATCC), RS4;11 (ATCC), RCH-ACV
   (DSMZ), and SEM (kindly gifted by Jun Yang, St. Jude Children’s
   Hospital) were grown in RPMI1640 medium. All media were supplemented
   with 10% FBS, 1% penicillin/streptomycin under 5% CO2 at 37 °C. No cell
   lines were used beyond passage 20. Each leukemia line was validated by
   short tandem repeat analysis through IDEXX cellcheck9+ service. Cell
   lines were routinely monitored for mycoplasma contamination by PCR
   using ATCC Universal Mycoplasma Detection Kit (20-1012 K).

Experimental Procedures

Primary B-ALL patient samples

   Cryopreserved paired diagnosis/relapse primary patient bone marrow
   samples from individual patients were obtained from the Children’s
   Oncology group (COG) ALL biorepository or St. Jude Children’s Research
   Hospital^[221]21. All subjects provided consent for banking and future
   research use of these specimens in accordance with the regulations of
   the institutional review boards of all participating institutions.
   Samples were thawed and stained for CD45 (PerCP-Cy™5.5 Mouse Anti-Human
   CD45 clone HI30, BD Bioscience Cat# 564105) and CD19 (APC Mouse
   Anti-Human CD19 Clone HIB19, BD Bioscience Cat# 555415). Leukemic
   blasts (CD45dimCD19 + ) were sorted using SY3200 cell sorter (Sony
   Technologies) in the NYU Cytometry and Cell Sorting Laboratory and then
   immediately processed for downstream steps.

RNA-seq

   Following FACS, 50,000-200,000 leukemic cells were immediately
   resuspended in Buffer RLT plus BME and frozen on dry ice. RNA was
   extracted using the QIAGEN RNeasy Micro Kit and quality was verified by
   an Agilent Bioanalyzer 2100 (PICO chip). RNA-seq libraries were
   generated by the NYU Genome Technology Center using the Low input
   Clontech SMART-Seq kit and sequenced on the Illumina NovaSeq 6000.

ATAC-seq

   Following FACS, 50,000 leukemic cells were immediately processed for
   ATAC-seq by the NYU Genome Technology Center. ATAC libraries were
   generated based on the protocol by Buenrostro et al.^[222]76 Briefly,
   cells are resuspended in cold lysis buffer (10 mM Tris Cl, 10 mM NaCl,
   3 mM MgCl[2], 0.1% (v/v) Igepal CA-630, pH 7.4) and centrifuged for
   1 min at 500 × g. Nuclei were tagmented using Nextera (Illumina)
   Tagmentation DNA buffer and enzyme. PCR amplification was performed as
   described by Buenrostro et al.^[223]76 Libraries were sequenced on the
   Illumina NovaSeq 6000.

Hi-C

   Following FACS, leukemic cells were immediately fixed and crosslinked
   with 2% formaldehyde at room temperature for 10 min. Glycine was added
   to stop the reaction. Crosslinked cells were washed with PBS and then
   frozen on dry ice. Crosslinked cell pellets were stored at −80 until
   ready for processing. Hi-C libraries were generated using the Arima-HiC
   kit (A410110) with Swift Biosciences® Accel-NGS® 2 S Plus DNA Library
   Kit (Cat # 21024) and Indexing kit (Cat #) according to manufacturer’s
   protocol. Libraries were amplified and quantified using KAPA Library
   Amplification kit (Cat # KK2620). Final libraries were sequenced at the
   NYU Genome Technology Center on the Illumina NovaSeq 6000.

H3K27ac HiChIP

   HiChIP libraries were generated using the Arima-HiC^+ kit (A410232)
   with Swift Biosciences® Accel-NGS® 2 S Plus DNA Library Kit (Cat #
   21024) and Indexing kit (Cat # 26148) according to manufacturer’s
   protocol with minor modifications. After in-situ proximity ligation,
   samples were sonicated on Covaris LE 220 at 4 C, 300 W, 15% duty
   factor, 200 cycles per burst, for 300 s. Samples are then precleared
   with protein G beads for an hour, and subsequently incubated with 5 μg
   of H3K27AC antibody (ab4729, Lot:GR3231988) at 4 C overnight. After IP,
   libraries were amplified and quantified using KAPA Library
   Amplification kit (Cat # KK2620). Final libraries were sequenced at the
   NYU Genome Technology Center on the Illumina NovaSeq 6000.

Drug preparation

   Stock solutions of Doxorubicin (Dox) (Sigma-Aldrich, St. Louis, MO)
   were prepared in double-distilled water, Prednisolone (Pred)
   (Pharmacia, St. Paul, MN) in 0.9% NaCl (Saline), 6-thioguanine (6-TG)
   and 6-mercaptopurine (6-MP) in 0.1 M NaOH. Drugs were diluted and added
   to cell culture media at indicated concentrations. Vehicle toxicity was
   tested at highest drug dose used to ensure no impact on viability.

Generation of overexpression cell lines

   The open reading frames of MN1 or ATXN1 were PCR amplified from the
   pcDNA3.1 plasmids purchased from Genscript (OHu18815D and OHu26181D,
   respectively) and cloned into pLenti-R1R2-DEST (kindly gifted by
   Kenneth Scott, Baylor College of Medicine) using Invitrogen’s
   Gateway(R) Technology. HEK293T cells were then transfected with these
   constructs (empty vector or gene of interest) along with packaging
   plasmids coding for VSV-G, Gag-pol, and Rev using lipofectamine 2000
   (Invitrogen, Carlsbad, CA). Viral supernatant was used to spin infect
   B-ALL cell lines with 8 µg/ml Polybrene (Millipore). After 72 h,
   infected cell lines were selected with 0.75–3 µg/ml of Puromoycin.

Cytotoxicity assays

   Cells were plated at a density of 60,000–80,000 cells/well and exposed
   to chemotherapy agents commonly used in the treatment of pediatric
   B-ALL for 4 days. Cell viability was measured using CellTiter-Glo®
   Luminescent Cell viability Assay (Promega) according to the
   manufacturer’s protocol. The luminescence was read using a PerkinElmer
   EnSpire multimode plate reader. Luminescent readings were normalized to
   the untreated control for each cell line to account for any differences
   in baseline growth kinetics or survival. Cytotoxicity percentages were
   plotted using nonlinear regression with a four parametric variable
   slope with log transformed x concentrations with a bottom constraint
   equal to zero on GraphPad Prism 7.2 (GraphPad Prism Software Inc., La
   Jolla, CA). For apoptosis, following treatment, cells were stained with
   Annexin V and DAPI and analyzed via flow cytometry using the BD LSR II
   cytometer within the NYU Cytometry and Cell Sorting core. The percent
   Annexin V+ cells were determined for percent apoptotic using FlowJo
   Software. The percentage of apoptotic cells was compared between empty
   vector control and MN1 or ATXN1 overexpressing cells using an unpaired
   t test. Each experiment was plated in triplicate and repeated at least
   three times.

Computational analysis

RNA-seq analysis

   Patient RNA-seq fastq files were processed in single replicates using
   the route “rna-star” and “rna-star-groups-dge” from the Slide-n-Seq
   (sns) pipeline: [224]https://igordot.github.io/sns/. Processing steps
   include alignment of paired-end reads to the human reference genome
   (hg19) using the STAR aligner with default parameters^[225]77. Counts
   were obtained using featureCounts^[226]78. Bigwig tracks were obtained
   for visualization on individual samples using deeptools
   (v3.1.0)^[227]79. Downstream analysis including normalization and
   differential expression analysis was performed using DESeq2^[228]80.
   Genes were categorized as differentially expressed if abs(L2FC > 0.58,
   p value < .05). Pathway analysis was performed using enrichR^[229]46.

ATAC-seq analysis

   Patient ATAC-seq fastq files were processed in two replicates using the
   route “atac” from the Slide-n-Seq (sns) pipeline:
   [230]https://igordot.github.io/sns/. Processing steps included aligning
   paired-end reads to the human reference genome(hg19) with
   Bowtie2(v2.3.4.1)^[231]81. Reads with a mapping quality <30 were
   removed. Duplicated reads were removed using Sambamba (v0.6.8)^[232]82.
   Remaining reads were analyzed by applying the peak-calling algorithm
   MACS2(v2.1.1)^[233]83. Bigwig tracks were obtained for visualization on
   individual samples using deeptools (v3.1.0)^[234]79. Differential
   ATAC-seq peak analysis was performed using DiffBind^[235]84. Nearest
   genes were annotated using ChIPseeker^[236]85. Enrichment analysis of
   genomic regions sets was performed using Bioconductor package LOLA
   (Locus overlap analysis or enrichment of genomic ranges; R package
   version 1.24.0) with RStudio (v3.6.1) with the hg19 LOLA core
   database^[237]51. LOLA core is curated from many sources including TF
   binding sites from Encode and epigenome databases from Cistrome.

Hi-C and H3K27ac HiChIP analysis

   Raw Hi-C sequencing data was processed with the hic-bench
   platform^[238]24. Cell line Hi-C data was processed as single
   replicates. Data was aligned against the human reference
   genome(GRCh37/hg19) with bowtie2(version 2.3.1)^[239]81. The reads used
   for downstream analyzes were filtered for by the GenomicTools tools-hic
   filter command in the hic-bench platform using default parameters. The
   GenomicTools tools-hic filter command discards reads including
   multi-mapped reads (“multihit”), read-pairs with only one mappable read
   (“single sided”), duplicated read-pairs (“ds.duplicate”), read-pairs
   with a low mapping quality of MAPQ < 20, read-pairs resulting from
   self-ligated fragments (together called “ds.filtered”), and short range
   interactions resulting from read-pairs aligning within 25 kb
   (“ds.too.short”). Downstream analysis was performed with the accepted
   intra-chromosomal read- pairs (“ds.accepted intra”). The number of
   accepted intra-chromosomal read-pairs varied between ~40 and ~140
   million for all samples (Chapter 1; Fig. [240]2). Hi-C interaction
   matrices were generated for each chromosome separately using the
   hic-bench platform at 40 kb resolution. Filtered read counts were
   normalized by iterative correction and eigenvector decomposition
   (ICE)^[241]86. To account for variances in read counts of more distant
   loci, distance normalization for each chromosome matrix was performed.

Translocation calling from patient Hi-C data

   Hi-C breakfinder^[242]27 and HiNT^[243]28 were used to call previously
   identified, as well as, novel intra-chromosomal and inter-chromosomal
   translocations from the matched B-ALL diagnosis/relapse patient Hi-C
   data. Hi-C breakfinder required 3 input files including a bam file, an
   inter-chromosomal expectation file, and an intra-chromosomal
   expectation file. The bam file was generated by hic-bench and the two
   expectations files for hg19 were provided on the Dixon lab GitHub page:
   [244]https://github.com/dixonlab/hic_breakfinder. Hi-C breakfinder
   generated lists of structural variant predictions at different
   resolutions including 1 Mb, 100 kb, and 10 kb. For the purposes of
   comparing translocations between Hi-C breakfinder and HiNT, the 100 kb
   resolution was used. Translocations were called at 100 kb resolution
   using HiNT through the “translocations” step within the hic-bench
   platform^[245]24. EagleC^[246]87 (version 0.1.9) was used to call copy
   number inferences (deletions and duplications) at three resolutions
   5 kb,10 kb,50 kb. High-confidence calls were determined by setting the
   probability cutoff to 0.95.

Fusion transcript analysis

   Fusion transcript analysis was performed using STAR-fusion:
   [247]https://github.com/STAR-Fusion/STAR-Fusion/wiki^[248]88.
   STAR-fusion required paired-end RNA-seq fastq files and a reference
   genome (hg19) in order to identify candidate fusion transcripts.

Hi-C contact matrix visualization

   To visualize Hi-C contact matrices, ICE normalized Hi-C contact
   matrices for the corresponding chromosome were loaded and normalized by
   the total number of intra-chromosomal interactions for Diagnosis and
   Relapse samples. The log2FC Hi-C contact matrix was produced by
   applying the log2 function on the division product of the Relapse Hi-C
   table by the Diagnosis Hi-C table.

   To visualize translocations, Hi-C contact matrices were generated in
   the.hic format using the “tracks” step within the hic-bench
   platform^[249]24. Hi-C contact matrices were visualized using
   Juicebox^[250]29.

Hi-C contact matrix reconstruction

   Hi-C contact matrices were reconstructed to demonstrate
   inter-chromosomal translocations at 20 kb resolution using
   NeoLoopFinder^[251]25. NeoLoopFinder required.cool files which were
   generated using the “tracks” step within the hic-bench
   platform^[252]24. NeoLoopFinder also required files containing a list
   of structural variations that were generated at 10 kb resolution using
   Hi-C breakfinder^[253]27 as described above.

A/B compartments analysis

   A/B compartments analysis was performed using the “compartments” and
   “compartments-stats” steps within the hic-bench platform^[254]24. The
   Cscore tool algorithm was used to assign active (A) and inactive (B)
   compartments^[255]34. For a bin to be considered a switch from A to B
   or from B to A, the compartment score sign had to flip in sample 2
   (Relapse) when compared with the reference sample (sample 1 or
   Diagnosis). The absolute difference between the compartment scores had
   to be higher than the cutoff (default: 1.2). The difference was
   computed as a relative delta:
   [MATH: <mi
   mathvariant="normal">delta</mi><mo>=</mo><mrow><mo>(</mo><mrow><mi>Y</m
   i><mo>−</mo><mi>X</mi></mrow><mo>)</mo></mrow><mo>/</mo><mi>a</mi><mi>b
   </mi><mi>s</mi><mrow><mo>(</mo><mrow><mi>Y</mi></mrow><mo>)</mo></mrow>
   <mspace width="0.25em"></mspace><mi>#</mi><mspace
   width="0.25em"></mspace><mi mathvariant="normal">delta</mi><mspace
   width="0.16em"></mspace><mi mathvariant="normal">value</mi><mspace
   width="0.16em"></mspace><mi mathvariant="normal">calculation</mi>
   :MATH]

   X = compartment score of bin in sample 1 (reference)

   Y = compartment score of bin in sample 2

TAD interactivity analysis

   TAD interactivity was assessed using the “domains” and “domains-diff”
   steps within the hic-bench platform^[256]24. The “domains” step uses
   the hic-ratio algorithm for TAD calling developed within hic-bench by
   previous Tsirigos lab member Haris Lazaris in which the average of the
   normalized interaction scores is calculated for all interactions taking
   place within a particular TAD. The “domains-diff” step assesses TAD
   interactivity alterations and was developed by previous lab members
   Sofia Nomikou and Andreas Kloetgen^[257]17,[258]24. To identify TADs
   with differential interactivity between Relapse and Diagnosis, we used
   the TADs identified in the Diagnosis sample as a reference to identify
   common TADs. Once mean TAD interactivity values were obtained from the
   Hi-C data, a wilcoxon two-sided rank sum non-parametric test was
   performed to determine the p-values for each TAD. Multiple testing was
   used to correct these p-values by adjusting to the total number of
   TADs. Lastly, the log2 fold change (log2FC) of intra-TAD activity value
   was calculated between the samples. TAD interactivity alterations were
   categorized as significant if abs(L2FC > 0.25) and FDR < 0.01.

Looping analysis

   Looping analysis was performed using the “loops” and “loops-diff” steps
   within the hic-bench platform^[259]24. Chromatin loops from Hi-C and
   H2K27ac HiChIP data were called using FitHiC2^[260]49,[261]50. Loops
   were classified as diagnosis-specific, relapse-specific, or common. The
   common loops were also classified as increased, decreased, or stable
   using a log2 fold change cutoff (abs(L2FC) > 1). A loop was considered
   sample-specific if it was found to be significant only in sample ‘x’
   but not in sample ‘y’, when using a q-value cutoff of ‘qcut1’ (default:
   0.01) as well as the loop shouldn’t be found significant in sample ‘y’
   when using a more relaxed cutoff of ‘qcut2’ (default: 0.1). This is to
   avoid considering a loop sample-specific when the significance was
   relatively close in both samples but in one sample it was lower than
   the cutoff (significant) and in the other sample it was slightly higher
   than the cutoff (not significant). These cases were classified as
   common loops.

Integration of patient Hi-C data with other datasets

   To show the correlation between the Hi-C data and the various
   sequencing datasets, the peaks obtained from ATAC-seq, and the genes
   obtained from RNA-seq were mapped to the AB, BA, and stable regions or
   the increased, decreased, or stable TADs using the “bedtools intersect”
   command^[262]89,[263]90. We calculated the peak intensity fold change
   or gene expression fold change for peaks or genes assigned to a
   compartment region or TAD activity region between diagnosis and relapse
   and showed the correlation with boxplots or bar plots. Statistical
   significance was assessed using a paired two-sample t test.

Reporting summary

   Further information on research design is available in the [264]Nature
   Portfolio Reporting Summary linked to this article.

Supplementary information

   [265]Supplementary Information^ (1.8MB, pdf)
   [266]Peer Review File^ (1MB, pdf)
   [267]41467_2024_51492_MOESM3_ESM.pdf^ (77KB, pdf)

   Description of Additional Supplementary Files
   [268]Supplementary Data 1^ (550.2KB, csv)
   [269]Supplementary Data 2^ (10.9KB, csv)
   [270]Supplementary Data 3^ (10.3KB, xlsx)
   [271]Supplementary Data 4^ (1.2MB, xlsx)
   [272]Supplementary Data 5^ (123.9KB, xlsx)
   [273]Supplementary Data 6^ (12.6KB, xlsx)
   [274]Reporting Summary^ (93.4KB, pdf)

Acknowledgements