Abstract
Relapsed pediatric B-cell acute lymphoblastic leukemia (B-ALL) remains
one of the leading causes of cancer mortality in children. We performed
Hi-C, ATAC-seq, and RNA-seq on 12 matched diagnosis/relapse pediatric
leukemia specimens to uncover dynamic structural variants (SVs) and 3D
chromatin rewiring that may contribute to relapse. While translocations
are assumed to occur early in leukemogenesis and be maintained
throughout progression, we discovered novel, dynamic translocations and
confirmed several fusion transcripts, suggesting functional and
therapeutic relevance. Genome-wide chromatin remodeling was observed at
all organizational levels: A/B compartments, TAD interactivity, and
chromatin loops, including some loci shared by 25% of patients. Shared
changes were found to drive the expression of genes/pathways previously
implicated in resistance as well as novel therapeutic candidates, two
of which (ATXN1 and MN1) we functionally validated. Overall, these
results demonstrate chromatin reorganization under the selective
pressure of therapy and offer the potential for discovery of novel
therapeutic interventions.
Subject terms: Acute lymphocytic leukaemia, Computational biology and
bioinformatics
__________________________________________________________________
In this study the authors utilise Hi-C, ATACseq, and RNA-seq data from
12 diagnosis/relapse matched leukemia specimens to reveal 3D genome
organization as a critical factor in reshaping the transcriptional
output of leukemia cells under the selective pressures of therapy.
Introduction
Relapsed pediatric B-cell acute lymphoblastic leukemia (B-ALL) remains
one of the leading causes of cancer mortality in children^[48]1,[49]2.
While five-year survival rates for newly diagnosed pediatric B-ALL have
improved and now approach 90%, up to 20% of patients will suffer
relapse and face a dismal prognosis^[50]3. Although recent therapeutic
approaches to treating relapsed B-ALL seem encouraging, leukemia
sub-clones continue to emerge through the selective pressures of
therapy^[51]4. Therefore, investigating the mechanisms responsible for
cancer recurrence and therapy resistance is crucial for preventing and
treating B-ALL relapse.
Previous studies have focused on the discovery of relapse-enriched
somatic mutations and copy number changes, which proved to be fruitful
in the identification of mutations involved in resistance to one or
more drugs used in treatment^[52]5–[53]8. Our recent work characterized
the evolution of the epigenetic landscape of B-ALL from diagnosis to
relapse. While changes in the epigenome varied widely across patients,
we were able to identify shared relapse-specific superenhancers along
with concordant gene expression changes. We also found convergent
evolution of histone modifications on pathways known to be associated
with resistance, such as p53, JAK/STAT, and Ras, thus highlighting the
importance of chromatin changes in disease progression^[54]9. Although
studies of the linear genome have been enormously productive in
yielding fundamental insights into B-ALL biology and therapeutic
options, recent studies suggest that an additional critical level of
control of the epigenome is connected to the 3D structure of
chromosomes.
Thus far, no one has explored B-ALL progression through the lens of 3D
chromatin organization. The advent of chromosome conformation capture
(3 C) techniques, such as Hi-C, revealed that the folding of large
complex chromosomes is nonrandom and involves a hierarchy of
biologically relevant structures including chromosome territories, A
and B compartments, topologically associating domains (TADs), and
enhancer-promoter chromatin loops^[55]10–[56]14. These organizational
layers have been shown to be tightly linked to epigenetic and
transcriptional regulation. Additionally, a growing number of cancers
have been shown to harbor disruptions in various layers of chromosome
topology including gliomas, non-Hodgkin lymphoma, and T-cell
ALL^[57]15–[58]17.
In B-cell lymphomas, a recurrent histone H1 ((HIST1H1B-E) linker
protein mutation was shown to cause large scale chromatin decompaction
coupled with a gain of H3K36me2 and/or loss of H3K27me3, which created
a permissive landscape for the expression of stem cell genes normally
silenced during development^[59]18. Loss of H1c and H1e in mice
enhanced fitness and self-renewal properties of GC B-cells, causing
aggressive lymphomas. Similarly, recurrent alterations in TAD
boundaries and TADs in T-ALL were shown to be linked to changes in gene
expression and affect downstream oncogenic effectors including NOTCH1
signaling^[60]17. In B-ALL, we showed that the relapse-specific NSD2
E1099K mutation drives relapse by disrupting 3D chromatin organization
at the compartment level^[61]19. More recently, recurrent and
subtype-specific alterations in A/B compartments, TAD activity, and
chromatin loops were identified in AML^[62]20. Additionally, inhibition
of DNA methylation with a hypomethylating agent restored 3D genome
organization and gene regulation in AML cells. We now aim to capitalize
on chromatin conformation capture techniques to discover how each layer
of 3D chromatin organization evolves over the course of B-ALL
progression and contributes to therapy resistance and relapse.
To uncover 3D chromatin architecture-related mechanisms underlying
disease progression, we performed Hi-C, ATAC-seq, and RNA-seq on 12
matched primary pediatric leukemia specimens at diagnosis and relapse.
Mapping of structural variations using Hi-C data revealed previously
unidentified stable, diagnosis-specific, and relapse-specific SVs as
well as relapse-specific fusion transcripts indicating a possible role
in clonal evolution (e.g. MBTD1::WDR89). Moreover, Hi-C analysis
revealed genome wide chromatin remodeling specifically in terms of A/B
compartments, TAD interactivity, and chromatin loops. We showed that
12.17% of changes were shared by at least 25% of the patients,
including at the JAK2 locus. Integration with ATAC-seq and RNA-seq
datasets revealed strong correlation with both gene expression and
chromatin accessibility. Importantly, integration of looping data with
ATAC-seq and Hi-ChIP for H3K27ac demonstrated rewiring events can occur
at stable chromatin accessibility sites. Additionally, we identified
recurrent A/B compartments and TAD interactivity changes across the
patient cohort supporting a crucial role in the clonal evolution of
B-ALL. Recurrent TAD interactivity changes were identified in genes
previously found dysregulated in cancer, such as PCDH9, as well as
pathways previously implicated in drug resistance such as Wnt-beta
catenin signaling, highlighting the activation of these vital pathways
through 3D chromatin architecture alterations. Lastly, we discovered
novel genes dysregulated by recurrent 3D changes, such as MN1 and
ATXN1, which were found to play a role in drug response validating the
functional significance of our findings. These results significantly
extend our knowledge of shifts in the genetic and epigenetic landscape
that drive therapy resistance in childhood B-ALL and identify novel
therapeutic targets.
Results
Mapping structural variations in pediatric ALL using Hi-C
To broadly assess global chromatin architecture throughout the
progression of pediatric B-ALL, we performed Hi-C, RNA-seq, and
ATAC-seq on sorted blasts (CD45dim, CD19 + ) from 12 matched primary
bone marrow B-ALL samples at diagnosis and at relapse from patients
enrolled in Children’s Oncology Group (COG) protocols or St. Jude
Children’s Research Hospital Total XV protocol^[63]21 (Supplementary
Fig. [64]1a; Supplementary Table [65]1). Recurrent cytogenetic
abnormalities in pediatric B-ALL serve to identify subgroups important
for risk stratification of therapy^[66]22,[67]23, therefore we first
examined the Hi-C data for structural variants. Patient Hi-C data was
processed by our in-house HiC-bench platform^[68]24. Alignment rates
showed a high percentage of usable long-range read-pairs across our
patient cohort ( ~ 100 million; Supplementary Fig. [69]1b, c).
Visual inspection of entire patient Hi-C contact matrices revealed
inter-chromosomal contacts (Fig. [70]1a). Closer inspection of the Hi-C
contact matrices of patient PATIKJ demonstrated a canonical ALL
rearrangement, the BCR-ABL1 translocation between chromosomes 9 and 22
at both diagnosis and relapse (Fig. [71]1b). Reconstruction of Hi-C
contact matrices revealed the presence of inter-chromosomal contacts
across the breakpoint and the formation of a new topologically
associating domain, referred to as a neoTAD^[72]25 (Fig. [73]1c).
Fusion transcript analysis^[74]26 with RNA-seq data confirmed the
presence of the BCR::ABL1 fusion transcript in this patient
(Supplementary Data [75]1). This example of a well-known B-ALL
translocation screened for at diagnosis by conventional methods
provided evidence that our methods in detecting these chromosomal
abnormalities using Hi-C was successful.
Fig. 1. Mapping of structural variations at Diagnosis and Relapse using Hi-C.
[76]Fig. 1
[77]Open in a new tab
a Entire genome Hi-C contact matrix for patient PATIKJ. b Zoomed in
Hi-C contact matrix demonstrating BCR-ABL1 translocation between
chromosome 9 and chromosome 22 in patient PATIKJ at diagnosis and at
relapse (left and right respectively). c Reconstructed Hi-C contact
matrix presenting an example of a shared BCR-ABL1 translocation in
patient PATIKJ between chromosomes 9 and 22 at diagnosis and relapse
(left and right respectively). Green dashed triangle highlights a
NeoTAD. d Translocations and their log-odds score identified in
diagnosis and relapse per patient. Those also found by cytogenetics or
fusion transcripts are marked.
To further our investigation, we utilized Hi-C-Breakfinder^[78]27 and
HiNT^[79]28 to call inter-chromosomal translocations from each Hi-C
dataset in our cohort. We overlapped translocations predicted by the
methods with cytogenetics provided for each patient, and found
Hi-C-Breakfinder performed better and revealed previously identified as
well as novel translocations (Supplementary Fig. [80]2a, b;
Supplementary Data [81]2). Additionally, EagleC was used to call all
SVs and was found to produce approximately 50% overlapping
translocation calls with Breakfinder. While 50 stable translocations
were called, for the first time, we report diagnosis-specific (5) and
relapse-specific (21) translocations providing further evidence for
clonal evolution as a mechanism for drug resistance (Fig. [82]1d). To
further investigate these 26 dynamic translocations, we visually
inspected each of them using Juicebox web application^[83]29 and
confirmed one of the diagnosis-specific and 11 of the relapse-specific
translocations (Supplementary Data [84]3). Interestingly, two of the
relapse-specific translocations had faint signal at diagnosis upon
visual inspection but were below the level of detection for
Breakfinder, supporting clonal selection of a minor subclone at
diagnosis. For example, while a translocation was called between
chromosomes 16 and 22 in SJETV043 at relapse but not at diagnosis, upon
visual inspection we were able to detect faint signal at the breakpoint
at diagnosis (Fig. [85]2a). Another example in the same patient
revealed a faint signal at diagnosis at the breakpoint of the
t(chr1,12) called at relapse (Supplementary Fig. [86]2c).
Fig. 2. Mapping of structural variations using Hi-C reveals Diagnosis and
Relapse-specific translocations.
[87]Fig. 2
[88]Open in a new tab
a Hi-C contact matrix demonstrating relapse-specific translocation
between chromosomes 16 and 22 in patient SJETV043 with faint signal
visually detected at diagnosis (arrow). b Reconstructed Hi-C contact
matrix presenting a diagnosis-specific translocation in patient PATJJX
between chromosomes 5 and 9 at diagnosis and relapse (left and right
respectively). c Reconstructed Hi-C contact matrix presenting a
relapse-specific translocation in patient SJETV043 between chromosomes
14 and 17 at diagnosis and relapse (left and right respectively). Green
dashed triangle highlights a NeoTAD. d Reconstructed Hi-C contact
matrix demonstrating a relapse-specific NeoTAD (dashed green triangle).
New contacts overlap region of ATAC peaks (purple tracks) and increased
transcription at relapse (red tracks) overlapping the E2F4 locus
(dashed blue triangle). e Bar plot showing upregulation of E2F4
transcripts at relapse in PASFIF.
While the presence of inter-chromosomal contacts across breakpoints
were evidenced by the reconstruction of Hi-C contact matrices^[89]25,
we wanted to determine whether these contacts resulted in any fusions
at the RNA level. Overall the fusion transcript analysis using
STAR-fusion called a variable number of fusion transcripts across
patients (Supplementary Fig. [90]2d; Supplementary Data [91]1). Of the
50 translocations shared from diagnosis to relapse, nine were found to
have fusion transcripts. The one diagnosis-specific and three of the 11
relapse-specific translocations also had fusion transcripts detected,
none of which were shared across patients. An example of a
diagnosis-specific translocation in patient PATJJX showed the presence
of a fusion between chromosomes 5 and 9 at diagnosis only
(Fig. [92]2b). Fusion transcript analysis revealed the existence of a
previously unreported fusion transcript in the diagnosis sample,
ZCCHC7::PDGFRB (Supplementary Data [93]1 and [94]3). PDGFRB alterations
are commonly found in Ph-like B-ALL cases, which has clinical
implications as these patients can be responsive to tyrosine kinase
inhibition^[95]30,[96]31. Lastly, an example of a relapse-specific
translocation in patient SJETV043 showed the presence of
inter-chromosomal contacts between chromosomes 14 and 17 only in the
relapse sample (Fig. [97]2c). Fusion transcript analysis confirmed the
presence of the known ETV6::RUNX1 at both diagnosis and relapse, and
demonstrated the existence of a previously unreported fusion transcript
MBTD1::WDR89 in the relapse sample (Supplementary Data [98]1 and
[99]3). MBTD1 is found as a recurrent translocation partner with
ZMYND11 in acute myeloid leukemia and has been demonstrated to play a
role in transcriptional activation of oncogenic genes such as
myc^[100]32.
For the remaining relapse-specific, visually validated translocations
that did not have fusion transcripts detectable by STAR-fusion, we
sought to determine if there were examples of dysregulated gene
expression surrounding breakpoints. One example was found for the
relapse-specific translocation between chromosomes 5 and 16 in patient
PASFIF (Fig. [101]2d; Supplementary Fig. [102]2e). Upon reconstruction
around the breakpoint, new contacts were observed across the two
chromosomes in the relapse sample compared to the diagnosis sample.
Interestingly, several genes within the neoTAD show differential gene
expression. Two ATAC peaks near a transcriptionally active region are
evident on chromosome 5 brought into contact with genes near the
breakpoint on chromosome 16 (Fig. [103]2d), suggestive of potential
promoter/enhancer hijacking as Xu et al. previously
demonstrated^[104]20. One of the genes, E2F transcription factor 4
(E2F4), that is upregulated at relapse (Fig. [105]2e) has been
implicated in acute myeloid leukemia (AML). AML patients with high
expression of E2F4 had significantly lower overall survival and in in
vitro models its overexpression lead to increased proliferation and
differentiation blockage^[106]33. These results extend the landscape of
cytogenetic abnormalities in pediatric B-ALL that could play a
significant role in the identification of subgroups that have
significant clinical implications for risk stratification and
therapeutic strategy.
B-ALL relapse is associated with recurrent changes in A/B compartments
Next, we sought to determine changes in the various layers of 3D
chromatin architecture throughout the progression of pediatric B-ALL.
We first assessed compartmentalization of the genome between B-ALL
samples at diagnosis and relapse. A/B compartment status was analyzed
in individual patient pairs using Cscore^[107]34. Principal Component
Analysis (PCA) of Cscore compartment scores revealed two distinct
groups of patient samples in which the matched diagnosis and relapse
pairs cluster together and the patient with a KMT2A rearrangement,
SJETV043, clusters separately (Fig. [108]3a). Overall, the number of A
and B compartment calls was comparable between diagnosis and relapse
samples (Fig. [109]3b). We next assessed A/B compartment switching in
individual diagnosis-relapse patient pairs. We observed 4.88%
compartment switching between diagnosis and relapse samples across the
12 patient pairs (Fig. [110]3c; Supplementary Fig. [111]3a). In order
to confirm that 3D changes were not driven by copy number variations
(CNVs), we called CNVs using EagleC. Less than 1.3% of compartment bins
that underwent a switch were found to overlap with a CNV (Supplementary
Data [112]4). B-ALL is a heterogeneous disease classified by various
chromosomal abnormalities, such as hyperdiploidy, hypodiploidy,
translocations, and intra-chromosomal amplification. Given this
heterogeneity, B-ALL relapse expectedly has various mechanisms of
clonal selection. While the small size of our cohort is limiting, we
were particularly interested in identifying recurrent compartment
switches across the patient cohort and/or switches that were concordant
with gene expression and chromatin accessibility as they could be
indictive of a role in clonal evolution. Overall, we identified 12.17%
recurrent switches shared by at least 3 out of 12 patients (25% of
patients). Specifically, 5.73% (420/7335) of A to B switches and 5.99%
(483/8059) of B to A switches were shared by at least 3 patient samples
(Fig. [113]3d).
Fig. 3. Compartments undergo discrete changes upon B-ALL relapse.
[114]Fig. 3
[115]Open in a new tab
a PCA of A/B compartment calls with Cscore. b Violin plot showing A and
B compartment calls for diagnosis and relapse samples. Significant
differences were calculated with a paired one-sided t test comparing
number of A or B eigenvector-1 bins between diagnosis and relapse. c
Bar plot showing number of compartment switches for each patient pair
upon relapse. Bar plot shows mean, bars represent standard deviation,
and each point represents a patient (n = 12). d Heatmap representation
of compartment switches shared by at least 3 patients upon relapse. e
Correlation boxplots between gene expression and compartment changes
(left) and chromatin accessibility and compartment changes (right) from
relapse to diagnosis of 12 individual patient pairs. Significant
differences were calculated by unpaired one-sided t test comparing
genes or peaks from compartment changes AB or BA to genes or peaks from
stable compartments (*****p < .00001). Boxplots show medians
(horizontal line in each box), interquartile ranges (boxes), 1.5
interquartile (whiskers) and each point represents a gene or peak. f
IGV tracks demonstrating concordant B to A compartment switches and
increased gene expression in 3 patient pairs at the JAK2 gene locus.
We then integrated the Hi-C data with RNA-seq and ATAC-seq data to
investigate whether compartment switches are associated with changes in
gene expression and chromatin accessibility (Supplementary
Fig. [116]3b–e). We performed differential expression analysis of all
expressed genes (FPKM > 1) and differential chromatin accessibility
analysis of all ATAC-seq peaks (abs(log2 (fold change))>1.0,
false-discovery rate (FDR) < 0.01) within compartment switches at the
individual patient pair level and found that B to A compartment
switches upon relapse correlated with an increase in gene expression,
whereas A to B compartment switches upon relapse were associated with a
decrease in gene expression relative to expression changes within
stable compartments (Fig. [117]3e, left panel). We witnessed a similar
trend with chromatin accessibility changes (Fig. [118]3e, right panel).
Interestingly, we found a recurrent B to A compartment switch at the
JAK2 locus in 3 of the 12 patient pairs. This B to A compartment switch
was also concordant with increased gene expression (Fig. [119]3f). The
JAK2 gene is a member of the non-receptor tyrosine kinase family and
mediates intracellular signaling upon activation of cytokine
receptors^[120]35. Several studies have identified gain of function
mutations and translocations affecting JAK2 in pediatric
B-ALL^[121]36,[122]37.
B-ALL relapse drives recurrent changes in TAD interactivity
Following compartment analysis, we investigated TAD interactivity
changes (i.e. changes in intra-TAD chromatin interactions) between
B-ALL diagnosis and relapse patient pairs. We first called TADs at
diagnosis and relapse and identified ~3000 TADs in each patient
(Supplementary Fig. [123]4a). Comparison of TAD interactivity between
B-ALL samples at diagnosis and relapse identified a range of
significantly differential TADs across patients, with 8 out of 12
patients having more gains in TAD activity at relapse (Fig. [124]4a, b;
(FDR) < 0.1 and abs(log2(fold change))>0.25; Supplementary
Fig. [125]4b). Less than 2.4% of differential TADs overlapped with CNVs
(Supplementary Data [126]5). Next, to investigate whether TAD
interactivity associated with changes in gene expression and chromatin
accessibility, we performed differential expression analysis of all
expressed genes (FPKM > 1) and differential chromatin accessibility
analysis of all ATAC-seq peaks (abs(log2 (fold change))>1.0,
FDR < 0.01) within differentially active TADs at the individual patient
pair level. Increased TAD interactivity at relapse significantly
associated with positive fold changes in gene expression, whereas
decreased TAD interactivity at relapse associated with negative fold
changes in gene expression relative to expression changes within stable
TADs (Fig. [127]4c, left). We saw a similar trend with chromatin
accessibility changes (Fig. [128]4c, right).
Fig. 4. B-ALL patients share concordant changes in TAD activity.
[129]Fig. 4
[130]Open in a new tab
a Bar plot presenting number of intra-TAD activity changes per patient
pair upon relapse (abs(L2FC) > 0.25, FDR < 0.01). Bar plot shows means,
bars represent standard deviation, and each point represents a patient
(n = 12). b Volcano plot presenting increased, decreased, and stable
TADs between diagnosis and relapse samples (red, blue, and gray
respectively). c Correlation boxplots showing gene expression and
chromatin accessibility changes within differential TADs (paired t
test, p < 0.05) from relapse to diagnosis of 12 individual patient
pairs. Significant differences were calculated by unpaired one-sided t
test comparing gene expression or ATAC peaks within TADs with
interactivity changes, losses, or gains, to gene expression or ATAC
peaks within stable TADs(*****p < .00001). Boxplots show medians
(horizontal line in each box), interquartile ranges (boxes), 1.5
interquartile (whiskers) and each point represents a gene or peak. d
Hi-C contact matrices presenting a TAD decreased in activity at relapse
in 4 of 12 patient pairs at the PCDH9 locus (Diagnosis, Relapse, and
L2FC (Relapse/Diagnosis) from top to bottom). e Box plots demonstrating
decreased PCDH9 mean TAD activity in cpm (top left), decreased gene
expression in tpm (top right), decreased PCDH9 compartment Cscore
(bottom left), and decreased chromatin accessibility in cpm (bottom
right) in the 4 patient pairs at diagnosis and relapse. Significant
differences were calculated with a paired one-sided t test comparing
mean TAD activity, TPM, Cscore, and cpm between diagnosis and relapse.
Boxplots show medians (horizontal line in each box), interquartile
ranges (boxes), 1.5 interquartile (whiskers) and each point represents
a patient with pairs connected by lines. f IGV genome browser tracks
demonstrating A to B compartment switch, decreased chromatin
accessibility, and decreased gene expression in the 4 patients with
decreased TAD interactivity at the PCDH9 gene locus.
Of note, patient SJPHALL005 had more than 800 changes in TAD
interactivity, when applying the same thresholds, with the majority
being loss in activity (Supplementary Fig. [131]4c). We observed a
similar trend with patient SJPHALL005 when integrating the TAD
interactivity data with either gene expression or chromatin
accessibility data (Supplementary Fig. [132]4d). This patient also
showed the greatest amount of compartment changes at 10.85%, with the
majority being A to B (Supplementary Fig. [133]3a). Interestingly, we
saw increased CEBPA expression and decreased PAX5 at relapse
(Supplementary Fig. [134]4e), suggesting a more myeloid-like profile
upon relapse. These observations are consistent with patient SJPHALL005
having a therapy related or lineage switch secondary malignancy as
previously described^[135]38. Furthermore, we also observed myeloid
lineage architectural changes, including a gain of TAD activity, a
switch to A compartment, and increased expression of DACH1 at relapse
(Supplementary Fig. [136]4f, g). DACH1 has been shown to regulate cell
cycle progression in myeloid cells^[137]39.
As with the compartment analysis, we next aimed to identify TAD
interactivity changes that were shared amongst our patient cohort
possibly indicating major modes of tumor escape from therapy. Notably,
we found a recurrent decrease in TAD interactivity at the PCDH9 locus
in 4 of the 12 patient pairs (Fig. [138]4d). All 4 patients
demonstrated decreased mean TAD interactivity, gene expression, Cscore,
and accessibility (Fig. [139]4e). Closer examination of the patient
with the most dramatic change in TAD interactivity, patient PASPNY,
revealed concordant changes in compartments, chromatin accessibility,
and gene expression (Fig. [140]4f). The PCDH9 (Protocadherin 9) gene
encodes a transmembrane protein involved in cell adhesion and calcium
ion binding and has been identified as one of the top downregulated
genes in pediatric B-ALL^[141]40–[142]44.
Lastly, we performed pathway enrichment analysis^[143]45–[144]47 to
identify pathways associated with changes in 3D architecture. Pathway
enrichment analysis on genes that were concordant in expression with 3D
changes such as A/B compartment switches or TAD interactivity changes
revealed enrichment of previously identified key pathways including,
mTORC1, Wnt-beta catenin, and p53 signaling (Supplementary
Fig. [145]4h)^[146]9,[147]48. These data demonstrate that the
activation of these important pathways goes beyond genetic alterations
and validate these pathways as novel targets to prevent and treat
relapse.
Chromatin loop data reveals enriched factors at relapse
We next wanted to investigate the rewiring of chromatin loops and the
factors responsible for mediating such changes during selection from
therapy. We performed a looping analysis with FitHiC2^[148]49,[149]50.
Looping calls varied based on sequencing depth across the patient
cohort (Fig. [150]5a). We performed differential looping analysis to
identify loops that increased, decreased, or remained the same upon
relapse (Fig. [151]5b) and visually confirmed them using Aggregate Peak
Analysis (APA) (Supplementary Fig. [152]5a). We observed a strong
correlation of loop changes with both gene expression changes and
chromatin accessibility changes (Fig. [153]5c). Additionally, we
identified a positive association between looping changes and both
compartment and TAD interactivity changes (Supplementary Fig. [154]5b)
Fig. 5. Chromatin loop data reveals enriched factors at relapse.
[155]Fig. 5
[156]Open in a new tab
a Bar plot presenting number of loops called with Fit-Hi-C2 at
diagnosis and relapse (q < 0.01). b Bar plot presenting number of loops
increased, decreased or stable upon relapse (blue, red, gray
respectively). c Fit-Hi-C loop correlation with gene expression and
chromatin accessibility (left and right respectively). Significant
differences were calculated by unpaired one-sided t test comparing
genes or peaks from loops that increased or decreased to genes/peaks
from stable loops (*****p < 0.00001) from relapse to diagnosis of 12
individual patient pairs. Boxplots show medians (horizontal line in
each box), interquartile ranges (boxes), 1.5 interquartile (whiskers)
and each point represents a gene or peak. d Enrichment analysis with
LOLA of decreased and increased ATAC-seq peaks (left and right)
concordant with Fit-Hi-C loops and gene expression.
To identify key regulators of transcription attributed to loop changes,
we performed motif enrichment analysis with Locus Overlap Analysis
(LOLA)^[157]51 with ATAC-seq peaks overlapping concordant chromatin
loops. We identified several chromatin binding factors enriched in
regions with decreased chromatin accessibility and decreased chromatin
looping including RXR and EZH2 (Fig. [158]5d, left panel). We also
identified several enriched factors in regions with increased chromatin
accessibility and increased chromatin looping including FOS and NF-E2
(Fig. [159]5d, right panel).
H3K27ac HiChIP data reveals exclusive looping at stable chromatin
accessibility sites
We were also interested in the behavior of enhancer anchored loops upon
B-ALL relapse based on our previous work demonstrating the importance
of enhancers and superenhancers in relapse^[160]9. We performed H3K27ac
HiChIP with one patient pair, PASFIF, to evaluate H3K27ac anchored
loops. Looping analysis was performed with FitHiC2^[161]49,[162]50. To
examine the relationship between alterations in the H3K27ac chromatin
loops and chromatin accessibility, we categorized the H3K27ac chromatin
loops into those that were exclusively increased (exclusive to
relapse), exclusively decreased (exclusive to diagnosis), dynamically
increased (increased), dynamically decreased (decreased), or stable and
assessed chromatin accessibility changes. Interestingly, integration of
H3K27ac HiChIP with ATAC-seq data revealed significantly increased and
decreased exclusive loops at regions with stable chromatin
accessibility sites (Fig. [163]6a). This data suggests that significant
3D rewiring events can occur at stable chromatin accessibility sites
that would otherwise be entirely ignored if analysis is focused only on
sites with changes in chromatin accessibility using ATAC-seq.
Fig. 6. PASFIF H3K27ac HiChIP data reveals exclusive loops at stable ATAC-seq
peaks.
[164]Fig. 6
[165]Open in a new tab
a ATAC-seq and Fit-Hi-C loop association bar plot. b Enrichment
analysis with LOLA of increased, decreased, or stable ATAC-seq peaks
with exclusively decreased, dynamic decreased, stable, exclusively
increased, or dynamic increased loops. c IGV snapshot of the Fms
Related Receptor Tyrosine Kinase 1 (FLT1) gene locus presenting
ATAC-seq, RNA-seq, and H3K27ac Hi-ChIP and Hi-C virtual 4C tracks at
the TSS as a viewpoint. Gray boxes highlight regions of differential
contact frequency.
To expand on this finding, we performed enrichment analysis using
LOLA^[166]51 with specifically stable ATAC-seq peaks overlapping
concordant and exclusive H2K27ac anchored loops. We identified several
interesting chromatin binding factors enriched in regions with stable
chromatin accessibility and decreased enhancer-promoter looping
including NCOR1, CREBBP, and MEIS (Fig. [167]6b, left panel). We also
identified several factors enriched in regions with stable chromatin
accessibility and increased enhancer-promoter looping including PLU1,
THAP1, and PHF8 (Fig. [168]6b, right panel).
Lastly, we present an example of stable chromatin accessibility and
exclusively increased H3K27ac chromatin looping at the vascular
endothelial growth factor receptor 1 (FLT1) gene locus^[169]52 in
patient PASFIF (Fig. [170]6c). Genome browser tracks show stable
chromatin accessibility and increased gene expression. H3K27ac HiChIP
and Hi-C virtual 4 C tracks show differential contact frequencies
between diagnosis and relapse samples. FLT1 activation in ALL has been
shown to increase survival of blasts by influencing their localization
into the bone marrow and exit back into circulation^[171]52.
Functional validation of candidate genes suggests a role in drug resistance
In an effort to discover potential gene targets involved in drug
resistance or relapse that are downstream of 3D chromatin changes, we
identified genes that demonstrated 3D change from diagnosis to relapse
(A/B compartment, TAD interactivity, or chromatin loop), that were
concordant with gene expression, and were shared by at least 3
patients, indicating a possible major role in clonal evolution. This
resulted in a list of 361 upregulated and 387 downregulated genes. To
validate the expression changes, we first overlapped our 3D gene-based
lists with a list of the most significantly differentially expressed
genes from a meta-analysis of a large cohort of three microarray
datasets generated from pediatric B-ALL diagnosis/relapse
pairs^[172]53. Thirty-two percent (116 of 387) of the 3D mediated
downregulated and 17% (61 of 361) of the upregulated genes were also
differentially regulated in this much larger cohort of samples
(Fig. [173]7a; Supplementary Table [174]2) indicating a prominent role
of 3D reorganization in clonal evolution. Furthermore, using published
data from a genome wide CRISPR screen performed with the B-ALL cell
line REH and various chemotherapies used in pediatric B-ALL treatment
performed by Oshima et al.^[175]54, we found that 42% (153 of 361) of
the upregulated and 24% (93 of 387) of the downregulated genes were
within the top 10% of guides negatively (for up) or positively (for
down) selected from at least one of the drugs tested, suggesting a cell
intrinsic role in drug resistance (Supplementary Data [176]6).
Fig. 7. Functional validation of MN1 and ATXN1 suggests a role in drug
resistance.
[177]Fig. 7
[178]Open in a new tab
a Venn diagrams presenting overlap of 3D gene-based lists with a list
of the most significantly differential genes from a meta-analysis of
three microarray datasets. b Example shown for PASYFN of an A to B
compartment switch shared by 4 patient pairs at the MN1 gene locus
including TAD activity, V4C signal, A/B compartments, RNA-seq, and
ATAC-seq tracks. c Box plot demonstrating a decrease in MN1 gene
expression in tpm (left) and decrease in Cscore (right) in the 4
patient pairs with the A to B compartment switch. Significant
differences were calculated using paired t test (*p < 0.05). Boxplots
show medians (horizontal line in each box), interquartile ranges
(boxes), 1.5 interquartile (whiskers) and each point represents a
patient with pairs connected by lines. d Linear regression model
demonstrating positive correlation between compartment and gene
expression changes at the MN1 locus for patients with differential
expression absL2FC > 1.0 (n = 7). e Hi-C contact matrices presenting a
TAD increased at relapse in 5 of 12 patient pairs at the ATXN1 locus
(Diagnosis, Relapse, and L2FC (Relapse/Diagnosis) from top to bottom).
f Box plots demonstrating increased ATXN1 gene expression (top) and
increased intra-TAD activity (bottom) in the 5 patient pairs upon
relapse. Significant differences were calculated using paired t test
(*p < 0.05). Boxplots show medians (horizontal line in each box),
interquartile ranges (boxes), 1.5 interquartile (whiskers) and each
point represents a patient with pairs connected by lines. g Cell
viability (left) and apoptosis assays (right) demonstrating increased
sensitivity to treatment with Prednisolone in SEM cells overexpressing
MN1 compared to empty vector control (EV). h Cell viability (left) and
apoptosis assays (right) demonstrating increased resistance to
treatment with Prednisolone in 697 cells overexpressing ATXN1 compared
to EV. Cell viability graphs shown (g, h) are one experiment
representative of three independent experiments, each with technical
triplicates. The dots represent the mean of technical triplicates for
that experiment with standard deviation shown. Curve represents the
nonlinear fit of transformed X. The null hypothesis of one curve fits
all datasets was rejected with p < 0.001 (g, h). Apoptosis graphs shown
represents the mean and standard deviation of 4 (g) and 3 (h)
individual experiments. Statistical significance shown based on
unpaired t test with Welch correction with p value indicated.
Interestingly, at the MN1 locus we identified a recurrent A to B
compartment switch in 4 of the 12 patient pairs that was concordant
with decreased gene expression, chromatin accessibility, and virtual
4 C signal (Fig. [179]7b–d). Linear regression model revealed a
positive relationship between changes in compartment Cscore and gene
expression (R^2 = 0.7, p < 0.019, Fig. [180]7d). MN1 encodes a
transcriptional co-regulator involved in the co-activation of various
transcription factors^[181]55,[182]56. Notably, we also found a
recurrent increase in TAD interactivity with a concomitant increase in
gene expression at the ATXN1 locus in 5 of the 12 patient pairs
(Fig. [183]7e, f). Closer examination of the patient with the most
dramatic change in TAD interactivity, PASPNY, revealed concordant
changes in TAD interactivity, compartments, gene expression, chromatin
accessibility, and virtual 4 C signal (Supplementary Fig. [184]6a).
This was further validated by increased H3K27ac peak changes in many
patients from our previous study^[185]9. The Ataxin-1 (ATXN1) gene is a
chromatin binding factor that represses Notch
signaling^[186]57,[187]58.
We validated a functional role for both MN1 and ATXN1 in B-ALL cell
lines. The B-ALL cell line SEM was engineered to overexpress MN1 or EV
as control and verified via western blot (Supplementary Fig. [188]6b).
We observed a significant increase in sensitivity to prednisolone upon
overexpression of MN1 (Fig. [189]7g) but no difference was observed
with thiopurines, suggesting that loss of MN1 would provide an
advantage to cells undergoing steroid treatment. The B-ALL cell line
697 was engineered to overexpress ATXN1 or EV as control and verified
via western blot (Supplementary Fig. [190]6c). Overexpression of ATXN1
resulted in increased resistance to prednisolone compared to EV
(Fig. [191]7h), suggesting upregulation of ATXN1 leads to drug
resistance. There was no difference in IC50s for other chemotherapies,
including thiopurines or doxorubicin (Supplementary Fig. [192]6d). Note
when ATXN1 or MN1 expression was modified in other cell lines (REH,
RS4;11, SEM, 697), we did not observe changes in chemosensitivity
indicating cell context plays an important role, a finding consistent
with the fact that these changes were shared in a subset of patients.
Discussion
Relapsed pediatric B-ALL remains one of the leading causes of cancer
mortality in children. Previous studies have characterized the genetic,
epigenetic, and transcriptional landscapes of B-ALL. However, no one
has investigated B-ALL progression through the lens of 3D chromatin
organization. In this work, we investigate the dynamics of 3D chromatin
architecture in B-ALL progression by assessing 12 matched primary
pediatric leukemia specimens at diagnosis and relapse. We demonstrate
that 3D genome organization plays an additional critical level of
control in B-ALL progression.
The current clinical standard for the detection of genetic
abnormalities in B-ALL is fluorescence in-situ hybridization (FISH).
Despite its sensitivity, FISH can only detect known genetic aberrations
with the provision of specific probes. Hi-C in contrast is a powerful
tool that gives insight into previously unknown structural variants. We
sought to leverage Hi-C and investigate the stability of structural
variants between diagnosis and relapsed B-ALL. Robust SV analysis with
Hi-C data from diagnosis and relapse samples confirmed translocations
provided by conventional cytogenetics but also some previously
unidentified stable, diagnosis-specific, and relapse-specific
translocations. This data suggests that Hi-C data can be used as a
computationally less rigorous alternative to WGS methods to discover
cytogenetic abnormalities in ALL and other disease models. We confirmed
the presence of fusion transcripts such as BCR::ABL1 and ETV6::RUNX1 as
well as identified novel translocation-driven fusion transcripts
offering possible new avenues for therapeutic intervention especially
those that appear to be enriched at relapse.
In general, such translocations are thought to be an early event in
tumorigenesis and are maintained at diagnosis and relapse. We show a
striking number of additional translocations indicating a fundamental
role in tumorigenesis. The loss of translocations at relapse indicates
a dispensable role in clonal evolution while relapse-specific
translocations may provide additional fitness to the leukemic clone.
Interestingly, both PDGFRB and ZCCHC7 have been previously observed in
rearrangements with different loci in B-ALL. The detection of
ZCCHC7::PDGFRB at diagnosis would support the use of a tyrosine kinase
inhibitor, although in this case the clone appeared to be extinguished
with chemotherapy. MBTD1, malignant brain tumor domain containing 1,
has been identified as a part of a chimeric protein with ZMYND11 (zinc
finger MYND-type containing 11) that leads to active chromatin states
at stemness related genes in AML^[193]59. Herein, we show a novel
fusion transcript with MBTD1 and WDR89, a member of the WD40-repeat
proteins whose function is largely unknown but may play a role in
immunotherapy response^[194]60,[195]61. While the functional relevance
is beyond the scope of the present study, the fact that this novel
transcript was relapse-specific suggests a role in disease progression.
Performing Hi-C on a larger cohort of samples could enable the rapid
benchmarking of cytogenetic abnormalities associated with B-ALL
progression and would allow for an unbiased view of the genome to
identify unknown fusions that would otherwise be missed using current
methods. Lastly, while our studies show a number of diagnosis or
relapse specific translocations, all pairs had shared translocations
indicating that the origin of the relapsed clone was from a common
ancestral precursor including case SJPHALL005 where the initial
diagnostic clone was lymphoid and the patient relapsed with AML. Thus,
the subsequent relapse likely originated from a common ancestral
stem-like clone, a feature known to be associated with Ph+ ALL^[196]62.
Furthermore, there were several relapse-specific translocations that
were visually detected (but not called by Breakfinder) at diagnosis,
supporting the idea of clonal selection of a minor subclone at
diagnosis^[197]5,[198]7.
In addition to structural changes, we demonstrate significant A/B
compartment, TAD interactivity, and looping changes between diagnosis
and relapse that strongly correlated with expression and accessibility.
These findings support a prominent role of 3D architecture in
modulating transcription. We saw that the magnitude of these
architectural changes varied among pairs. This is consistent with our
previous findings showing variable degrees of epigenetic changes from
diagnosis to relapse, highlighting the plasticity of the chromatin
landscape under the selective pressures of therapy^[199]9. Analysis of
3D-regulated genes revealed previously discovered pathways associated
with drivers of clonal evolution. For example, activation of JAK/STAT
signaling has been implicated in leukemia progression and we observed a
recurrent B to A compartment switch at the JAK2 locus. Previous studies
have shown mutations as well as translocations lead to activated JAK2
in pediatric B-ALL, which could be targeted through JAK
inhibitors^[200]53,[201]54. Our Hi-C analysis demonstrates an
additional mechanism by which blasts may increase JAK2 thus extending
the use of JAK inhibitors at relapse.
Importantly our work identified additional novel pathways involved in
drug resistance specifically to prednisolone, a cardinal feature of ALL
at relapse. A meta-analysis of gene expression in B-ALL diagnosis and
relapse pairs confirms that MN1 expression is downregulated while ATXN1
is upregulated at relapse consistent with our findings^[202]53. MN1 is
a transcriptional co-activator and indirectly activates retinoic acid
receptor and vitamin D receptor (VDR) target genes^[203]63. Many
studies have confirmed an association between polymorphisms of VDR with
osteonecrosis, a side effect of glucocorticoids in ALL
therapy^[204]64,[205]65. Furthermore, retinoids have been found to
promote glucocorticoid-induced apoptosis of various T-cell lines. Thus,
downregulation of MN1 may impair apoptosis induced by retinoid
signaling under steroid treatment^[206]66. Conversely, overexpression
of ATXN1 resulted in increased resistance to prednisolone. ATXN1,
thought to function in extracellular matrix remodeling, has been shown
to regulate Notch and Capicua (CIC) controlled developmental
processes^[207]67. We were unable to replicate phenotypic changes in
multiple cell lines; however, we have previously demonstrated cell
context specific changes in drug response, which can be explained in
part due to 3D chromatin differences^[208]56.
By integrating looping changes with gene expression and chromatin
accessibility, we were able to identify potential regulators of these
changes. We found enrichment of sites for chromatin binding factors
shown to be key players in cancer. One such example is EZH2, which is
part of the polycomb repressive complex (PRC2) that is crucial for
proliferation and differentiation and plays a complex role in
cancer^[209]68. We previously demonstrated DNA hypermethylation at PRC2
target genes at relapse^[210]9, which would be consistent with
decreased expression, accessibility, and 3D looping interactions where
there are EZH2 motifs. Studies related to EZH2 and NSD2 mutations in
leukemia and myeloma have also revealed a complex interplay between
these two epigenetic modifiers^[211]69,[212]70 and perhaps
understanding the 3D chromatin changes associated with EZH2 could
provide more insight for the best method to target epigenetic
modifiers. The importance of epigenetic modifiers is strengthened
further by the enrichment of H3K27me3 and the role this mark has been
demonstrated to play in repressor hijacking events^[213]20. The
highest-ranked factor for decreased looping and expression was retinoid
X receptor (RXR), which is a nuclear receptor that interacts with
retinoic acid receptor alpha and recruits corepressors to chromatin to
regulate target genes, including genes involved in differentiation and
apoptosis^[214]71. Targeting these receptors in IKZF1 mutated BCR-ABL1
ALL has also been shown to reverse stem cell like phenotype^[215]72.
The highest-ranking factor motif for activated regions was nuclear
factor erythroid-2 (NF-E2). Overexpression in mouse models has been
shown to lead to increased stem and progenitor cells and spontaneous
development of AML^[216]73.
Interestingly, the H3K27ac HiChIP with patient PASFIF demonstrated a
significant amount of de novo loops appearing and disappearing at
regions of stable chromatin upon relapse. This finding indicates that
sites of stable chromatin accessibility provide invaluable information
on a significant number of 3D rewiring events that would otherwise be
entirely ignored if analyzes are directed to changes in chromatin
accessibility exclusively (e.g., ATAC-seq). For example, CREBBP motifs
were enriched in regions of decreased looping at relapse. CREBBP loss
of function mutations have previously been observed at relapse in a
significant number of patients and is associated with resistance to
glucorticoids^[217]74. Likewise, PHF8 motifs were enriched in regions
of increased looping and PHF8 expression is associated with ALL
progression through activation of the MEK/ERK pathway^[218]75.
Overall, this study demonstrates 3D genome organization as a critical
factor in reshaping the transcriptional output of leukemia cells under
the selective pressures of therapy. These processes appear to drive
pathways shown to be involved in drug resistance but also previously
undiscovered genes/pathways that may play a role in clonal evolution.
Our findings extend indications for inhibition of kinase and cytokine
receptor signaling in B-ALL using available agents, as well as,
highlight the potential of novel approaches to modulate MN1, ATXN1 and
FLT1 at relapse. The identification of common downstream targets among
patients indicates the convergence of drug resistance mechanisms and
has important clinical implications for possible future therapeutic
interventions.
Methods
Cell culture, drug preparation, viral preparation, immunoblotting, and
phenotypic assays were performed according to methods published
previously^[219]5–[220]7.
Cells and reagents
The B-lineage leukemia cell lines Reh (ATCC), RS4;11 (ATCC), RCH-ACV
(DSMZ), and SEM (kindly gifted by Jun Yang, St. Jude Children’s
Hospital) were grown in RPMI1640 medium. All media were supplemented
with 10% FBS, 1% penicillin/streptomycin under 5% CO2 at 37 °C. No cell
lines were used beyond passage 20. Each leukemia line was validated by
short tandem repeat analysis through IDEXX cellcheck9+ service. Cell
lines were routinely monitored for mycoplasma contamination by PCR
using ATCC Universal Mycoplasma Detection Kit (20-1012 K).
Experimental Procedures
Primary B-ALL patient samples
Cryopreserved paired diagnosis/relapse primary patient bone marrow
samples from individual patients were obtained from the Children’s
Oncology group (COG) ALL biorepository or St. Jude Children’s Research
Hospital^[221]21. All subjects provided consent for banking and future
research use of these specimens in accordance with the regulations of
the institutional review boards of all participating institutions.
Samples were thawed and stained for CD45 (PerCP-Cy™5.5 Mouse Anti-Human
CD45 clone HI30, BD Bioscience Cat# 564105) and CD19 (APC Mouse
Anti-Human CD19 Clone HIB19, BD Bioscience Cat# 555415). Leukemic
blasts (CD45dimCD19 + ) were sorted using SY3200 cell sorter (Sony
Technologies) in the NYU Cytometry and Cell Sorting Laboratory and then
immediately processed for downstream steps.
RNA-seq
Following FACS, 50,000-200,000 leukemic cells were immediately
resuspended in Buffer RLT plus BME and frozen on dry ice. RNA was
extracted using the QIAGEN RNeasy Micro Kit and quality was verified by
an Agilent Bioanalyzer 2100 (PICO chip). RNA-seq libraries were
generated by the NYU Genome Technology Center using the Low input
Clontech SMART-Seq kit and sequenced on the Illumina NovaSeq 6000.
ATAC-seq
Following FACS, 50,000 leukemic cells were immediately processed for
ATAC-seq by the NYU Genome Technology Center. ATAC libraries were
generated based on the protocol by Buenrostro et al.^[222]76 Briefly,
cells are resuspended in cold lysis buffer (10 mM Tris Cl, 10 mM NaCl,
3 mM MgCl[2], 0.1% (v/v) Igepal CA-630, pH 7.4) and centrifuged for
1 min at 500 × g. Nuclei were tagmented using Nextera (Illumina)
Tagmentation DNA buffer and enzyme. PCR amplification was performed as
described by Buenrostro et al.^[223]76 Libraries were sequenced on the
Illumina NovaSeq 6000.
Hi-C
Following FACS, leukemic cells were immediately fixed and crosslinked
with 2% formaldehyde at room temperature for 10 min. Glycine was added
to stop the reaction. Crosslinked cells were washed with PBS and then
frozen on dry ice. Crosslinked cell pellets were stored at −80 until
ready for processing. Hi-C libraries were generated using the Arima-HiC
kit (A410110) with Swift Biosciences® Accel-NGS® 2 S Plus DNA Library
Kit (Cat # 21024) and Indexing kit (Cat #) according to manufacturer’s
protocol. Libraries were amplified and quantified using KAPA Library
Amplification kit (Cat # KK2620). Final libraries were sequenced at the
NYU Genome Technology Center on the Illumina NovaSeq 6000.
H3K27ac HiChIP
HiChIP libraries were generated using the Arima-HiC^+ kit (A410232)
with Swift Biosciences® Accel-NGS® 2 S Plus DNA Library Kit (Cat #
21024) and Indexing kit (Cat # 26148) according to manufacturer’s
protocol with minor modifications. After in-situ proximity ligation,
samples were sonicated on Covaris LE 220 at 4 C, 300 W, 15% duty
factor, 200 cycles per burst, for 300 s. Samples are then precleared
with protein G beads for an hour, and subsequently incubated with 5 μg
of H3K27AC antibody (ab4729, Lot:GR3231988) at 4 C overnight. After IP,
libraries were amplified and quantified using KAPA Library
Amplification kit (Cat # KK2620). Final libraries were sequenced at the
NYU Genome Technology Center on the Illumina NovaSeq 6000.
Drug preparation
Stock solutions of Doxorubicin (Dox) (Sigma-Aldrich, St. Louis, MO)
were prepared in double-distilled water, Prednisolone (Pred)
(Pharmacia, St. Paul, MN) in 0.9% NaCl (Saline), 6-thioguanine (6-TG)
and 6-mercaptopurine (6-MP) in 0.1 M NaOH. Drugs were diluted and added
to cell culture media at indicated concentrations. Vehicle toxicity was
tested at highest drug dose used to ensure no impact on viability.
Generation of overexpression cell lines
The open reading frames of MN1 or ATXN1 were PCR amplified from the
pcDNA3.1 plasmids purchased from Genscript (OHu18815D and OHu26181D,
respectively) and cloned into pLenti-R1R2-DEST (kindly gifted by
Kenneth Scott, Baylor College of Medicine) using Invitrogen’s
Gateway(R) Technology. HEK293T cells were then transfected with these
constructs (empty vector or gene of interest) along with packaging
plasmids coding for VSV-G, Gag-pol, and Rev using lipofectamine 2000
(Invitrogen, Carlsbad, CA). Viral supernatant was used to spin infect
B-ALL cell lines with 8 µg/ml Polybrene (Millipore). After 72 h,
infected cell lines were selected with 0.75–3 µg/ml of Puromoycin.
Cytotoxicity assays
Cells were plated at a density of 60,000–80,000 cells/well and exposed
to chemotherapy agents commonly used in the treatment of pediatric
B-ALL for 4 days. Cell viability was measured using CellTiter-Glo®
Luminescent Cell viability Assay (Promega) according to the
manufacturer’s protocol. The luminescence was read using a PerkinElmer
EnSpire multimode plate reader. Luminescent readings were normalized to
the untreated control for each cell line to account for any differences
in baseline growth kinetics or survival. Cytotoxicity percentages were
plotted using nonlinear regression with a four parametric variable
slope with log transformed x concentrations with a bottom constraint
equal to zero on GraphPad Prism 7.2 (GraphPad Prism Software Inc., La
Jolla, CA). For apoptosis, following treatment, cells were stained with
Annexin V and DAPI and analyzed via flow cytometry using the BD LSR II
cytometer within the NYU Cytometry and Cell Sorting core. The percent
Annexin V+ cells were determined for percent apoptotic using FlowJo
Software. The percentage of apoptotic cells was compared between empty
vector control and MN1 or ATXN1 overexpressing cells using an unpaired
t test. Each experiment was plated in triplicate and repeated at least
three times.
Computational analysis
RNA-seq analysis
Patient RNA-seq fastq files were processed in single replicates using
the route “rna-star” and “rna-star-groups-dge” from the Slide-n-Seq
(sns) pipeline: [224]https://igordot.github.io/sns/. Processing steps
include alignment of paired-end reads to the human reference genome
(hg19) using the STAR aligner with default parameters^[225]77. Counts
were obtained using featureCounts^[226]78. Bigwig tracks were obtained
for visualization on individual samples using deeptools
(v3.1.0)^[227]79. Downstream analysis including normalization and
differential expression analysis was performed using DESeq2^[228]80.
Genes were categorized as differentially expressed if abs(L2FC > 0.58,
p value < .05). Pathway analysis was performed using enrichR^[229]46.
ATAC-seq analysis
Patient ATAC-seq fastq files were processed in two replicates using the
route “atac” from the Slide-n-Seq (sns) pipeline:
[230]https://igordot.github.io/sns/. Processing steps included aligning
paired-end reads to the human reference genome(hg19) with
Bowtie2(v2.3.4.1)^[231]81. Reads with a mapping quality <30 were
removed. Duplicated reads were removed using Sambamba (v0.6.8)^[232]82.
Remaining reads were analyzed by applying the peak-calling algorithm
MACS2(v2.1.1)^[233]83. Bigwig tracks were obtained for visualization on
individual samples using deeptools (v3.1.0)^[234]79. Differential
ATAC-seq peak analysis was performed using DiffBind^[235]84. Nearest
genes were annotated using ChIPseeker^[236]85. Enrichment analysis of
genomic regions sets was performed using Bioconductor package LOLA
(Locus overlap analysis or enrichment of genomic ranges; R package
version 1.24.0) with RStudio (v3.6.1) with the hg19 LOLA core
database^[237]51. LOLA core is curated from many sources including TF
binding sites from Encode and epigenome databases from Cistrome.
Hi-C and H3K27ac HiChIP analysis
Raw Hi-C sequencing data was processed with the hic-bench
platform^[238]24. Cell line Hi-C data was processed as single
replicates. Data was aligned against the human reference
genome(GRCh37/hg19) with bowtie2(version 2.3.1)^[239]81. The reads used
for downstream analyzes were filtered for by the GenomicTools tools-hic
filter command in the hic-bench platform using default parameters. The
GenomicTools tools-hic filter command discards reads including
multi-mapped reads (“multihit”), read-pairs with only one mappable read
(“single sided”), duplicated read-pairs (“ds.duplicate”), read-pairs
with a low mapping quality of MAPQ < 20, read-pairs resulting from
self-ligated fragments (together called “ds.filtered”), and short range
interactions resulting from read-pairs aligning within 25 kb
(“ds.too.short”). Downstream analysis was performed with the accepted
intra-chromosomal read- pairs (“ds.accepted intra”). The number of
accepted intra-chromosomal read-pairs varied between ~40 and ~140
million for all samples (Chapter 1; Fig. [240]2). Hi-C interaction
matrices were generated for each chromosome separately using the
hic-bench platform at 40 kb resolution. Filtered read counts were
normalized by iterative correction and eigenvector decomposition
(ICE)^[241]86. To account for variances in read counts of more distant
loci, distance normalization for each chromosome matrix was performed.
Translocation calling from patient Hi-C data
Hi-C breakfinder^[242]27 and HiNT^[243]28 were used to call previously
identified, as well as, novel intra-chromosomal and inter-chromosomal
translocations from the matched B-ALL diagnosis/relapse patient Hi-C
data. Hi-C breakfinder required 3 input files including a bam file, an
inter-chromosomal expectation file, and an intra-chromosomal
expectation file. The bam file was generated by hic-bench and the two
expectations files for hg19 were provided on the Dixon lab GitHub page:
[244]https://github.com/dixonlab/hic_breakfinder. Hi-C breakfinder
generated lists of structural variant predictions at different
resolutions including 1 Mb, 100 kb, and 10 kb. For the purposes of
comparing translocations between Hi-C breakfinder and HiNT, the 100 kb
resolution was used. Translocations were called at 100 kb resolution
using HiNT through the “translocations” step within the hic-bench
platform^[245]24. EagleC^[246]87 (version 0.1.9) was used to call copy
number inferences (deletions and duplications) at three resolutions
5 kb,10 kb,50 kb. High-confidence calls were determined by setting the
probability cutoff to 0.95.
Fusion transcript analysis
Fusion transcript analysis was performed using STAR-fusion:
[247]https://github.com/STAR-Fusion/STAR-Fusion/wiki^[248]88.
STAR-fusion required paired-end RNA-seq fastq files and a reference
genome (hg19) in order to identify candidate fusion transcripts.
Hi-C contact matrix visualization
To visualize Hi-C contact matrices, ICE normalized Hi-C contact
matrices for the corresponding chromosome were loaded and normalized by
the total number of intra-chromosomal interactions for Diagnosis and
Relapse samples. The log2FC Hi-C contact matrix was produced by
applying the log2 function on the division product of the Relapse Hi-C
table by the Diagnosis Hi-C table.
To visualize translocations, Hi-C contact matrices were generated in
the.hic format using the “tracks” step within the hic-bench
platform^[249]24. Hi-C contact matrices were visualized using
Juicebox^[250]29.
Hi-C contact matrix reconstruction
Hi-C contact matrices were reconstructed to demonstrate
inter-chromosomal translocations at 20 kb resolution using
NeoLoopFinder^[251]25. NeoLoopFinder required.cool files which were
generated using the “tracks” step within the hic-bench
platform^[252]24. NeoLoopFinder also required files containing a list
of structural variations that were generated at 10 kb resolution using
Hi-C breakfinder^[253]27 as described above.
A/B compartments analysis
A/B compartments analysis was performed using the “compartments” and
“compartments-stats” steps within the hic-bench platform^[254]24. The
Cscore tool algorithm was used to assign active (A) and inactive (B)
compartments^[255]34. For a bin to be considered a switch from A to B
or from B to A, the compartment score sign had to flip in sample 2
(Relapse) when compared with the reference sample (sample 1 or
Diagnosis). The absolute difference between the compartment scores had
to be higher than the cutoff (default: 1.2). The difference was
computed as a relative delta:
[MATH: delta=(Y−X)/ab
s(Y)
#deltavaluecalculation
:MATH]
X = compartment score of bin in sample 1 (reference)
Y = compartment score of bin in sample 2
TAD interactivity analysis
TAD interactivity was assessed using the “domains” and “domains-diff”
steps within the hic-bench platform^[256]24. The “domains” step uses
the hic-ratio algorithm for TAD calling developed within hic-bench by
previous Tsirigos lab member Haris Lazaris in which the average of the
normalized interaction scores is calculated for all interactions taking
place within a particular TAD. The “domains-diff” step assesses TAD
interactivity alterations and was developed by previous lab members
Sofia Nomikou and Andreas Kloetgen^[257]17,[258]24. To identify TADs
with differential interactivity between Relapse and Diagnosis, we used
the TADs identified in the Diagnosis sample as a reference to identify
common TADs. Once mean TAD interactivity values were obtained from the
Hi-C data, a wilcoxon two-sided rank sum non-parametric test was
performed to determine the p-values for each TAD. Multiple testing was
used to correct these p-values by adjusting to the total number of
TADs. Lastly, the log2 fold change (log2FC) of intra-TAD activity value
was calculated between the samples. TAD interactivity alterations were
categorized as significant if abs(L2FC > 0.25) and FDR < 0.01.
Looping analysis
Looping analysis was performed using the “loops” and “loops-diff” steps
within the hic-bench platform^[259]24. Chromatin loops from Hi-C and
H2K27ac HiChIP data were called using FitHiC2^[260]49,[261]50. Loops
were classified as diagnosis-specific, relapse-specific, or common. The
common loops were also classified as increased, decreased, or stable
using a log2 fold change cutoff (abs(L2FC) > 1). A loop was considered
sample-specific if it was found to be significant only in sample ‘x’
but not in sample ‘y’, when using a q-value cutoff of ‘qcut1’ (default:
0.01) as well as the loop shouldn’t be found significant in sample ‘y’
when using a more relaxed cutoff of ‘qcut2’ (default: 0.1). This is to
avoid considering a loop sample-specific when the significance was
relatively close in both samples but in one sample it was lower than
the cutoff (significant) and in the other sample it was slightly higher
than the cutoff (not significant). These cases were classified as
common loops.
Integration of patient Hi-C data with other datasets
To show the correlation between the Hi-C data and the various
sequencing datasets, the peaks obtained from ATAC-seq, and the genes
obtained from RNA-seq were mapped to the AB, BA, and stable regions or
the increased, decreased, or stable TADs using the “bedtools intersect”
command^[262]89,[263]90. We calculated the peak intensity fold change
or gene expression fold change for peaks or genes assigned to a
compartment region or TAD activity region between diagnosis and relapse
and showed the correlation with boxplots or bar plots. Statistical
significance was assessed using a paired two-sample t test.
Reporting summary
Further information on research design is available in the [264]Nature
Portfolio Reporting Summary linked to this article.
Supplementary information
[265]Supplementary Information^ (1.8MB, pdf)
[266]Peer Review File^ (1MB, pdf)
[267]41467_2024_51492_MOESM3_ESM.pdf^ (77KB, pdf)
Description of Additional Supplementary Files
[268]Supplementary Data 1^ (550.2KB, csv)
[269]Supplementary Data 2^ (10.9KB, csv)
[270]Supplementary Data 3^ (10.3KB, xlsx)
[271]Supplementary Data 4^ (1.2MB, xlsx)
[272]Supplementary Data 5^ (123.9KB, xlsx)
[273]Supplementary Data 6^ (12.6KB, xlsx)
[274]Reporting Summary^ (93.4KB, pdf)
Acknowledgements