Abstract
MYC is a well-characterized oncogenic transcription factor in prostate
cancer, and CTCF is the main architectural protein of three-dimensional
genome organization. However, the functional link between the two
master regulators has not been reported. In this study, we find that
MYC rewires prostate cancer chromatin architecture by interacting with
CTCF protein. By combining H3K27ac, AR and CTCF HiChIP
profiles with CRISPR deletion of a CTCF site upstream of the MYC gene, we
show that MYC activation leads to profound changes of CTCF-mediated
chromatin looping. Mechanistically, MYC colocalizes with CTCF at a
subset of genomic sites, and enhances CTCF occupancy at these loci.
Consequently, the CTCF-mediated chromatin looping is potentiated by MYC
activation, resulting in the disruption of enhancer-promoter looping at
neuroendocrine lineage plasticity genes. Collectively, our findings
define the function of MYC as a CTCF co-factor in three-dimensional
genome organization.
Subject terms: Prostate cancer, Epigenomics, Chromatin structure
__________________________________________________________________
The functional link between MYC and CTCF in prostate cancer remains to
be investigated. Here, the authors highlight the role of MYC in
rewiring chromatin architecture by interacting with CTCF protein.
Introduction
The oncoprotein MYC regulates various biological activities that
contribute to tumorigenesis. Numerous studies have described the
regulation of MYC expression by distal enhancer dynamics^[64]1–[65]3,
but the function of MYC in three-dimensional (3D) genome organization
is largely unexplored. As a master transcription factor (TF), MYC
protein has an intrinsically disordered transactivation domain and can
form phase-separated condensates with the MED1 protein^[66]4. MYC is
also required for the chromatin decompaction and loop formation during
B cell activation^[67]5. A recent study suggests MYC overexpression in
U2OS cells increases global chromatin interactions at super-enhancers
and MYC binding sites^[68]6. Although these clues imply MYC
functionalities in 3D genome organization, the molecular mechanisms
have not been specified yet.
CTCF is a principal 3D chromatin architecture organizer, which
functions together with the cohesin complex to establish chromatin
loops and structure topologically associating domains (TADs)^[69]7. CTCF
shows pleiotropic functions in gene expression regulation, such as
insulating enhancers/promoters within CTCF-CTCF loops from outside
regulatory elements (REs) or facilitating enhancer-promoter interactions
by colocalizing with other TFs^[70]7,[71]8. Despite the importance of
CTCF in the 3D genome, only a few proteins, including cohesin^[72]9,[73]10
and MAZ^[74]11,[75]12, have been identified to stabilize the
CTCF-mediated chromatin contact. Considering the widespread function of
CTCF sites in transcription regulation, functional interactions between
CTCF and other master regulators could be potentially exploited by
cancer cells as a strategy to fine-tune oncogenic gene expression.
Here, we show that MYC reshapes the chromatin architecture of prostate
cancer (PCa) cells through interacting with CTCF protein. By defining
H3K27ac, AR and CTCF HiChIP profiles and CRISPR deletion of a CTCF site
upstream of MYC, we reveal MYC activation results in genome-wide
changes of CTCF looping. Mechanistically, MYC interacts with CTCF
protein and increases CTCF chromatin binding affinities at MYC/CTCF
common sites. Utilizing multi-omics approaches in an ectopic MYC
expression model, we reveal that MYC represses a subset of target genes
involved in neuroendocrine lineage plasticity by potentiating
CTCF-mediated chromatin looping. Taken together, this study unravels
the role of MYC in 3D genome architecture, promoting CTCF-mediated
chromatin interactions to regulate PCa gene expression.
Results
3D interaction mapping of regulatory elements in PCa
To systematically characterize RE-associated chromatin interactions in
PCa, we performed H3K27ac HiChIP in PCa cell lines 22Rv1 and V16A under
full FBS medium culture. To investigate androgen-induced chromatin
dynamics, we also performed H3K27ac, AR and CTCF HiChIP in VCaP cells
under charcoal-stripped FBS medium with and without androgen (DHT)
stimulation (Fig. [76]1a). To gain mechanistic insights into PCa 3D
genome organization, we further generated H3K27ac and CTCF HiChIP data
in CRISPR-Cas9-mediated genomic deletion and MYC-overexpressed 22Rv1
cells. In total, 31 HiChIP libraries, including replicates, were
quantified via paired-end (PE) sequencing at an average of 215.5 million
(M) ± 94.0 M PE reads per sample.
Fig. 1. Profiling of regulatory element interactome in PCa cell lines.
a Overview of HiChIP experiments and analyses in PCa cell lines. b The
number of significant loops (FDR < 0.05) in VCaP HiChIP assays. From
top to bottom, n = 74514, 61748, 46819, 9751, 24982, 24737,
respectively. c Strength of H3K27ac and AR loops anchored at TSSs of
all protein-coding genes. The frequency curves showed the normalized
read number (loop strength) distribution at anchors distal to TSSs. The
strength of each loop was normalized by the number of total ‘cis-far’
unique valid pairs. P-values were determined by paired t-test. For
TSSs, n = 19962. d Strength of H3K27ac loops anchored at genes
downregulated or upregulated by 2 h DHT treatment. The bar plot
summarized the strength of loops anchored at gene TSSs. P-values were
determined by paired t-test. For TSSs of upregulated and downregulated
genes, n = 350 and 205, respectively. e Strength of AR loops anchored
at genes downregulated or upregulated by 2 h DHT treatment. The bar
plot summarized the strength of loops anchored at gene TSSs. P-values
were determined by paired t-test. For TSSs of upregulated and
downregulated genes, n = 350 and 205, respectively. f The expression of
IL20RA was upregulated from the early time point (2 h) after DHT
stimulation. n = 2. Data represent means ± SD. g For the IL20RA gene,
AR loops were boosted as early as 2 h after DHT stimulation. h 3C–qPCR
assay of the IL20RA genomic region. The data represents relative
frequencies of interaction between the anchor region near the IL20RA
TSS and selected PstI digestion sites (circles). n = 3. Data represent
means ± SD. P-values were determined by Student’s t-test. *P < 0.05;
**P < 0.01. Source data are provided as a Source Data file.
The HiChIP technique identifies RE-associated interactions by
antibody-mediated immunoprecipitation on the Hi-C ligated
chromatin^[79]13. Therefore, ChIP peaks-based loop calling is crucial
to capture reliable RE-associated interactions. Thus, we first employed
the HiCUP-hichipper pipeline to identify ChIP peaks-based loops and
then calculated the statistical significance of looping strength by
modeling the distribution of mated PE loop reads at all possible
anchors (Fig. [80]S1a, b). The HiCUP quality control (QC) reports
showed a very high proportion of valid pairs in our HiChIP libraries
during the filtering step (Fig. [81]S1c). The de-duplication QC
suggested most of the valid pairs were unique di-tags, which are mainly
the expected cis-far di-tags (>10 Kbp), indicating the high quality of
our HiChIP libraries (Fig. [82]S1c). We identified 74,514 high
confidence H3K27ac loops (FDR < 0.05) in androgen-deprived VCaP cells,
and found the loop numbers dropped to 61,748 and 46,819 after 2 h and
24 h DHT treatment, respectively (Fig. [83]1b). In sharp contrast, the
numbers of AR loops increased from 9751 under androgen-deprivation
condition to 24,982 and 24,737 after 2 h and 24 h DHT treatment,
respectively (Fig. [84]1b), suggesting the involvement of activated AR
sites in 3D chromatin interaction. Overall, 37.0% of H3K27ac anchors
overlap with 50.1% of AR anchors (Fig. [85]S1d), indicating most
H3K27ac loops are independent of AR. The opposite effects of androgen
on H3K27ac-specific and AR-specific looping are consistent with
AR-mediated co-factor redistribution mechanisms^[86]14.
H3K27ac and AR loops play distinct roles in androgen-induced gene regulation
To better understand the function of RE-associated interactions in
androgen-driven transcription, we integrated H3K27ac and AR HiChIP with
RNA-Seq data in VCaP cells with vehicle, 2 h or 24 h DHT treatment. The
number of H3K27ac loops with 3–10 mated PE reads showed moderate and
pronounced reductions after 2 h and 24 h androgen treatments,
respectively, but the number was stable for H3K27ac loops with more
reads (Fig. [87]S1e). Conversely, we observed a marked elevation in the
number of all types of AR loops after 2 h and 24 h DHT treatments
(Fig. [88]S1f). Consistent with the loop count changes, the average
strength of H3K27ac loops anchored at expressed genes was diminished
after 24 h DHT treatment (P < 2.2e–16), while the strength of AR loops
was markedly elevated with both 2 h (P = 3.3e-09) and 24 h
(P = 3.4e-09) DHT treatments (Fig. [89]1c, [90]S1g and [91]S1h). We
next characterized the strength dynamics of loops anchored at
differentially expressed genes. For the DHT-upregulated genes, H3K27ac
loop strength showed no evident change at 2 h (P = 0.10) or 24 h
(P = 0.29) after DHT treatment, but AR-mediated interactions were
markedly reinforced by both 2 h (P = 1.9e-06) and 24 h (P = 3.3e-06)
DHT treatment (Figs. [92]1d, [93]1e, [94]S1h and [95]S1i). For the
downregulated genes, H3K27ac loop strength was slightly attenuated by
2 h (P = 1.3e-10) DHT treatment and further reduced by 24 h
(P = 1.4e-13) DHT treatment (Fig. [96]1d and [97]S1h). AR-associated
interaction dynamics were negligible at DHT-downregulated genes,
especially for genes suppressed by 2 h DHT treatment (Fig. [98]1e and
[99]S1i), suggesting that the direct function of AR in acute androgen
stimulation is transcriptional activation, not repression. IL20RA and
CD82 are example genes upregulated by 2 h and 24 h DHT treatments,
respectively (Fig. [100]1f and [101]S2a). In line with their gene
expression dynamics, AR-associated enhancer-promoter loops emerged from
2 h at IL20RA gene and 24 h at CD82 gene after DHT treatment, while
H3K27ac loops at the two genes were not significantly changed by DHT
treatment (Fig. [102]1g and [103]S2b). Consistently, a quantitative 3C
assay revealed a significant increase of chromatin interactions between
IL20RA promoter and enhancers after 24 h DHT treatment (Fig. [104]1h).
On the contrary, H3K27ac loops were diminished from 2 h at FOXN2 gene
and 24 h at BTF3 gene after DHT treatment, in line with transcriptional
repression of these genes (Fig. [105]S2c-f). No AR loops were observed
at FOXN2 and BTF3 genes, supporting the indirect repressive function of
AR (Fig. [106]S2e and [107]S2f). These results indicate AR-associated
chromatin contacts are significantly strengthened upon androgen
stimulation.
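The loop-strength comparisons above rest on two steps described in the Fig. 1 legend: normalizing each loop's read count by the library's total cis-far unique valid pairs, and comparing matched TSSs between conditions with a paired t-test. The following is a minimal sketch of those two steps, not the pipeline used in this study. The function names, the per-million scaling factor and the toy numbers are all assumptions made for illustration.

```python
import math
from statistics import mean, stdev

def normalize_strength(loop_reads, total_cis_far_pairs, scale=1e6):
    """Loop reads per `scale` cis-far unique valid pairs.

    The per-million scale is an assumed convention, not taken from the paper.
    """
    return loop_reads * scale / total_cis_far_pairs

def paired_t_statistic(before, after):
    """Paired t statistic for matched per-TSS loop strengths (after - before)."""
    if len(before) != len(after) or len(before) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [a - b for a, b in zip(after, before)]
    sd = stdev(diffs)
    if sd == 0:
        raise ValueError("all paired differences are identical")
    return mean(diffs) / (sd / math.sqrt(len(diffs)))

# Hypothetical toy data: summed AR loop reads at four TSSs in two libraries.
veh_reads = [4, 6, 5, 7]
dht_reads = [9, 12, 9, 15]
veh_depth, dht_depth = 80_000_000, 100_000_000  # made-up cis-far pair counts

veh = [normalize_strength(r, veh_depth) for r in veh_reads]
dht = [normalize_strength(r, dht_depth) for r in dht_reads]
t_stat = paired_t_statistic(veh, dht)  # positive: loops stronger after DHT
```

A P-value would follow from a t distribution with n - 1 degrees of freedom; with SciPy available, `scipy.stats.ttest_rel` returns both the statistic and the P-value in one call.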
We previously reported that redistribution of cofactors mediates
androgen-activation of AR-binding sites and repression of
H3K27ac-binding enhancers^[108]14. Similar mechanisms may engage with
AR-associated chromatin contacts (enriched at androgen-stimulatory
loci) versus the overall H3K27ac-associated chromatin contacts
(enriched at androgen-repressive loci). To identify TFs involved in the
androgen-induced transcription repression, the DHT-suppressed genes
were classified into four categories based on the overlap between gene
promoters and H3K27ac/AR loops (Fig. [109]S2g). We reasoned that the
H3K27ac loops anchored at DHT-suppressed gene promoters without AR
loops were related to other TFs (not AR) functioning in transcription
repression. To identify the involved TFs, we compared distal anchors of
these H3K27ac loops with TCGA PCa ATAC-Seq peaks, and subjected the
overlapping genomic regions to motif enrichment analysis. As expected,
AR and FOXA1 motifs were found among the top-ranked motifs enriched in
the distal anchors of H3K27ac+/AR+ loops anchored at DHT-activated gene
promoters (Fig. [110]S2h, Supplementary data [111]1). CTCF and KLF4
motifs were significantly enriched in both DHT-activated gene
promoter-related distal H3K27ac+/AR+ anchors and DHT-suppressed gene
promoter-related distal H3K27ac anchors, indicating their universal
function in transcription regulation (Fig. [112]S2h, Supplementary
data [113]1 and [114]2). Among the top 5 enriched motifs, ERG, NFYA,
and ETV4 were specifically associated with H3K27ac loops anchored at
DHT-suppressed gene promoters (Fig. [115]S2h, Supplementary
data [116]2), suggesting these TFs play important roles in the
regulation of androgen-induced transcription repression. Indeed, ERG
was reported to repress AR-mediated transactivation^[117]15. Together,
these analyses pinpointed the potential TFs participating in the
AR-independent enhancer-promoter interactions of androgen-repressed
genes.
Cell-type-specific CTCF loops regulate the expression of PCa-related genes
In both AR+/H3K27ac+ and H3K27ac-only loop anchors, CTCF was the
top-enriched motif, suggesting a fundamental role of CTCF in the
establishment of RE interactions in PCa. To further delineate the
function of CTCF in PCa, we generated CTCF-mediated chromatin contact
maps by HiChIP in VCaP and 22Rv1, two widely used PCa cell lines
derived from a vertebral metastasis of a prostate cancer patient and a
human prostatic carcinoma xenograft, respectively. Using merged CTCF
ChIP-Seq peaks as anchors, we identified 127,197 and 114,435 CTCF loops
in two VCaP replicates, and 159,462 and 164,907 in two 22Rv1
replicates, respectively, which is comparable to the loop numbers of
published CTCF HiChIP data^[118]16,[119]17 (Fig. [120]S3a). The
replicates of the same cell line are highly correlated (r = 0.9318 for
VCaP replicates; r = 0.9271 for 22Rv1 replicates), while the samples
between different cell lines were less correlated (r = 0.6958 and
0.6720), indicating the reliability of these CTCF HiChIP datasets
(Fig. [121]2a–c and [122]S3b). Only 44.92–63.63% of the CTCF HiChIP
loops were common between VCaP and 22Rv1 cells (Fig. [123]S3c), while
75–78% of CTCF ChIP-Seq peaks were shared between VCaP and 22Rv1 cells
(Fig. [124]S3c), suggesting CTCF HiChIP captures a higher cell-type
heterogeneity compared with CTCF ChIP-Seq. The common CTCF loops
displayed a higher proportion of strong interactions (more than 10 PE
mated reads) than cell-type-specific CTCF loops (Fig. [125]S3d). In
line with CTCF HiChIP results, Hi-C interaction signal of common CTCF
loops was much stronger than cell-type-specific CTCF loops
(Fig. [126]S3e). The higher cell-type diversity of CTCF loops than
peaks was further validated by independent CTCF ChIP-Seq and HiChIP
data sets of GM12878 and HeLa cells (Fig. [127]S3f)^[128]16–[129]18.
The CTCF looping data in GM12878 and HeLa also confirmed that common
CTCF loops were overall stronger than cell-type-specific CTCF loops
(Fig. [130]S3g). The high-level heterogeneity of CTCF looping in PCa
cell lines suggests that 3D chromatin architecture is shaped by
additional factors.
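The replicate correlations and the loop-sharing percentages reported above can be illustrated with a short sketch. It assumes that loops are called on a common set of merged anchors, so that a loop can be keyed by its two anchor coordinates, and that a loop absent from one library contributes a strength of zero. Both are assumptions for illustration, as are the function names and the toy loops.

```python
import math

def pearson_r(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sx * sy)

def shared_percentages(loops_a, loops_b):
    """Percent of each loop set that is also present in the other set."""
    shared = len(loops_a & loops_b)
    return 100.0 * shared / len(loops_a), 100.0 * shared / len(loops_b)

# Hypothetical loops keyed by (chrom, left anchor start, right anchor start),
# with mated PE read counts as values.
rep1 = {("chr8", 100, 900): 12, ("chr8", 200, 700): 5,
        ("chr1", 50, 400): 20, ("chr2", 10, 90): 3}
rep2 = {("chr8", 100, 900): 10, ("chr8", 200, 700): 6,
        ("chr1", 50, 400): 22, ("chr3", 5, 60): 4, ("chr4", 1, 9): 2}

# Correlate strengths over the union of loops; missing loops count as 0.
union = sorted(set(rep1) | set(rep2))
r = pearson_r([rep1.get(k, 0) for k in union], [rep2.get(k, 0) for k in union])
pct_rep1, pct_rep2 = shared_percentages(set(rep1), set(rep2))
```

Because the shared count is divided by a different total for each library, one comparison yields two percentages. This is one reason an overlap can be reported as a range rather than a single value.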
Fig. 2. Cell-type-specific CTCF looping regulates PCa-associated genes.
a Scatter plots showing the association of CTCF loop strength between
two replicates in VCaP and 22Rv1, respectively. n = 167054 and 218739
from left to right, respectively. P-values were calculated using
Pearson correlation. b Scatter plots showing the association of CTCF
loop strength between VCaP replicate #1 and 22Rv1 replicate #1.
n = 217761. P-value was calculated using Pearson correlation. c Scatter
plots showing the association of CTCF loop strength between VCaP
replicate #2 and 22Rv1 replicate #2. n = 210816. P-value was calculated
using Pearson correlation. d Boxplots showing the strength and
distances of CTCF loops classified based on cell-type specificity and
H3K27ac status of two anchors from the same loop. Box plots indicating
the mean (middle line), 25th and 75th percentile (box) and 10th and
90th percentile (whiskers), and n = 24661, 24708, 8804, 17267, 21223,
8967, 12496, 11277, 3348 for boxes from left to right, respectively.
P-values were calculated by Wilcoxon signed-rank test. e The number of
genes within CTCF loops. The gene numbers were normalized by CTCF loop
distances. Box plots indicating the mean (middle line), 25th and 75th
percentile (box) and 10th and 90th percentile (whiskers), and
n = 24661, 24708, 8804, 17267, 21223, 8967, 12496, 11277, 3348 for
boxes from left to right, respectively. P-values were calculated by
Wilcoxon signed-rank test. f Average log2 expression fold changes
(22Rv1 vs. VCaP) of genes within CTCF loops. Box plots indicating the
mean (middle line), 25th and 75th percentile (box), and 10th and 90th
percentile (whiskers), and n = 24661, 24708, 8804, 17267, 21223, 8967,
12496, 11277, 3348 for boxes from left to right, respectively. P-values
were calculated by Wilcoxon signed-rank test. g KEGG pathway enrichment
analysis of genes annotated to the anchors of CTCF loops. The proximity
of genes to anchors is restricted to 3Kb. h Upper: A cell-type-specific
CTCF binding insulates the interaction between a distal enhancer and
TMC5 promoter by forming CTCF-CTCF loops. The 22Rv1-specific CTCF
binding was highlighted within a red dashed rectangle. Bottom: TMC5
expression levels in VCaP and 22Rv1 cells based on RNA-Seq data. i
Pearson correlation coefficients between DNA methylation ratio of CpGs
and CTCF binding affinity at this CTCF site in 20 ENCODE cell lines.
Each circle denotes a CpG within this CTCF site. j Scatter plot showing
a positive correlation between average DNA methylation levels at this
CTCF site and TMC5 expression levels in Changhai 2020 data set. The
error band indicates SEM. P-value was calculated using Pearson
correlation.
CTCF insulates RE interactions by forming CTCF-CTCF loops anchored at
TAD boundaries, and can also regulate RE interactions by directly
binding to active REs. We observed a high ratio of H3K27ac modification
in both common and cell-type-specific CTCF loop anchors, and the
H3K27ac-positive CTCF anchors contained a higher ratio of promoter
regions than H3K27ac-negative CTCF anchors, indicating the importance
of CTCF for RE activation (Fig. [133]S3h, [134]S3i). Within common
loops, the H3K27ac-marked CTCF loops had the highest strength (25 ± 31
PE mated reads) and shortest distance (125.3 ± 163.6 Kb; Fig. [135]2d).
For both common and cell-type-specific CTCF loops, the H3K27ac-marked
CTCF loops encompassed more genes than other CTCF loops (Fig. [136]2e).
Compared with H3K27ac-negative (H3K27ac −/−) CTCF loops, CTCF loops
with double-positive H3K27ac in two anchors (H3K27ac +/+) were
positively related with cell-type-specific gene expression in both
22Rv1 (P = 1.3e-08) and VCaP cells (P < 2.2e-16; Fig. [137]2f),
suggesting H3K27ac +/+ CTCF loops play a role in promoting gene
transcription. The intermediate expression alteration of genes within
H3K27ac -/+ CTCF loops compared with H3K27ac −/− and +/+ CTCF loops
indicates both transactivation and insulation functions of H3K27ac -/+
CTCF loops (Fig. [138]2f). To find the biological relevance of CTCF
loops, we conducted pathway enrichment analysis using genes annotated
to CTCF anchors. Many cancer-related pathways were enriched in H3K27ac
-/+ common CTCF loop anchors, such as “Pancreatic cancer” and “Acute
myeloid leukemia” (Fig. [139]2g). Interestingly, “Prostate cancer”
pathway genes were enriched in both common and cell-type-specific CTCF
loop anchors (Fig. [140]2g), which is consistent with the tissue origin
of VCaP and 22Rv1 cells, suggesting CTCF looping is involved in
tissue-specific gene regulation.
An example of a CTCF-loop-regulated gene is TMC5, which is
transcriptionally suppressed by androgen stimulation in VCaP cells
(Fig. [141]S3j) and has been reported to promote prostate cancer cell
proliferation^[142]19. A 22Rv1-specific H3K27ac-negative CTCF peak,
which is at ~10 Kb upstream of TMC5 promoter, was linked to the
H3K27ac-positive CTCF peak at TMC5 promoter, resulting in the
insulation of TMC5 promoter from upstream enhancers in 22Rv1 cells
(Fig. [143]2h). This CTCF site and loop were not observed in VCaP
cells, and accordingly, upstream enhancers were connected to TMC5
promoter by H3K27ac loops (Fig. [144]2h). In line with the chromatin
looping, the expression of TMC5 is much higher in VCaP cells than in
22Rv1 cells (Fig. [145]2h). Consistent with the previous report that
cell-type-specific CTCF occupancy is negatively correlated with DNA
methylation level^[146]7, the CTCF binding affinities showed
significant negative correlations with the methylation levels of all
the 8 CpGs at this site in ENCODE cell lines (Fig. [147]2i).
Importantly, we observed a positive correlation between DNA methylation
of this CTCF site and TMC5 expression levels in PCa clinical samples
(Fig. [148]2j), which emphasizes the clinical significance of the
insulation function of this CTCF loop.
Deletion of a CTCF site near the MYC promoter leads to profound changes of
CTCF looping
After linking the CTCF loops to the regulation of PCa-related genes, we
sought to obtain a global view of the CTCF-mediated contacts between
CTCF binding sites and essential cancer genes. We retrieved 2,134
essential genes in cancer development from DepMap^[149]20, and divided
the CTCF sites looping to those genes into H3K27ac-negative (H3K27ac-)
and H3K27ac-positive (H3K27ac+ ) groups. Then, the fold changes of CTCF
binding affinities (22Rv1 vs. VCaP) were compared with expression
changes of matched essential genes. The CTCF sites, of which the
binding affinity changes are consistent with expression changes of
matched essential genes, are three times the number of CTCF sites with
opposite trends (n = 676 and 221; Fig. [150]3a). Of note, while the
number of H3K27ac- CTCF sites is comparable to that of H3K27ac+ CTCF
sites in the consistent trend group (n = 343 and 333), for the CTCF
sites with opposite trends, H3K27ac- site number is much higher than
H3K27ac+ site number (n = 134 and 87; P = 0.013; Fig. [151]3a). These
results suggest the H3K27ac- CTCF sites are more likely to suppress
their looped essential genes compared with H3K27ac+ CTCF sites.
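The comparison of H3K27ac- and H3K27ac+ site counts between the consistent and opposite trend groups is a 2 × 2 contingency test, and it can be worked through from the four counts given above. The sketch below is an illustration rather than the authors' code. It applies Yates' continuity correction, which is an assumption, since the text states only that a Chi-square test was used. For one degree of freedom, the upper-tail probability equals erfc(sqrt(chi2 / 2)), so only the standard library is needed.

```python
import math

def chi_square_2x2(a, b, c, d, yates=True):
    """Chi-square test of independence for the table [[a, b], [c, d]].

    Returns (chi2, p). With one degree of freedom the upper-tail
    probability is erfc(sqrt(chi2 / 2)).
    """
    n = a + b + c + d
    rows = (a + b, c + d)
    cols = (a + c, b + d)
    observed = ((a, b), (c, d))
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = rows[i] * cols[j] / n
            diff = abs(observed[i][j] - expected)
            if yates:
                # Continuity correction (assumed; not stated in the text).
                diff = max(diff - 0.5, 0.0)
            chi2 += diff * diff / expected
    return chi2, math.erfc(math.sqrt(chi2 / 2.0))

# Rows: consistent vs. opposite trend; columns: H3K27ac- vs. H3K27ac+ sites.
chi2, p = chi_square_2x2(343, 333, 134, 87)
```

With the correction, the P-value is approximately 0.013, in line with the value reported above. Without it, the P-value is somewhat smaller.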
Fig. 3. Deletion of a CTCF site near the MYC promoter leads to
re-organization of CTCF looping.
a The scatter plot showing the fold changes of CTCF binding affinities
at CTCF sites looping to gene promoters and the expression fold changes
of corresponding genes. The interactions between CTCF sites and gene
promoters were determined by CTCF HiChIP loops. VCaP and 22Rv1 CTCF
ChIP-Seq data were obtained from ENCODE and gene expression data were
retrieved from [154]GSE25183. The CTCF-gene pairs with consistent
binding/expression fold changes and both absolute fold changes > 1.5
are labelled red, and opposite fold changes are labelled blue. The dots
and triangles indicate CTCF sites with and without H3K27ac overlapping,
respectively. Genes in representative CTCF-gene pairs were annotated.
n = 2088. P-value was calculated by Chi-square test. b Upper: The
highlighted CTCF site was connected to MYC promoter by a CTCF loop. The
CTCF binding affinities at this site (−10 Kb from MYC promoter) were
negatively correlated with MYC expression in prostate cancer cell
lines. The CTCF site −10 Kb from MYC promoter was deleted by
CRISPR/Cas9-mediated knock-out in 22Rv1 cells. The control (sgCtrl) and
CTCF deletion (sgDele-10Kb) cells were then used for H3K27ac and CTCF
HiChIP experiments. Bottom: Significantly changed H3K27ac loops at MYC
region by the “−10 Kb CTCF site” deletion. For each group of H3K27ac
HiChIP, two biological replicates were performed. c Number of
significantly changed CTCF loops at each chromosome before and after
“−10Kb CTCF site” deletion. d Enrichment analysis of MYC binding at the
anchors of dysregulated CTCF or H3K27ac loops. Orange points represent
the actual ratio of dysregulated loop anchors with MYC binding. Each
box represents 500-time random sampling from all CTCF or H3K27ac loop
anchors. Same number of loop anchors as in the sgCtrl-specific or
sgDele-10Kb-specific anchor set was used for random sampling,
respectively. O/E (observed vs. expected) was calculated by comparing
the overlap percentage of actual dysregulated loop anchors with that of
the average of randomly sampled anchors. Box plots indicating the mean
(middle line), 25th and 75th percentile (box), and 10th and 90th
percentile (whiskers). Points were highlighted by red if P < 0.05.
P-values were determined by Student’s t-test. n = 215, 297, 2896, 2322
from left to right, respectively. e Motifs enriched in the CTCF peaks
at CTCF anchors of indicated loops. f Motif distribution in CTCF peaks
at CTCF anchors of indicated loops. For each CTCF peak, the location of
the best-ranked CTCF motif was used as the center for the motif density
plot. g The aggregated CTCF ChIP-Seq signal in 22Rv1 cells at indicated
peaks.
Among the H3K27ac- CTCF sites, one site at ~10 Kb upstream of the MYC
gene (hg19, chr8:128737774-128738489; referred to as “−10Kb CTCF site”
hereafter) was looped to MYC promoter, and the CTCF binding affinities
at this site are negatively correlated with MYC expression levels in
PCa cell lines (Fig. [155]3a, [156]b). Consistently, our group recently
found the deletion of the −10Kb CTCF site led to a dramatic
upregulation of MYC expression^[157]1. In PCa, the MYC-located 8q24
region contains many CTCF-mediated short and long chromatin loops,
but the function of these CTCF loops is largely unknown
(Fig. [158]S4a). To gain a deeper insight into the −10Kb CTCF site
function in 3D chromatin interaction, we generated CTCF and H3K27ac
HiChIP data with control (sgCtrl) and −10Kb CTCF site deletion
(sgDele-10Kb) 22Rv1 cells (Fig. [159]3b and [160]S4b). As shown by CTCF
HiChIP, the −10Kb CTCF site looped to another CTCF site at MYC promoter
(Fig. [161]3b), which was reported as the enhancer-docking site for MYC
transcription^[162]2. As expected, the deletion of “−10Kb CTCF site”
completely disrupted the CTCF loop between −10Kb CTCF site and MYC
enhancer-docking site, and introduced extensive new H3K27ac-associated
loops connecting multiple upstream enhancers to MYC region
(Fig. [163]3b, [164]S4c and [165]S4d). We and others previously
reported that a cluster of super-enhancers within PCAT1/2 region
interact with MYC promoter to regulate MYC expression in VCaP
cells^[166]14,[167]21,[168]22. Here, we found the super-enhancers
within PCAT1/2 region were also robustly looped to MYC in 22Rv1 cells,
but the looping between them was not significantly changed by −10Kb
CTCF site deletion (Fig. [169]S4d). Interestingly, the
H3K27ac-associated interaction between super-enhancers of the PCAT1/2
locus was weakened by −10Kb CTCF site deletion (Fig. [170]3b),
indicating a shift of interacting co-factors to MYC-related loops upon
MYC upregulation. Together, these data suggest that a subset of
H3K27ac- CTCF sites can block enhancer-promoter interaction by forming
a CTCF-CTCF loop with the gene promoter.
MYC is enriched in CTCF loop anchors
We then examined the impact of the −10Kb CTCF site deletion on
chromatin architecture at genome-wide scale. Unexpectedly, besides the
local chromatin architecture alteration, we also observed a global
impact of this CTCF site deletion on RE interactions. Differential
looping analysis identified 261 and 2,952 significantly changed CTCF
and H3K27ac loops, respectively, across the whole genome. Upon −10Kb
CTCF site deletion, there were 13 chromosomes with more upregulated
CTCF loops, 6 chromosomes with an equal number of upregulated and
downregulated CTCF loops, and only 4 chromosomes with more
downregulated CTCF loops (Fig. [171]3c). Since the primary effect of
−10Kb CTCF site deletion is to drive MYC upregulation, we speculated
the global alteration of CTCF loops is associated with chromatin
occupation of MYC. In agreement with our hypothesis, MYC binding sites
were enriched in both sgDele-10Kb-specific and sgCtrl-specific CTCF
anchors (Figs. S4e, [172]3d). To validate the activation of MYC gene
after repression of −10Kb CTCF site, we then targeted this site using
dCas9-KRAB complex. Repression of −10Kb CTCF site resulted in the
decrease of CTCF binding at this site and enhancement of MYC
expression, cell proliferation, MYC binding affinity at target genes
and MYC target gene expression (Fig. [173]S4f–g), supporting an
increase in global MYC activity.
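The enrichment of MYC binding at dysregulated loop anchors is assessed against a background of randomly sampled anchors, as described in the legend of Fig. 3d (500 random samplings, each with the same number of anchors as the tested set). A minimal sketch of that observed-versus-expected (O/E) calculation follows. The function name, the integer anchor identifiers and the toy data are hypothetical. The sketch also reports an empirical P-value from the sampling distribution, which is a substitution made for simplicity and is not the Student's t-test used in the study.

```python
import random
from statistics import mean

def observed_vs_expected(target_anchors, all_anchors, myc_bound,
                         n_iter=500, seed=0):
    """O/E ratio of MYC-bound anchors, with background from random sampling.

    Returns (O/E, empirical one-sided P). The empirical P-value replaces
    the t-test described in the figure legend (an assumed simplification).
    """
    rng = random.Random(seed)
    target = list(target_anchors)
    pool = list(all_anchors)
    observed = sum(a in myc_bound for a in target) / len(target)
    sampled = []
    for _ in range(n_iter):
        draw = rng.sample(pool, len(target))  # same size as the target set
        sampled.append(sum(a in myc_bound for a in draw) / len(draw))
    expected = mean(sampled)
    if expected == 0:
        raise ValueError("no MYC-bound anchors were drawn from the background")
    empirical_p = (1 + sum(s >= observed for s in sampled)) / (n_iter + 1)
    return observed / expected, empirical_p

# Hypothetical data: 1,000 anchors, 20% MYC-bound; the tested set is 40% bound.
all_anchors = list(range(1000))
myc_bound = {a for a in all_anchors if a % 5 == 0}
bound = [a for a in all_anchors if a in myc_bound]
unbound = [a for a in all_anchors if a not in myc_bound]
dysregulated = bound[:40] + unbound[:60]

oe, p_emp = observed_vs_expected(dysregulated, all_anchors, myc_bound)
```

Sampling the same number of anchors as the tested set keeps the variance of the background comparable to that of the observed ratio, which is why the legend specifies matched set sizes.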
To further determine the molecular basis of MYC function at CTCF
anchors, we performed motif enrichment analysis using CTCF binding
peaks within CTCF anchors. The MYC motif was recognized as one of the
most enriched motifs (rank 3, P = 1e-23) in MYC+/H3K27ac- CTCF anchors,
and showed moderate enrichment in MYC+/H3K27ac+ CTCF anchors (rank 12,
P = 1e-6; Fig. [174]3e). Motif distribution scanning by Homer confirmed
the highest enrichment of MYC motif in MYC+/H3K27ac- CTCF anchors, and
revealed the colocalization of MYC motif and CTCF motif (Fig. [175]3f).
Importantly, the colocalization pattern of MYC and CTCF was supported
by cistrome data. MYC occupation was generally accompanied by higher
enhancer activity and chromatin accessibility as revealed by 22Rv1
H3K27ac ChIP-Seq and PCa ATAC-Seq data, but also indicated higher CTCF
binding in 22Rv1 cells and cohesin binding in A549 cells at H3K27ac-
CTCF sites (Fig. [176]3g and [177]S4k). These results suggest MYC and
CTCF motifs are co-localized at a subset of genomic regulatory
elements, which is associated with the co-occupancy of MYC and CTCF at
these sites.
MYC facilitates CTCF chromatin binding
To further assess the effects of MYC on CTCF-mediated 3D chromatin
organization, we performed RNA-Seq, MYC/CTCF/H3K27ac ChIP-Seq and
CTCF/H3K27ac HiChIP assays in control and MYC-overexpressed 22Rv1 cells
(Figs. [178]4a, [179]4b). MYC overexpression induced 8,998 new MYC
peaks (54.2% of total MYC peaks) in 22Rv1 cells (Fig. [180]4c), and
highly increased the binding affinities of MYC peaks (Fig. [181]4d).
After MYC overexpression, there were 4,045 emerged and 6,676 diminished
CTCF peaks (Fig. [182]4c), respectively, which accounted for a small
fraction of total CTCF peaks (5.6% and 9.2%). In MYC-overexpressed
cells, 54.1% of MYC peaks overlapped with 14.1% of CTCF peaks
(Fig. [183]4e and [184]S5a), showing a disproportionate MYC association
with CTCF on the chromatin. MYC overexpression increased CTCF occupancy
at CTCF/MYC common sites, especially at H3K27ac- sites (Fig. [185]4f),
suggesting a mechanistic link of MYC to gene repression. In addition, a
genome-wide redistribution of H3K27ac signal was induced by MYC
overexpression, with 10,817 emerged and 14,327 diminished H3K27ac peaks
(Fig. [186]4c). H3K27ac levels were slightly reduced by MYC at both
MYC+ and MYC- regions (Fig. [187]4g). We also observed the chromatin
colocalization of CTCF and MYC in LNCaP and VCaP ChIP-Seq data
(Fig. [188]S5b, [189]S5c). Although the portions of overlapping peaks
in LNCaP and VCaP cells were smaller than that in 22Rv1 cells, the
MYC+ CTCF binding was still stronger than MYC- CTCF binding in LNCaP
and VCaP cells, supporting that MYC facilitates CTCF chromatin
occupancy (Fig. [190]S5d).
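The peak overlap reported above is asymmetric (54.1% of MYC peaks versus 14.1% of CTCF peaks) because the same overlap is expressed relative to two peak sets of different size. The sketch below illustrates this with a simple interval-overlap count. It is a toy implementation on hypothetical coordinates that assumes half-open intervals and counts any overlap of at least one base. For real data, a dedicated tool such as `bedtools intersect` would normally be used.

```python
from collections import defaultdict

def _by_chrom(peaks):
    """Group (chrom, start, end) peaks into {chrom: [(start, end), ...]}."""
    grouped = defaultdict(list)
    for chrom, start, end in peaks:
        grouped[chrom].append((start, end))
    return grouped

def _count_hits(query, reference):
    """Number of query peaks overlapping at least one reference peak."""
    ref = _by_chrom(reference)
    hits = 0
    for chrom, start, end in query:
        # Half-open intervals overlap when each starts before the other ends.
        if any(start < r_end and r_start < end
               for r_start, r_end in ref.get(chrom, [])):
            hits += 1
    return hits

def overlap_fractions(peaks_a, peaks_b):
    """Fraction of A overlapping any B, and fraction of B overlapping any A."""
    return (_count_hits(peaks_a, peaks_b) / len(peaks_a),
            _count_hits(peaks_b, peaks_a) / len(peaks_b))

# Hypothetical peaks: a small MYC set against a larger CTCF set.
myc_peaks = [("chr1", 100, 200), ("chr1", 500, 600)]
ctcf_peaks = [("chr1", 150, 250), ("chr1", 1000, 1100),
              ("chr2", 100, 200), ("chr1", 2000, 2100)]

frac_myc, frac_ctcf = overlap_fractions(myc_peaks, ctcf_peaks)
```

In this toy case one overlapping pair corresponds to half of the MYC peaks but only a quarter of the CTCF peaks, mirroring the asymmetry between the two percentages in the text.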
Fig. 4. MYC facilitates CTCF chromatin binding.
a Multi-omics assays to evaluate the function of MYC on CTCF binding
and looping in 22Rv1 cells. b Western blot assay showing the efficiency
of MYC overexpression. c The overlap of MYC, CTCF and H3K27ac peaks
between control (Ctrl) and MYC overexpression (MYC-OE) cells. d
Heatmaps showing the ChIP-Seq signal of MYC, CTCF and H3K27ac at MYC
and CTCF peaks. From top to bottom, MYC and CTCF peaks were separated
into shared, MYC-only, and CTCF-only groups. e The overlap between MYC
and CTCF peaks in MYC-OE cells. f The aggregated CTCF ChIP-Seq signal
in Ctrl and MYC-OE cells. CTCF peaks were divided into four groups
based on MYC and H3K27ac status. g The aggregated H3K27ac ChIP-Seq
signal in Ctrl and MYC-OE cells. H3K27ac peaks were divided into four
groups based on MYC and CTCF status. h Co-immunoprecipitation to detect
the protein-protein interaction between CTCF and MYC in Ctrl and MYC-OE
cells. i Western blot and corresponding Coomassie blue staining of GST
pull-down assay. 22Rv1 cell lysates were subjected to pulldowns with
immobilized GST only or recombinant GST-MYC protein. Bound protein was
probed with anti-CTCF and anti-MAX antibodies by Western blot. The red
arrow indicates the GST-MYC band. j 22Rv1 cells expressing GFP-MYC and
Flag-CTCF were used in co-IP assay. The input and IPed proteins were
analyzed by Western blot with anti-GFP and anti-Flag antibodies. k
Proximity Ligation Assay to detect the in-situ interaction between MYC
and CTCF proteins. Nuclei were stained with DAPI. Scale bar, 10 μm. For
b, h, i, j, and k, these experiments were repeated independently three
times with similar results. Source data are provided as a Source Data
file.
Next, we examined whether there is a physical interaction between MYC
and CTCF proteins. By co-immunoprecipitation (co-IP), we observed an
interaction between MYC and CTCF in both control and MYC-overexpressed
22Rv1 cells (Fig. [193]4h). Similarly, co-IP assays showed an
interaction between CTCF and MYC proteins in V16A cells (Fig. [194]S5e). The
MYC-CTCF interaction was further confirmed by GST-pulldown assay, co-IP
assay of tagged proteins, and proximity ligation assay
(Fig. [195]4i–k). Taken together, our data suggest MYC interacts with
CTCF and facilitates its chromatin occupancy at CTCF/MYC common sites.
MYC represses a subset of neuroendocrine lineage plasticity genes by
enhancing CTCF-mediated chromatin looping
After confirming the effect of MYC on CTCF binding, we next
investigated the impact of MYC on CTCF-mediated looping and gene
expression. Differential expression analysis (FDR < 0.05 and fold
change > 2; DESeq2) showed a higher number of repressed (n = 479)
compared with activated (n = 181) genes upon MYC overexpression
(Fig. [196]5a). While MYC-activated genes were enriched in cell
proliferation gene sets, such as “E2F_TARGETS” and “mesenchymal cell
proliferation”, MYC-repressed genes were enriched in neurogenesis and
endocrine-related gene sets like “regulation of neurogenesis” and
“insulin secretion” (Fig. [197]5b and [198]S6a–c). Since neuroendocrine
transdifferentiation plays an important role in PCa progression, we
then assessed whether the pan-neuroendocrine tumor (pan-NET) genes that
we previously defined^[199]23 were regulated by MYC. Gene Set
Enrichment Analysis (GSEA) showed a dramatic downregulation of
pan-NET genes by MYC (P = 5.19e-06; Fig. [200]5c). Of 88 pan-NET genes,
51 were significantly repressed by MYC (FDR < 0.05), including the
canonical neuroendocrine marker genes CHGA and CHGB (Fig. [201]S6d).
Consistent with gene expression alteration, H3K27ac-associated
chromatin looping was enhanced at the promoters of MYC-upregulated
genes and diminished at promoters of MYC-downregulated genes
(Fig. [202]5d and [203]S6e). In line with its stimulatory effects on
CTCF binding, MYC expression caused more strengthened than weakened
CTCF loops (n = 5609 and n = 2502, respectively). The distribution
curves of both up- and downregulated CTCF loops show peaks at 150~330
Kb (Fig. [204]5e). A large fraction of enhanced CTCF loops spanned TAD
boundaries (Fig. [205]5f and [206]S6f), indicating their long-range
insulation function.
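The differential expression thresholds stated above (FDR < 0.05 and fold change > 2) can be illustrated with a short Python sketch. It assumes that the fold-change cutoff is applied symmetrically on the log2 scale (|log2FC| > 1) and that genes without an adjusted P-value are excluded; both are assumptions, as are the function name and the toy values, none of which come from the study.

```python
import math


def classify_genes(results, fdr_cutoff=0.05, fold_cutoff=2.0):
    """Split genes into activated and repressed sets.

    results: iterable of (gene, log2_fold_change, adjusted_p) tuples,
    where adjusted_p may be None if no adjusted P-value was assigned.
    """
    log2_cutoff = math.log2(fold_cutoff)  # fold change > 2 means |log2FC| > 1
    activated, repressed = [], []
    for gene, log2fc, padj in results:
        if padj is None or padj >= fdr_cutoff:
            continue  # not significant at the chosen FDR
        if log2fc > log2_cutoff:
            activated.append(gene)
        elif log2fc < -log2_cutoff:
            repressed.append(gene)
    return activated, repressed


# Hypothetical toy results (gene, log2FC for MYC-OE vs Ctrl, adjusted P).
toy_results = [
    ("geneA", 2.3, 0.001),    # activated
    ("geneB", -1.8, 0.01),    # repressed
    ("geneC", -0.4, 0.0001),  # significant but below the fold-change cutoff
    ("geneD", -3.0, 0.2),     # large change but not significant
    ("geneE", 1.5, None),     # no adjusted P-value
]

activated, repressed = classify_genes(toy_results)
print(activated, repressed)
```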
Fig. 5. MYC represses neuroendocrine genes by promoting CTCF looping.
a Volcano plot showing dysregulated genes by MYC overexpression. n = 2.
b GO BP enrichment analysis of 479 MYC-repressed genes. The GO terms
with high similarity were clustered together and summarized. n = 2. c
GSEA plot showing the pan-NET genes were overall suppressed by MYC
overexpression. d Boxplots showing H3K27ac loop strength fold changes
at the promoters of upregulated or downregulated genes after MYC
overexpression. Box plots indicating the mean (middle line), 25th and
75th percentile (box) and 10th and 90th percentile (whiskers). n = 854,
5024 from left to right, respectively. P-value was determined by
Student’s t-test. e The length distribution of CTCF loops dysregulated
by MYC. f Normalized CTCF HiChIP contact matrices of Ctrl and MYC-OE
cells at a genomic region of chromosome 2. The black rectangles
indicate TAD structures. Black arrows highlight the intra-TAD CTCF
looping enhanced by MYC. Red arrows highlight the corresponding CTCF
binding increased by MYC. g The analysis workflow to identify genes
with H3K27ac loops disrupted by increased CTCF looping after MYC
overexpression. h Ranked dot plot showing the decreased H3K27ac loop
strength (MYC-OE - Ctrl) at promoters of 4557 genes from the analysis
in (g). Red and blue circles indicate genes up- and downregulated by
MYC, respectively. The annotated circles are downregulated pan-NET
genes. Bar plot showing the overlap between MYC-dysregulated genes
and the 4557 genes. The P-value was determined by Chi-square test. i
Dysregulated CTCF and H3K27ac loops at the genomic region spanning the
CDK5R2 gene in Ctrl and MYC-OE cells. MYC-induced CTCF binding and
looping were highlighted. j Gene set z-scores of MYC targets and
pan-NET genes were negatively correlated in two PCa RNA-Seq data sets.
Spearman rho and P-value were shown. Box plots indicating the mean
(middle line), 25th and 75th percentile (box) and 10th and 90th
percentile (whiskers). n = 83, 167, 83, 9, 16, 9 from left to right,
respectively. k Top: Three CRISPRi sgRNAs were designed to target the
three CTCF sites upstream of CDK5R2 gene, respectively. Bottom: CTCF
ChIP-qPCR and RT-qPCR with or without CRISPRi in MYC-overexpressed
22Rv1 cells. n = 3. Data represent means ± SD. P values were determined
by two-sided Student’s t test. *P < 0.05; **P < 0.01. Source data are provided as a
Source Data file.
We then sought to connect the CTCF looping changes to the MYC-induced
gene dysregulation by analyzing the crossover between upregulated CTCF
loops and downregulated H3K27ac loops (Fig. [209]5g). Our analysis
identified 941 downregulated H3K27ac loops that crossed over upregulated
CTCF loops, suggesting that those H3K27ac-associated interactions were
potentially insulated by CTCF. Those H3K27ac loops were mapped to 4,557 gene
promoters, of which MYC-downregulated genes accounted for a significantly
higher proportion than MYC-upregulated genes (P = 0.0042;
Chi-squared test; Fig. [210]5h). We then took CDK5R2, one of the
MYC-repressed pan-NET genes, to exemplify the CTCF effect. A robust
H3K27ac loop connected a distal enhancer to the CDK5R2 promoter,
potentially maintaining CDK5R2 expression (Fig. [211]5i). MYC
overexpression facilitated CTCF binding in this region and thus
introduced new CTCF loops intersecting with the H3K27ac loop of CDK5R2,
resulting in the weakening of the promoter-enhancer interaction of the CDK5R2
gene (Fig. [212]5i). To assess the contribution of CTCF sites to
MYC-repressed CDK5R2 expression, we targeted three CTCF sites at the
anchors of the new CTCF loops using a dCas9-KRAB strategy. CTCF binding affinity
was significantly reduced at all three sites by CRISPRi (Fig. [213]5k).
Consequently, CDK5R2 mRNA levels significantly increased in all three
CRISPRi cell lines under MYC overexpression conditions (Fig. [214]5k).
Finally, we checked the association between MYC activities and pan-NET
scores in several PCa clinical data sets. High pan-NET scores were
significantly related to low MYC activities in both primary PCa and
CRPC data sets (Fig. [215]5j and [216]S6g), highlighting the clinical
significance of MYC-induced repression of neuroendocrine genes.
Overall, our data suggest MYC suppresses neuroendocrine gene
transcription by enhancing CTCF-mediated chromatin looping.
Discussion
As a key architectural protein, CTCF bridges the genome topology and
gene expression regulation. By exploring the CTCF-associated chromatin
contact map in PCa, we found a direct association between selective
CTCF looping and gene expression regulation. For both TMC5 and MYC
genes, the cell-type-specific CTCF binding sites near PCa-related genes
form CTCF-CTCF loops and interrupt the access of enhancers to gene
promoters. This observation is consistent with a previous integrative
analysis of Hi-C and CTCF ChIP-Seq data, which showed that several CTCF
sites near gene promoters inhibited gene expression in PCa^[217]24.
Furthermore, the methylation level at a cell-type-specific CTCF site
looping to TMC5 gene is positively correlated with TMC5 expression in a
clinical PCa cohort, indicating the potential implication of
context-dependent CTCF sites in the development of DNA
methylation-based PCa biomarkers.
MYC is known to reside in an enhancer-less locus, and its expression
was predominantly subjected to regulation by long-range chromatin
interaction dynamics. In breast cancer, MYC and PVT1 promoters compete
for a cluster of downstream enhancers^[218]3. In a B-ALL cell line,
CTCF regulates MYC expression by maintaining the chromatin interaction
between MYC promoter and a distal downstream enhancer cluster^[219]25.
In PCa, our group reported androgen represses MYC transcription by
disrupting the interaction between super-enhancers within PCAT1 region
and MYC promoter^[220]14. However, the effect and mechanism of MYC in
3D genome organization remained poorly defined. In this study, we report
that MYC potentiates CTCF-mediated chromatin looping to suppress the
expression of a subset of genes in PCa (Fig. [221]S5h). This finding is
also supported by a previous report that Myc deletion in activated
mouse B cells markedly reduces loop contacts and Rad21 binding at loop
anchors^[222]5. Interestingly, a recent study reported MYC activation
strengthens chromatin interactions at super-enhancers and MYC binding
sites in U2OS osteosarcoma cells^[223]6. Together, these findings begin
to establish the functions of MYC in the 3D genome. Although our data suggest
MYC assists CTCF chromatin binding at MYC/CTCF common sites, we could
not exclude additional functions of MYC in the regulation of CTCF
chromatin occupancy. For example, considering that CTCF chromatin binding
was reported to display cell cycle stage-dependent dynamics^[224]26,
and MYC is a well-established regulator of cell cycle control^[225]27,
MYC-regulated cell cycle progression may also play a role in CTCF
binding changes. In addition, as our results were obtained by
population-based methodologies, the increase of CTCF chromatin
occupancy could rely on either the enhanced CTCF binding affinity or
prolonged chromatin residence time of CTCF. Further single-cell
epigenomic assays and single-molecule imaging assays are needed to
unravel the detailed mechanism of MYC-facilitated CTCF chromatin
occupancy.
AR acts mainly as a transcriptional activator but is also involved in
the regulation of androgen-induced transcriptional
repression^[226]28,[227]29. Our AR HiChIP showed a prominent increase
in AR-associated enhancer-promoter interactions after androgen
stimulation, supporting the direct transcriptional activation function
of AR from the perspective of 3D genomics. Although AR binding has been
reported to be related to several androgen-repressed
genes^[228]28,[229]30, our HiChIP analysis of AR loops before and after
androgen stimulation decoupled the AR-associated enhancer-promoter
interactions from androgen-induced transcription repression. Instead,
the dynamics of H3K27ac-associated promoter-enhancer contacts
synchronize with gene expression alteration at androgen-repressed
genes, consistent with their dependence on TFs other than AR. Our motif
analysis identified that the ERG motif was enriched in H3K27ac+/AR-
enhancers of androgen-repressed genes. The TMPRSS2-ERG fusion, which
results in ERG overexpression, occurs in ~50% of prostate
tumors^[230]31,[231]32. As an oncogene in prostate cancer, ERG
regulates proliferation and invasion genes by orchestrating
higher-order chromatin organization, which is distinct from
AR-associated chromatin connectivity^[232]33–[233]35. Indeed, ERG has
been demonstrated to repress AR-mediated transactivation^[234]15. In
line with these reports, our results further suggest ERG may play an
important role in the construction of promoter-enhancer looping of
androgen-repressed genes.
In summary, by interrogating PCa interactome data, we revealed the role
of MYC in regulating 3D genome organization. Moreover, we provided a
comprehensive collection of 3D-epigenome data sets in multiple PCa
cellular models, which would be a valuable resource for PCa genomic
research.
Methods
Cell lines
All cell lines were cultured at 37 °C with 5% CO[2]. The medium used
for VCaP (CRL-2876, ATCC), 22Rv1 (CRL-2505, ATCC), V16A (established by
Dr. Amina Zoubeidi’s laboratory), and HEK293FT ([235]R70007, Thermo
Fisher Scientific) cell culture was supplemented with 10% FBS (GIBCO,
10437-028), 1% streptomycin and 1% penicillin. For androgen stimulation
treatment, cells were grown to 50–60% confluence in a medium
containing 5% charcoal-dextran stripped FBS (CDS) for 48 h and then
treated with 10 nM DHT for 2 or 24 h. All cell lines used in this study
tested negative for mycoplasma contamination. Following extended
passages, cell lines were authenticated by Short Tandem Repeat (STR)
profiling using ATCC services.
Lenti-viral vector construction and transfection
The MYC coding sequence (CDS) was cloned into the pLVX-IRES-Puro vector (named
pLVX-MYC), and then verified by DNA sequencing. The primer sequences
for MYC amplification are listed in Supplementary Table [236]1. The
plasmids pLVX-MYC or the negative control (pLVX-NC) were co-transfected
with the plasmids of pLP1, pLP2, and pLP/VSVG using the transfection
reagent Lipofectamine™ 2000 (Invitrogen, 11668–019) according to the
manufacturer’s protocol. The stably-transfected 22Rv1 cells were
selected by puromycin (600 ng/mL).
Western blotting
Cells were harvested and lysed on ice using RIPA buffer (Beyotime,
P0013B) containing 1% PMSF (Solarbio, P0100). The proteins were
purified by centrifugation (12,000 g at 4 °C for 20 min) and quantified
by Bicinchoninic Acid (BCA) protein assay. Total protein concentrations
were normalized in all samples. The proteins were heated at 95 °C for
10 min in the loading buffer (Beyotime, P0015L), then resolved on a 10%
sodium dodecyl sulfate (SDS)-polyacrylamide gel,
electrophoresed with Tris-glycine running buffer at 15 V/cm for 1 h,
and finally transferred to a polyvinylidene difluoride (PVDF) membrane
(Millipore, IPVH00010). The membrane was incubated at room temperature
in blocking buffer (5% non-fat dry milk in TBST) for 2 h. Subsequently,
the membranes were incubated with primary antibodies rabbit anti-MYC
(1:1000; Abcam, ab32072, Rabbit monoclonal [Y69], Lot: GR3377350-5) and
rabbit anti-GAPDH (1:1000; Cell Signaling Technology, 5174 S, Rabbit
monoclonal [D16H11], Lot: 8) at 4 °C overnight. GAPDH was used as an
internal control. The membranes were washed three times with TBST and
incubated with horseradish-peroxidase (HRP)-coupled secondary antibody
anti-rabbit IgG (1:10,000; ZSbio, zb-2301) for 1 h at room temperature.
After the second round of wash, the ECL reagent (Boster, AR1191) was
used to visualize protein bands.
CRISPR/Cas9-mediated −10 Kb CTCF site deletion
22Rv1 clones with the “−10Kb CTCF site” deletion were reported in our
previous study^[237]1. These clones were used for H3K27ac and CTCF
HiChIP assays.
CRISPRi assay
To stably express dCas9-KRAB in 22Rv1 cells, a lentiviral packaging
system comprising the Lenti-dCas9-KRAB-blast plasmid (Addgene plasmid #
89567) and the pMD2.G and psPAX2 packaging plasmids was used. Lentiviral
particles were generated in HEK293FT cells using Lipofectamine 2000
(Thermo Fisher) according to the manufacturer’s instructions. 22Rv1
cells were infected with lentiviral supernatant for 24 h and selected
with 10 µg/ml of blasticidin (ST018, Beyotime Biotechnology) for 10-14
days. sgRNA sequences targeting −10Kb CTCF site or CTCF sites near
CDK5R2 were designed using CRISPOR ([238]http://crispor.tefor.net), and
cloned into the lentiGuide-Puro plasmid (Addgene plasmid # 52963). The
sequences of the sgRNAs are shown in Supplementary Data. Lentiviral
particles for each sgRNA were generated in HEK293FT cells, and
transduced dCas9-KRAB 22Rv1 cells were selected with puromycin for
72 h. To assess the effect of CRISPRi, the relative enrichment of CTCF
at the targeted binding sites was quantified by ChIP-qPCR. MYC and CDK5R2
expression levels were examined by qPCR using the primers listed
in [239]Supplementary Information.
Co-immunoprecipitation assay
Cells were scraped and rinsed twice with PBS. Co-immunoprecipitation
assays were performed using the BeaverBeads Protein A/G
Immunoprecipitation Kit (22202-100, Beaver Biotechnology) following the
manufacturer’s protocol. In brief, cells were lysed in IP binding
buffer containing protease inhibitors for 30 min and then centrifuged
at 12,000 g for 10 min at 4 °C. Protein A/G beads were washed twice and
resuspended in IP binding buffer (5 mM Tris-HCl (pH7.4), 150 mM NaCl,
1 mM EDTA, 1% TritonX-100, 5% Glycerol), incubated with anti-Flag,
anti-CTCF or anti-MYC antibodies for 30 min at room temperature, and
rabbit mAb IgG was used as isotype control. The cell lysates were
incubated with the anti-CTCF, anti-MYC, anti-Flag beads, or IgG beads
overnight at 4 °C. The antigen-antibody complex was captured by
magnetic separation rack and then washed 10 times with washing buffer.
Proteins were eluted from the beads in 50 µL 1×SDS-PAGE loading Buffer,
boiled for 5 min, and then subjected to SDS-PAGE and visualized by
western blotting using horseradish peroxidase-conjugated mouse
anti-rabbit IgG. Monoclonal anti-Myc (IP, 2 µg; Western blot, 1:1000;
ab32072; Rabbit monoclonal [Y69]; Lot: GR3377350-5), anti-GFP (IP,
2 µg; Western blot, 1:1000; ab290; Rabbit polyclonal; Lot: GR3321575-1)
and anti-MAX (Western blot, 1:1000; ab199489; Rabbit monoclonal
[[240]EPR19352], Lot: GR3441065-2) were obtained from Abcam (Cambridge,
MA, USA). Antibodies against CTCF (IP, 2 µg; Western blot, 1:1000;
3418 S; Rabbit monoclonal [D31H2]; Lot: 5), Flag (Western blot, 1:1000;
14793; Rabbit monoclonal [D6W5B], Lot: 7), rabbit mAb IgG control (IP,
2 µg; 3900 S; Rabbit monoclonal [DA1E]; Lot: 45), and mouse anti-rabbit
IgG mAb (HRP Conjugate) (light-chain specific, 1:1000; 93702 S; Mouse
monoclonal [D4W3E]; Lot: 5) were purchased from Cell Signaling
Technology (Danvers, MA, USA).
GST-pulldown
Primers were designed to clone the MYC gene into the pGEX4T-1 vector with a
GST tag at the N terminus. Escherichia coli BL21(DE3) cells were transformed
with either the GST-MYC plasmid or the pGEX4T-1 empty vector. Expression of
GST fusion proteins was induced in 100 mL of transformed bacterial
cultures with 0.25 mM isopropyl β-D-1-thiogalactopyranoside (IPTG), followed by
incubation for 18 h at 16 °C in a shaking incubator. The BL21 E. coli cells
were resuspended in ice-cold phosphate-buffered saline with 1 mM PMSF
and homogenized by gentle sonication on ice. The soluble proteins were
obtained by centrifugation at 10,000 g for 20 min at 4 °C, and the
soluble GST or GST-MYC proteins were immobilized on glutathione beads
(70601-5, Beaver Biotechnology). Flag-CTCF protein expressed in
22Rv1 cells was then incubated with the glutathione beads for 4 h at 4 °C.
The beads were then washed five times with cold PBS, and the
bead-bound protein complexes were treated with 1× SDS-PAGE loading
buffer and detected by Western blotting using an anti-Flag or anti-MAX
antibody.
Proximity ligation assay
Proximity ligation assay (PLA) was carried out to detect a potential
physical interaction between MYC and CTCF. To perform this assay, a
Duolink® In Situ Orange Starter Kit (Sigma, Cat. No. DUO92102) was used
in accordance with the manufacturer’s instructions. Briefly, 22Rv1
cells were grown on coverslips in 48-well plates until reaching 60-80%
confluence. Then, cells were fixed with 4% paraformaldehyde for 20 min
at room temperature and permeabilized with ice-cold 100% methanol for
30 min at −20 °C. Cells were incubated with Duolink® block solution for
60 min at room temperature, and then stained overnight at 4 °C with
primary antibodies of different species. In the assay, the antibodies
used were mouse anti-MYC (1:50; Santa Cruz, sc-40; Mouse monoclonal
[9E10]; Lot: K1920) and rabbit anti-CTCF (1:400; Cell Signaling
Technology, 3418 S; Rabbit monoclonal [D31H2]; Lot: 5). The next day,
cells were incubated with a PLA probe mix containing PLUS and
MINUS antibodies at 37 °C for 60 min. The ligation solution and
amplification solution were then added successively, for 30 min
and 100 min at 37 °C, respectively. Finally, coverslips were mounted with
Prolong Gold mounting medium with DAPI and the images were captured by a
confocal microscope (63 × oil immersion; Zeiss LSM 800, Zeiss,
Germany). Red spots represented protein-protein interactions.
Chromosome conformation capture (3C) assay
The 3C assay was performed using methods as previously
described^[241]36. In brief, 5 million VCaP cells were treated with
vehicle or DHT (10 nM) for 24 h and then fixed by 1% formaldehyde in
PBS buffer for 10 min, followed by quenching the reaction by glycine.
Cells were washed with cold PBS buffer supplemented with 10% FBS, and
resuspended in lysis buffer (10 mM Tris-HCl, pH 8.0; 10 mM NaCl; 0.2%
NP-40; 1x protease inhibitor). Nuclear extracts were then digested
overnight at 37 °C with 400 U PstI (R0140S, New England BioLabs).
Digested chromatin DNA was ligated using T4 DNA ligase buffer (750 μL
of 10% Triton X-100, 750 μL of 10× NEB ligation buffer, 75 μL of
10 mg/ml BSA, and 4000 U T4 DNA ligase) at room temperature. Proteinase
K (20 μL, 19133, Qiagen) was then added and incubated overnight at
65 °C to reverse crosslinking. DNA fragments were purified by ethanol
precipitation, and subjected to PCR amplification using the primers
listed in Supplementary Data. The ligation products from PstI-digested
DNA fragments were used to assess primer efficiency and normalize 3C
interaction frequency.
Chromatin immunoprecipitation (ChIP) and ChIP-Seq
ChIP assays were performed using 22Rv1 cells with or without MYC stable
overexpression. Protein A/G Dynabeads (88845/88847, ThermoFisher) were
mixed at a 1:1 ratio, and appropriate antibodies were added and
incubated with gentle rotation for 3 hours at 4 °C before
immunoprecipitation. Cells were cross-linked by 1% warm formaldehyde
for 10 min and then quenched with 125 mM glycine at room temperature.
The cell pellets were washed twice with cold PBS, and then samples were
incubated on ice with 10 ml of LB1 buffer (50 mM Hepes-KOH, pH 7.5;
140 mM NaCl; 1 mM EDTA; 10% Glycerol; 0.5% NP-40 or Igepal CA-630;
0.25% Triton X-100) for 10 min to extract nuclear fractions. Nuclear
fractions were collected by centrifugation and subsequently resuspended
in 10 ml of LB2 buffer (10 mM Tris-HCl, pH 8.0; 200 mM NaCl; 1 mM EDTA;
0.5 mM EGTA) for 5 min. Nuclear fractions were collected again and
resuspended in LB3 buffer (10 mM Tris-HCl, pH 8; 100 mM NaCl; 1 mM
EDTA; 0.5 mM EGTA; 0.1% Na-Deoxycholate; 0.5% N-lauroylsarcosine;
Protease inhibitor cocktail). Nuclear fractions were transferred to
sonication tubes, and sonicated in a water bath sonicator (Diagenode
bioruptor) to generate chromatin fragments from 300 bp to 700 bp. 0.1
volume of 10% Triton X-100 was added to each sample. After
centrifugation, the supernatant was collected and 10% of the
supernatant was used as the input DNA. The remaining chromatin lysate was
incubated with appropriate antibody-conjugated beads at 4 °C overnight.
Antibodies used for ChIP assays are anti-MYC (5 µg, ab32072, Abcam,
Rabbit monoclonal [Y69], Lot: GR3377350-5), anti-CTCF (5 µg, 3418 S,
CST, Rabbit monoclonal [D31H2], Lot: 5) and anti-H3K27ac (5 µg, ab4729,
Abcam, Rabbit polyclonal, Lot: GR3374555-1). Following incubation,
beads were washed 10 times with 1 ml of RIPA buffer (50 mM Tris, pH
7.6; 1 M NaCl; 1 mM EDTA; 0.1% SDS; 1% Igepal CA-630; 0.5% sodium
deoxycholate), and then resuspended in elution buffer (0.1 M NaHCO3; 1%
SDS; proteinase K) to reverse cross-linking of DNA-protein complexes at
65 °C for 8–16 h. DNA was purified using the ChIP DNA Clean &
Concentrator kit (D5205, Zymo Research), and then subjected to
Illumina ChIP-Seq library construction using ThruPLEX DNA-seq kit
(Rubicon Genomics).
HiChIP
HiChIP was performed as previously described with a few
modifications^[242]13. Ten million cells were collected, pelleted, and
resuspended in 1% formaldehyde for 10 min with rotation at room
temperature. The crosslinking was then quenched in 125 mM glycine for
5 min. The cells were washed twice with PBS and then resuspended in
500 μL ice-cold Hi-C lysis buffer (10 mM Tris-HCl pH 7.5, 10 mM NaCl,
0.2% NP-40, 1× protease inhibitor) followed by 30 min rotation at 4 °C.
Nuclei were pelleted by 2500 rcf centrifugation for 5 min at 4 °C
followed by washing once in Hi-C lysis buffer. The nucleus pellets were
resuspended in 100 μL 5% SDS and incubated at 62 °C for 10 minutes. To
quench the SDS, 335 μL 1.5% Triton X-100 was then added and the mixture
was incubated at 37 °C for 15 min. The chromatin was digested by adding
50 μL NEB buffer 2 and 375 U MboI restriction enzyme (NEB, R0147) and
incubated at 37 °C for 2 h. The MboI restriction enzyme was then
inactivated by 62 °C incubation for 10 minutes. The fill-in master mix
(1.5 μL of 10 mM dTTP, 1.5 μL of 10 mM dCTP, 1.5 μL of 10 mM dGTP,
37.5 μL of 0.4 mM biotin-dATP, 10 μL of 5U/μL DNA Polymerase I (NEB,
M0210)) was added and the tube was incubated at 37 °C for 1 h with
rotation. The ligation mix, containing 10 μL 400 U/μL T4 DNA Ligase
(NEB, M0202), 150 μL 10X NEB T4 DNA ligase buffer with 10 mM ATP (NEB,
B0202), 125 μL 10% Triton X-100, 3 μL 50 mg/mL BSA, and 660 μL H[2]O,
was added and the tube was incubated at room temperature for 4 h with
rotation. The nuclei were pelleted and then resuspended in 880 μL of
nuclear lysis buffer (50 mM Tris-HCl pH 7.5, 10 mM EDTA, 1% SDS, 1×
protease inhibitor). Nuclei were sonicated in a water bath sonicator
(Diagenode bioruptor) to generate chromatin fragments of 300~800 bp.
After sonication, the lysate was diluted to 1:9 by adding ChIP dilution
buffer (0.01% SDS, 1.10% Triton X-100, 1.2 mM EDTA, 16.7 mM Tris-HCl pH
7.5, 167 mM NaCl). Chromatin immunoprecipitation was performed by
adding protein A beads and appropriate antibodies (anti-H3K27ac, 5 µg,
ab4729, Rabbit polyclonal, Lot: GR3374555-1; anti-AR, 5 µg, ab108341,
Rabbit monoclonal [ER179(2)], Lot: GR3233427-1; anti-CTCF, 5 µg,
3418 S, Rabbit monoclonal [D31H2], Lot: 5) to the sheared chromatin
before incubation at 4 °C overnight with rotation. The ChIPed DNA
was eluted from beads by 200 μL ChIP elution buffer (50 mM NaHCO[3], 1%
SDS) and purified by Zymo DNA Clean & Concentrator kit. The
biotin-labelled DNA was captured by streptavidin C1 beads (Invitrogen,
65001) and subjected to tagmentation using 2.5 μL Tn5 (Illumina) for
50 ng DNA. The HiChIP DNA was amplified by ~8 cycles using primers
containing Illumina sequencing adapters. Each HiChIP library was
sequenced by an Illumina sequencer to the depth of ~150 million
2 × 150 bp paired-end reads.
HiChIP data processing
Paired-end HiChIP reads were mapped to the hg19 human reference genome by
Bowtie2^[243]37 within the HiCUP pipeline (v0.7.2)^[244]38. The
digested reference genome was created by HiCUP using the MboI cutting
site. The resulting bam files from HiCUP pipeline were converted into
valid fragment pairs by samtools^[245]39 and bedtools^[246]40. The
HiChIP valid pairs and matched ChIP-Seq peaks were then subjected to
the hichipper pipeline (v0.7.7)^[247]41 to generate bedpe files
containing HiChIP loops. For HiChIP loops called by hichipper, the
median loop anchor width is ~2.5 Kb for libraries prepared using MboI
enzyme. In this study, only the filtered intra-chromosome loops were
used for downstream analyses.
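The restriction to intra-chromosomal loops and the anchor-width summary described above can be sketched in Python as follows. This is an illustration rather than the pipeline used in the study: it assumes a tab-delimited, bedpe-like layout in which the first six columns hold the two anchors and the last column holds the paired-end tag count, and the function names and toy records are hypothetical.

```python
from statistics import median


def parse_bedpe(lines):
    """Parse bedpe-like records into (chr1, s1, e1, chr2, s2, e2, count) tuples."""
    loops = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        f = line.split()
        loops.append((f[0], int(f[1]), int(f[2]),
                      f[3], int(f[4]), int(f[5]), int(f[-1])))
    return loops


def intra_chromosomal(loops):
    """Keep only loops whose two anchors lie on the same chromosome."""
    return [lp for lp in loops if lp[0] == lp[3]]


def median_anchor_width(loops):
    """Median width (bp) over both anchors of every loop."""
    widths = []
    for lp in loops:
        widths.extend([lp[2] - lp[1], lp[5] - lp[4]])
    return median(widths)


# Hypothetical toy records, not data from the study.
toy_lines = [
    "chr1\t1000\t3500\tchr1\t200000\t202500\t.\t12",
    "chr1\t5000\t7000\tchr2\t9000\t12000\t.\t3",
    "chr2\t100\t2600\tchr2\t50000\t53000\t.\t5",
]

intra = intra_chromosomal(parse_bedpe(toy_lines))
print(len(intra), median_anchor_width(intra))
```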
Significant loop calling
To identify high confidence HiChIP loops, we assumed that the mated
PE reads are distributed among all possible loop anchor pairs according
to a binomial distribution. The number of all possible loops (N[Possible
loops]) is the number of combinations of two anchors drawn from all
possible loop anchors (N[Possible loop anchors]) within the same
chromosome. N[Possible loop anchors] can be calculated by:
$$N_{\text{Possible loop anchors}} = N_{\text{Observed loop anchors}} + N_{\text{Potential loop anchors}} \tag{1}$$
The observed loop anchors are loop anchors with mated PE reads, which
could be found in the output of the analysis pipeline. Potential loop
anchors are genomic regions with matched ChIP peaks but no mated PE
reads. Since the HiChIP loop anchor could span multiple ChIP-Seq peaks,
we cannot use the number of ChIP-Seq peaks as the amount of potential
loop anchors. Intuitively,
$$\frac{N_{\text{Observed loop anchors}}}{N_{\text{Possible loop anchors}}} \propto \frac{N_{\text{ChIP peaks in observed loop anchors}}}{N_{\text{Total ChIP peaks}}} \tag{2}$$
Then we can roughly estimate the number of overall loop anchors
(possible loop anchors) by:
$$N_{\text{Possible loop anchors}} \approx N_{\text{Observed loop anchors}} \times \frac{N_{\text{Total ChIP peaks}}}{N_{\text{ChIP peaks in observed loop anchors}}} \tag{3}$$
Because we only estimate the intra-chromosome loops, we have to
calculate N [Possible loop anchors] for each chromosome and get the
sum. Therefore, the N [Possible loops] can be presented by:
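$$N_{\text{Possible loops}} = \sum_{i} C\left(N^{i}_{\text{Possible loop anchors}},\, 2\right) = \sum_{i} C\left(N^{i}_{\text{Observed loop anchors}} \times \frac{N^{i}_{\text{Total ChIP peaks}}}{N^{i}_{\text{ChIP peaks in observed loop anchors}}},\, 2\right), \quad i = \text{chr1}, \text{chr2}, \cdots, \text{chrX} \tag{4}$$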
According to the binomial distribution, the probability of a given loop
can be calculated as
$$P(X = m) = C(n, m)\, p^{m} (1 - p)^{n - m} \tag{5}$$
where $m$ denotes the loop strength corrected by anchor length, $n$
denotes the total corrected loop strength, and $p$ is the probability of
one mated PE read being mapped to a specific loop, determined by:
$$p = \frac{1}{N_{\text{Possible loops}}} \tag{6}$$
P-values were corrected by the Bonferroni-Holm method and loops with
adjusted P-values < 0.05 were considered as reliable loops in this
study.
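The loop-calling model of Eqs. 3 to 6 can be illustrated with a short Python sketch. Several points are assumptions made for this illustration and are not stated in the text: the estimated number of possible anchors is rounded to an integer, loop strengths are treated as integer counts, and the P-value is taken as the upper-tail probability P(X ≥ m), whereas Eq. 5 as written gives the point probability P(X = m). All function names and toy numbers are hypothetical.

```python
import math


def possible_loops(per_chrom):
    """Eqs. 3-4: per_chrom maps chromosome ->
    (observed anchors, total ChIP peaks, ChIP peaks in observed anchors)."""
    total = 0
    for n_obs, n_total_peaks, n_peaks_in_obs in per_chrom.values():
        n_possible = round(n_obs * n_total_peaks / n_peaks_in_obs)  # Eq. 3
        total += math.comb(n_possible, 2)                           # Eq. 4
    return total


def log_binom_pmf(m, n, p):
    """Logarithm of Eq. 5, computed via lgamma for numerical stability."""
    return (math.lgamma(n + 1) - math.lgamma(m + 1) - math.lgamma(n - m + 1)
            + m * math.log(p) + (n - m) * math.log1p(-p))


def binom_upper_tail(m, n, p):
    """P(X >= m); the sum stops once further terms are negligible."""
    total = 0.0
    for k in range(m, n + 1):
        term = math.exp(log_binom_pmf(k, n, p))
        total += term
        if k > n * p and term < total * 1e-17:
            break
    return min(total, 1.0)


def holm_adjust(pvalues):
    """Holm step-down adjustment, returned in the input order."""
    order = sorted(range(len(pvalues)), key=lambda i: pvalues[i])
    adjusted = [0.0] * len(pvalues)
    running_max = 0.0
    for rank, i in enumerate(order):
        candidate = min(1.0, (len(pvalues) - rank) * pvalues[i])
        running_max = max(running_max, candidate)
        adjusted[i] = running_max
    return adjusted


# Hypothetical toy inputs, not data from the study.
toy_chroms = {"chr1": (100, 400, 200), "chr2": (50, 300, 150)}
n_loops = possible_loops(toy_chroms)
p = 1.0 / n_loops                       # Eq. 6
n_total = 5000                          # total corrected loop strength
strengths = [30, 2, 1]                  # corrected strengths of three loops
raw = [binom_upper_tail(m, n_total, p) for m in strengths]
adjusted = holm_adjust(raw)
print(n_loops, [a < 0.05 for a in adjusted])
```

With these toy numbers a loop supported by a single read is not distinguishable from the background expectation, while the better-supported loops pass the adjusted threshold.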
Visualization
HiChIP loops at a focal genomic region were visualized using R.
Normalized loop strength was represented by loop curve length. For each
loop, the centers of two anchors were used as the start and end of a
loop curve. To visualize chromosome-wide HiChIP contact maps, HiChIP
fastq files were processed by the Juicer (v1.6) pipeline^[248]42 to
generate .hic files. The .hic files were loaded into R by
plotgardener^[249]43 to show the genomic contact heatmap.
Motif analysis
To identify DNA binding motifs enriched in the open chromatin regions
of HiChIP loop anchors, the TCGA PCa ATAC-Seq peak set was downloaded from
[250]https://gdc.cancer.gov/about-data/publications/ATACseq-AWG^[251]44.
The hg38 version PCa ATAC-Seq peaks were converted to hg19 genome
coordinates using “hg38ToHg19.over.chain” from UCSC genome browser. The
motifs enriched in the indicated regions were then identified by
Cistrome SeqPos^[252]45. For the motif analysis in H3K27ac and/or CTCF
anchors with or without MYC binding, ‘findMotifsGenome.pl’ script of
HOMER (v4.11)^[253]46 was used to obtain the motifs enriched in six
peak types with parameters ‘-size 600 -mask’ and ‘annotatePeaks.pl’
script was used to calculate the distribution of MYC or CTCF DNA
binding motif PWMs at indicated peaks.
Loop anchor annotation and enrichment analysis
The genomic features and nearby genes were annotated to HiChIP loop
anchors by the annotatePeak function of R packages ChIPseeker^[254]47
and TxDb.Hsapiens.UCSC.hg19.knownGene. To identify the pathways related
with loop anchors, the anchor-associated genes were subjected to KEGG
pathway enrichment analysis using R package clusterProfiler^[255]48.
Hi-C data processing
22Rv1 Hi-C data was downloaded from [256]GSE118629^[257]49. By the
Juicer (v1.6) pipeline^[258]42, the raw Hi-C read pairs were first
mapped to hg19 reference genome, and after deduplication, the bam files
of alignments were used to generate hic files, which contain genomic
interaction matrices. The normalized average interaction strength of
aggregated loops was then obtained from hic files by Aggregate Peak
Analysis (APA) function of Juicer.
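The idea behind aggregate peak analysis can be sketched in a few lines of Python: submatrices of the contact map centered on each loop pixel are averaged, and the central pixel is compared with a corner of the aggregate. This is a simplified illustration on a dense toy matrix, not Juicer's implementation; the function names, the window size and the toy values are assumptions.

```python
def aggregate_peak_analysis(matrix, loops, window=2):
    """Average the (2*window+1)^2 submatrices centered on each loop pixel.

    matrix: square contact matrix as a list of lists (row = upstream bin)
    loops:  list of (bin_i, bin_j) loop pixels
    """
    size = 2 * window + 1
    n = len(matrix)
    total = [[0.0] * size for _ in range(size)]
    used = 0
    for i, j in loops:
        # Skip loops whose window would run off the matrix edge.
        if min(i, j) - window < 0 or max(i, j) + window >= n:
            continue
        for di in range(size):
            for dj in range(size):
                total[di][dj] += matrix[i - window + di][j - window + dj]
        used += 1
    if used == 0:
        raise ValueError("no loop fits inside the matrix with this window")
    return [[value / used for value in row] for row in total]


def peak_to_lower_left(apa, corner=1):
    """Ratio of the central pixel to the mean of the lower-left corner."""
    size = len(apa)
    mid = size // 2
    cells = [apa[r][c] for r in range(size - corner, size) for c in range(corner)]
    return apa[mid][mid] / (sum(cells) / len(cells))


# Toy contact map: uniform background of 1 with two enriched loop pixels.
toy = [[1.0] * 20 for _ in range(20)]
toy[5][12] = 9.0
toy[8][15] = 11.0
apa = aggregate_peak_analysis(toy, [(5, 12), (8, 15)], window=2)
p2ll = peak_to_lower_left(apa)
```

A ratio well above 1 indicates that the aggregated loop pixels are enriched relative to the local background closer to the diagonal.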
ChIP-Seq data processing
Reads from ChIP-Seq experiments were aligned to the hg19 version of the
human reference genome by Bowtie2 (v2.2.1). The resulting sam
files were converted to bam files by samtools (v0.1.18). MACS2
(v2.2.7.1)^[259]50 was used to call peaks from the bam files with
parameters ‘--keep-dup=1 -g hs -B --SPMR’. The resultant bedGraph files
containing signal per million reads were converted to bigWig files by
UCSC tools (v385). The bigWig files were loaded into the IGV genome
browser (v2.8.12) for visualization of binding peaks. Deeptools (v3.4.3)^[260]51
was used to extract ChIP-Seq signals of indicated peaks from bigWig
files and generate the profile plots.
RNA-Seq data processing
The reads were aligned to the hg19 human reference genome by STAR
(version 2.4.2a) with default settings^[261]52. The resulting
*.ReadsPerGene.out.tab files were then merged into a read count matrix,
which was used for differential expression analysis by the R
package DESeq2. GO enrichment analysis and GSEA were performed by the R
package clusterProfiler^[262]48. The gene expression levels were also
quantified by calculating the reads per kilobase per million mapped
reads (RPKM) using the read count matrix and GENCODE v19 gene
annotation.
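For reference, the RPKM calculation for one sample can be sketched in Python as follows. This is an illustrative sketch only: it assumes that the total number of mapped reads is taken as the column sum of the count matrix and that a per-gene length in base pairs has been derived from the annotation. The function name and the toy numbers are hypothetical.

```python
def rpkm(counts, gene_lengths_bp):
    """Reads per kilobase per million mapped reads for one sample.

    counts:          dict of gene -> read count
    gene_lengths_bp: dict of gene -> gene length in base pairs
    """
    # Assumption: library size is the sum of counts over all genes.
    total_reads = sum(counts.values())
    if total_reads == 0:
        raise ValueError("sample has no counted reads")
    return {
        gene: count * 1e9 / (gene_lengths_bp[gene] * total_reads)
        for gene, count in counts.items()
    }


# Toy sample with two genes.
toy_counts = {"geneA": 500, "geneB": 1500}
toy_lengths = {"geneA": 2000, "geneB": 1000}
toy_rpkm = rpkm(toy_counts, toy_lengths)
```

The factor of 1e9 combines the per-kilobase (1e3) and per-million-reads (1e6) scalings.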
Reporting summary
Further information on research design is available in the [263]Nature
Portfolio Reporting Summary linked to this article.
Acknowledgements