Abstract
ARID1A and PI3-Kinase (PI3K) pathway alterations are common in
neoplasms originating from the uterine endometrium. Here we show that
monoallelic loss of ARID1A in the mouse endometrial epithelium is
sufficient for vaginal bleeding when combined with PI3K activation.
Sorted mutant epithelial cells display gene expression and promoter
chromatin signatures associated with epithelial-to-mesenchymal
transition (EMT). We further show that ARID1A is bound to promoters
with open chromatin, but ARID1A loss leads to increased promoter
chromatin accessibility and the expression of EMT genes. PI3K
activation partially rescues the mesenchymal phenotypes driven by
ARID1A loss through antagonism of ARID1A target gene expression,
resulting in partial EMT and invasion. We propose that ARID1A normally
maintains endometrial epithelial cell identity by repressing
mesenchymal cell fates, and that coexistent ARID1A and PI3K mutations
promote epithelial transdifferentiation and collective invasion.
Broadly, our findings support a role for collective epithelial invasion
in the spread of abnormal endometrial tissue.
Subject terms: Cancer models, Endometrial cancer, Endometrial cancer,
Cancer epigenetics
__________________________________________________________________
PIK3CA mutations and ARID1A loss co-exist in endometrial neoplasms.
Here, the authors show that these co-mutations drive gene expression
profiles correlated with differential chromatin accessibility and
ARID1A binding in the endometrial epithelium, resulting in partial EMT
and myometrial invasion.
Introduction
The endometrium is the dynamic inner layer of the uterus, composed of
stroma and epithelial cells that undergo monthly proliferation,
differentiation, and shedding throughout the menstrual cycle in
reproductive age women^[54]1. Disruption of normal endometrial
processes results in a number of pathologies, including endometrial
hyperplasia, endometrial cancer (EC)^[55]2, endometriosis^[56]3,
adenomyosis^[57]4, and endometriosis-associated ovarian cancer
(EAOC)^[58]5. An estimated 63,230 women will be diagnosed with EC this
year^[59]6, making it the most commonly diagnosed gynecologic
malignancy. Furthermore, EC incidence is rising due to the increasing
prevalence of obesity^[60]2,[61]7.
The SWI/SNF chromatin remodeling complex is mutated in >20% of all
human cancers^[62]8,[63]9, and the ARID1A (BAF250A) subunit is
particularly prone to mutation in gynecologic
cancer^[64]5,[65]10–[66]13. ARID1A mutations are found in 40% of
low-grade EC^[67]12, while ARID1A protein expression is lost in 26–29%
of low-grade and 39% of high-grade EC^[68]13. ARID1A loss is observed
in focal areas of atypical endometrial hyperplasia^[69]14, indicating
clonal loss. Loss of ARID1A in complex atypical hyperplasia is
associated with malignant transformation and concurrent EC^[70]15.
ARID1A mutations are observed in 11% of endometriosis and >30% of
EAOCs^[71]3,[72]5,[73]16,[74]17. These data support a tumor suppressor
role for ARID1A-containing SWI/SNF complexes in neoplasms originating
from the endometrium.
Among highly mutated tumor suppressor genes, ARID1A is unique because
ARID1A knockout mice are embryonic lethal in the heterozygous
state^[75]18, while other tumor suppressor genes (e.g., TP53) are
non-essential for mouse development^[76]19. ARID1A-null embryos die at
embryonic day (E) E6.5^[77]18, while DNA-binding defective
ARID1A^V1068G mutant embryos die around E10^[78]20. ARID1A mutations
are often nonsense and result in a frameshift of the open-reading
frame^[79]10, a characteristic of many tumor suppressors.
Mutations leading to PI3K/AKT pathway upregulation are frequent in
EC^[80]21, with 84% of patients displaying mutations in PIK3CA, PIK3R1,
or PTEN^[81]22. PIK3CA mutations commonly co-occur with ARID1A loss in
EC^[82]23. However, PIK3CA mutations have been observed in normal
endometrium^[83]17. Missense mutations of PIK3CA are common in complex
atypical hyperplasia, and PIK3CA mutation has been identified as an
early event in endometrial carcinogenesis^[84]24.
Genetically engineered mouse models (GEMMs) offer the opportunity to
study gynecologic pathologies in vivo^[85]25–[86]28. ARID1A loss in the
mouse ovarian surface epithelium drives tumorigenesis when paired with
PTEN loss or PIK3CA^H1047R mutation^[87]29,[88]30. In this study, we
utilize lactotransferrin-Cre (LtfCre) to target ARID1A mutations and
PIK3CA^H1047R directly to the endometrial epithelium. Utilizing the
Arid1a^fl and Arid1a^V1068G alleles, we develop an allelic series of
loss of function ARID1A mutations in the endometrium, each with
increasing severity. We employ genome-wide approaches to profile gene
expression and chromatin accessibility of sorted endometrial epithelial
cells in vivo and identified chromatin accessibility changes at
promoters upon ARID1A loss, which correlate with changes in
transcription. Using chromatin immunoprecipitation sequencing
(ChIP-seq), we show that ARID1A binding correlates with chromatin
accessibility and is associated with gene expression changes upon loss
of ARID1A. We utilize human endometrial epithelial cells to elucidate
the consequences of ARID1A loss and PIK3CA^H1047R in vitro, and
discover a mechanism by which ARID1A and PIK3CA mutations result in a
partial EMT phenotype capable of collective invasion into the uterine
myometrium. In this context, we characterize the role of ARID1A in
epithelial cell identity of the endometrium.
Results
ARID1A is haploinsufficient in the endometrial epithelium
ARID1A has been hypothesized to function as a haploinsufficient tumor
suppressor^[89]31. To explore this further, we utilized publicly
available Uterine Corpus Endometrial Carcinoma (UCEC) mutation and
copy-number datasets from The Cancer Genome Atlas (TCGA). Most
endometrioid EC patients with ARID1A mutations (either single or
multiple hits) show no detectable copy-number alterations at the ARID1A
locus, with 33% of all patients having a single nonsense mutation and
normal ploidy at ARID1A (Fig. [90]1a). Co-existing PIK3CA mutation was
significantly associated with ARID1A mutation, and a majority (61%) of
heterozygous ARID1A tumors also have PIK3CA alterations (Fig. [91]1a).
These data demonstrate that 20% of endometrioid EC patients are
genetically heterozygous for ARID1A mutations and carry PIK3CA
alterations.
Fig. 1.
[92]Fig. 1
[93]Open in a new tab
Development of genetic mouse models representing an allelic series of
ARID1A mutations in the endometral epithelium. a UCEC endometrioid
patient ARID1A alteration status and co-incidence with PIK3CA mutation,
taken from TCGA-UCEC dataset. b LacZ expression (blue) is specific to
the endometrial epithelium. Sections were counter-stained with nuclear
fast red (scale bar = 400 μm). c Diagram of mutant alleles utilized in
this study. d PCR genotyping results to detect LtfCre^0/+,
(Gt)R26Pik3ca^*H1047R, Arid1a^fl, and Arid1a^V1068G. e Representative
gross images of mice at time of sacrifice due to vaginal bleeding.
White arrows indicate tumors. Size of uterine tumor varies within
genotype at time of sacrifice. f Weight of semi-dry mouse uterus by
genotype. Control (N = 5), LtfCre^0/+; (Gt)R26Pik3ca^*H1047R;
Arid1a^fl/+ (N = 14), LtfCre^0/+; (Gt)R26Pik3ca^*H1047R;
Arid1a^fl/V1068G (N = 7), LtfCre^0/+; (Gt)R26Pik3ca^*H1047R;
Arid1a^fl/fl (N = 6) (mean ± s.d; * p < 0.05, unpaired t-test,
two-tailed). g Survival of mice, based on time until vaginal bleeding.
(Gt)R26Pik3ca^*H1047R (N = 5), Arid1a^fl/fl (N = 7),
(Gt)R26Pik3ca^*H1047R; Arid1a^fl/+ (N = 17), (Gt)R26Pik3ca^*H1047R;
Arid1a^fl/V1068G (N = 7), (Gt)R26Pik3ca^*H1047R; Arid1a^fl/fl (N = 12).
Mice succumb to vaginal bleeding (sample image inset) at a median
(μ[1/2]) of 16 weeks (LtfCre^0/+; (Gt)R26Pik3ca^*H1047R Arid1a^fl/fl)
or 14 weeks (LtfCre^0/+; (Gt)R26Pik3ca^*H1047R; Arid1a^fl/+, and
LtfCre^0/+; (Gt)R26Pik3ca^*H1047R; Arid1a^fl/V1068G), without a
significant difference between these genotypes. LtfCre^0/+ mice
harboring Arid1a^fl/fl or (Gt)R26Pik3ca^*H1047R alone did not develop
vaginal bleeding. h H&E staining and IHC for ARID1A, P-S6 and KRT8
(N ≥ 2) of the endometrium at 5 × (scale bar = 200 μm) and 20 × (scale
bar = 50 μm) magnification, with x20 magnifications representing
portion panel to the right surrounded by black box. ARID1A expression
is lost in the endometrial epithelium of LtfCre^0/+;
(Gt)R26Pik3ca^*H1047R; Arid1a^fl/fl mice. P-S6 is shown as marker of
AKT pathway activation; KRT8 as a marker of endometrial epithelium
arrows indicate endometrial epithelium
To induce CRE in the mouse endometrial epithelium, we utilized LtfCre
(Tg(Ltf-iCre)14Mmul). LtfCre induction occurs naturally as females
undergo sexual maturity, becoming fully active by 60 days^[94]32
(Fig. [95]1b). To investigate the consequence of ARID1A loss in the
endometrial epithelium, we bred LtfCre^0/+ mice to mice with an
Arid1a^fl allele, permitting conditional knockout of ARID1A upon CRE
expression (Fig. [96]1c)^[97]30. Genotyping by PCR confirmed expression
of each allele (Fig. [98]1d). We observed no gross phenotypes in
LtfCre^0/+; Arid1a^fl/fl mice (Supplementary Fig. [99]1a). Previously,
we found (Gt)R26Pik3ca^*H1047R to be a potent driver of epithelial
ovarian tumors when combined with Arid1a^fl/fl 30.
(Gt)R26Pik3ca^*H1047R provides conditional expression of the oncogenic
PIK3CA^H1047R mutation (Fig. [100]1c)^[101]33. No gross phenotypes were
observed in LtfCre^0/+; (Gt)R26Pik3ca^*H1047R (Supplementary
Fig. [102]1a), as previously described in the endometrial
epithelium^[103]34. Therefore, we bred LtfCre mice with mice harboring
(Gt)R26Pik3ca^*H1047R, Arid1a^fl, and Arid1a^V1068G (DNA-binding domain
defective ARID1A mutant, Fig. [104]1c)^[105]20 to develop an allelic
series with increasing ARID1A mutational burden in the endometrial
epithelium.
Abnormal vaginal bleeding is a prominent symptom of endometrial
dysfunction in humans. LtfCre^0/+; (Gt)R26Pik3ca^*H1047R; Arid1a^fl/fl
mice were sacrificed after a median age of 14 weeks due to vaginal
bleeding and uterine tumors (Fig. [106]1e, g). Surprisingly, homozygous
ARID1A loss was not required for vaginal bleeding, as LtfCre^0/+;
(Gt)R26Pik3ca^*H1047R; Arid1a^fl/+ mice developed endometrial lesions
and vaginal bleeding (Fig. [107]1e, g). For both LtfCre^0/+;
(Gt)R26Pik3ca^*H1047R; Arid1a^fl/+, and LtfCre^0/+;
(Gt)R26Pik3ca^*H1047R; Arid1a^fl/V1068G mice, median uterus weight, and
survival were not significantly different from LtfCre^0/+;
(Gt)R26Pik3ca^*H1047R; Arid1a^fl/fl (Fig. [108]1f, g). ARID1A loss and
PI3K pathway activation (via phospho-S6 ribosomal protein, P-S6,
expression) were determined by immunohistochemistry, while Cytokeratin
8 (KRT8) labeled the endometrial epithelium (Fig. [109]1h and
Supplementary Fig. [110]1b). LtfCre^0/+; (Gt)R26Pik3ca^*H1047R;
Arid1a^fl/+, LtfCre^0/+; (Gt)R26Pik3ca^*H1047R; Arid1a^fl/V1068G, and
LtfCre^0/+; (Gt)R26Pik3ca^*H1047R; Arid1a^fl/fl showed evidence of
widespread atypical endometrial hyperplasia and nuclear atypia,
including glandular crowding and abnormal cytologic features
(Fig. [111]1h and Supplementary Fig. [112]1b). Endometrial tumors were
moderately to poorly differentiated, with areas of solid and cribiform
architecture (Fig. [113]1h). In one mouse, we observed visible lung
metastasis (Supplementary Fig. [114]1c), a site of metastasis in some
EC patients. In the LtfCre^0/+; (Gt)R26Pik3ca^*H1047R; Arid1a^fl/fl
endometrial epithelium we observed downregulation of estrogen
receptor-α (ESR1) and loss of the progesterone receptor, suggesting
changes to steroid hormone regulation (Supplementary Fig. [115]2a).
Impaired steroid hormone regulation indicates poor prognosis in
EC^[116]35.
Mutant endometrial epithelium show hallmarks of EMT
To profile in vivo gene expression changes in mutant endometrial
epithelium at an early stage of transformation, we devised an enzymatic
digestion and magnetic isolation protocol to positively enrich
epithelial populations (Fig. [117]2a). Endometrial epithelial cells
express EPCAM (Fig. [118]2b), and EPCAM expression is not altered in
the hyperplastic endometrium of LtfCre^0/+; (Gt)R26Pik3ca^*H1047R;
Arid1a^fl/fl mice (Fig. [119]2c). Following positive selection, we
analyzed purified populations by flow cytometry and observed no
significant difference in purity between genotypes (Supplementary
Fig. [120]3a, b). We isolated RNA from control and LtfCre^0/+;
(Gt)R26Pik3ca^*H1047R; Arid1a^fl/fl mice. Purified LtfCre^0/+;
(Gt)R26Pik3ca^*H1047R; Arid1a^fl/fl cells showed significantly reduced
ARID1A messenger RNA (mRNA) expression (Fig. [121]2d). These samples
were processed for RNA-seq, from which we observed 3481 differentially
expressed genes between control and LtfCre^0/+; (Gt)R26Pik3ca^*H1047R;
Arid1a^fl/fl (FDR < 0.05) (Supplementary Fig. [122]3c). Using stringent
criteria (FDR < 10^−5, twofold change), we identified a gene signature
of 517 differentially expressed genes (Supplementary Fig. [123]3d). We
found overlap between LtfCre^0/+; (Gt)R26Pik3ca^*H1047R; Arid1a^fl/fl,
LtfCre^0/+; (Gt)R26Pik3ca^*H1047R; Arid1a^fl/+ and LtfCre^0/+;
(Gt)R26Pik3ca^*H1047R; Arid1a^fl/V1068G, including 963 genes
differentially expressed in all genotypes relative to control
(Supplementary Fig. [124]3e–g).
Fig. 2.
[125]Fig. 2
[126]Open in a new tab
RNA-seq analysis of EPCAM-positive endometrial epithelial cells
isolated via magnetic sorting. a Schematic of EPCAM isolation using
anti-EPCAM-PE antibody and anti-PE microbeads. b EPCAM is expressed in
the endometrial epithelium of a LtfCre^0/+; (Gt)R26Pik3ca^*H1047R;
Arid1a^fl/fl mouse by IHC (N = 3). Arrows indicate endometrial
epithelium (scale bar = 100 μm). c IF staining of EPCAM and ARID1A in
mouse endometrium (N ≥ 3). Arrows indicate endometrial epithelium
(scale bar = 25 μm). d qPCR analysis of Arid1a gene expression of
isolated control (N = 3, pooled groups of six mice) and LtfCre^0/+;
(Gt)R26Pik3ca^*H1047R; Arid1a^fl/fl (mutant) (N = 4, single mice) cells
(mean ± s.d; **p < 0.01, unpaired t-test, two-tailed). e, f Pathway
enrichment analysis on human orthologs of differentially expressed
genes between LtfCre^0/+; (Gt)R26Pik3ca^*H1047R; Arid1a^fl/fl, and
control mice (FDR < 0.05; 3481 genes) for mSigDb Hallmark pathways (e)
and Gene Ontology (GO) Biological Process terms (f). g GSEA plots
showing significance of Mak et al. pan-cancer EMT signature
upregulation within LtfCre^0/+; (Gt)R26Pik3ca^*H1047R; Arid1a^fl/fl
compared to control and UCEC ARID1A^mut patients compared to ARID1A^wt.
h Hierarchical clustering of 77 genes within the Mak et al. pan-cancer
EMT signature between control and mutant purified endometrium. Genes
found in the Hallmark EMT pathway, and CDH1, are identified
We performed Gene Set Enrichment Analyses (GSEA) on differentially
expressed genes (FDR < 0.05) in LtfCre^0/+; (Gt)R26Pik3ca^*H1047R;
Arid1a^fl/fl endometrial epithelial cells and identified EMT as the top
dysregulated pathway using hallmark pathway enrichment (Fig. [127]2e).
Mesenchymal-marker overexpression in EC correlates with poor
prognosis^[128]36, which is consistent with several Gene Ontology (GO)
pathways related to cell motility, migration and adhesion that were
identified (Fig. [129]2f), further suggesting EMT as a key dysregulated
pathway in the LtfCre^0/+; (Gt)R26Pik3ca^*H1047R; Arid1a^fl/fl
endometrial epithelium. Recently, Mak et al.^[130]37 identified a
patient-derived EMT signature of 77 genes across 11 cancer types. This
gene signature was significantly enriched by GSEA in LtfCre^0/+;
(Gt)R26Pik3ca^*H1047R; Arid1a^fl/fl vs. control, and in ARID1A mutant
UCEC patients vs. ARID1A wild-type patients (NES = 1.72 and 1.88,
respectively) (Fig. [131]2g), and contained 33 genes that were
differentially expressed in mutant mouse endometrial cells
(Fig. [132]2h).
EMT is characterized by the loss of cell adherens junctions, tight
junctions and apical-basal polarity^[133]38. In LtfCre^0/+;
Arid1a^fl/fl mice, we observed reduced CLDN10 and tight junction
protein-1 (ZO-1) expression by immunofluorescence (IF), while
expression of ICAM-1 was induced, indicating impaired tight junctions
(Supplementary Fig. [134]4a–d). ZO-1 expression was partially restored
in LtfCre^0/+; (Gt)R26Pik3ca^*H1047R; Arid1a^fl/fl (Supplementary
Fig. [135]4a). LtfCre^0/+; Arid1a^fl/fl endometrium has high expression
of Cleaved Caspase-3 (CASP3), indicating increased apoptosis in the
absence of PIK3CA^H1047R (Supplementary Fig. [136]4e). Expression of
mesenchymal-marker VIM (Vimentin) and EMT transcription factor SNAI2
(Slug) were observed in both LtfCre^0/+; Arid1a^fl/fl and LtfCre^0/+;
(Gt)R26Pik3ca^*H1047R; Arid1a^fl/fl mutant endometrial epithelium,
indicating a shift towards a mesenchymal phenotype (Supplementary
Fig. [137]4f, g). CDH1 (E-Cadherin) mislocalization was observed in
mutant endometrial epithelium, suggesting alterations in epithelial
adherens junctions (Supplementary Fig. [138]4h). These data suggest
that the EMT phenotype observed in LtfCre^0/+; (Gt)R26Pik3ca^*H1047R;
Arid1a^fl/fl endometrial epithelium are driven primarily by ARID1A
loss.
Mouse gene signature identifies invasive patient population
We next wanted to determine if LtfCre^0/+; (Gt)R26Pik3ca^*H1047R;
Arid1a^fl/fl gene expression patterns resembled human disease. We
utilized mutation and RNA-seq expression data from the TCGA-UCEC
dataset with single-sample GSEA (ssGSEA) to rank UCEC patient
endometrioid tumors with gene expression patterns similar to our mouse
model. We segregated the upper (similar to mouse) and lower (dissimilar
to mouse) quartiles of patients based on human orthologs of our gene
signature (Fig. [139]3a). Upper quartile UCEC patients display
concordant expression changes for 74% of genes within the LtfCre^0/+;
(Gt)R26Pik3ca^*H1047R; Arid1a^fl/fl gene signature relative to lower
quartile patients (Fig. [140]3b). Upper quartile patients show
upregulation of EMT, Interferon gamma (IFNγ), Notch and P53 signaling
pathways, and downregulation of the unfolded protein response (UPR)
(Fig. [141]3c). We confirmed downregulation of GRP94 and GRP78, two
proteins critical to the UPR, in the LtfCre^0/+; (Gt)R26Pik3ca^*H1047R;
Arid1a^fl/fl endometrial epithelium in vivo by IHC and IF
(Supplementary Fig. [142]2b, c). When comparing ARID1A mutant and
wild-type UCEC patients, we also identified upregulation of the EMT
pathway (Fig. [143]3d).
Fig. 3.
[144]Fig. 3
[145]Open in a new tab
LtfCre^0/+; (Gt)R26Pik3ca^*H1047R; Arid1a^fl/fl gene signature
correlates with invasive patient gene expression. a Distribution of
TCGA-UCEC endometrioid patient tumors relative to ssGSEA score for
human orthologs of LtfCre^0/+; (Gt)R26Pik3ca^*H1047R; Arid1a^fl/fl gene
signature. b Clustered comparison of scaled fold-change values for
signature genes between LtfCre^0/+; (Gt)R26Pik3ca^*H1047R; Arid1a^fl/fl
vs. control mice and upper vs. lower quartile of UCEC endometrioid
patients. EMT genes from Hallmark pathway and Mak and Tong pan-cancer
gene signature are identified. c Scatter plot of Hallmark pathway GSEA
Normalized Enrichment Scores (NES) from LtfCre^0/+;
(Gt)R26Pik3ca^*H1047R; Arid1a^fl/fl vs. control (human orthologs) and
upper quartile of UCEC endometrioid patients vs. lower quartile. d
Scatter plot of Hallmark pathway GSEA NES from upper quartile of UCEC
endometrioid patients vs. lower quartile and UCEC endometrioid
ARID1A^mut (frameshift/truncating alterations) vs. ARID1A^wt. e Upper
quartile ssGSEA-enriched UCEC endometrioid patients present with higher
stage disease relative to all patients (p < 0.01, Chi-squared). f Upper
quartile ssGSEA-enriched UCEC endometrioid patients have more invasive
tumors relative to lower quartile patients (p < 0.05, unpaired
Mann–Whitney U, one-tailed). Box-and-whiskers plotted in the style of
Tukey without outliers
Clinical staging of endometrial cancer is determined by invasion into
surrounding tissue, including the myometrium, cervix, vagina, bladder,
and distant metastasis^[146]2. Upper quartile patients were diagnosed
with advanced clinical stage relative to all UCEC patients, with
significantly more stage III and stage IV patients (p < 0.01,
Chi-squared) (Fig. [147]3e). Furthermore, upper quartile patients had
significantly more invasion than lower quartile patients (p < 0.05,
unpaired Mann–Whitney U, two-tailed) (Fig. [148]3f). These data suggest
that endometrial cells from LtfCre^0/+; (Gt)R26Pik3ca^*H1047R;
Arid1a^fl/fl mice are representative of UCEC patients with advanced
stage, invasive tumors.
ARID1A loss increases promoter accessibility in vivo
To gain insight into chromatin accessibility alterations that may drive
the observed gene expression changes, we performed ATAC-seq (Assay for
Transposase-Accessible Chromatin)^[149]39 on anti-EPCAM-purified cells.
In general, the peaks were broader in LtfCre^0/+;
(Gt)R26Pik3ca^*H1047R; Arid1a^fl/fl cells compared to cells from
control mice (p < 10^−15, unpaired Mann–Whitney U, two-tailed),
potentially indicating greater chromatin accessibility in mutant cells
(Fig. [150]4a, b). Among differentially accessible peaks (FDR < 0.20),
2053 showed decreased accessibility in LtfCre^0/+;
(Gt)R26Pik3ca^*H1047R; Arid1a^fl/fl mice, while 1429 showed increased
accessibility, suggesting a global trend toward decreasing
accessibility (Fig. [151]4c). Primarily, differentially accessible
peaks represented mononucleosome fragments (Fig. [152]4d). Despite the
trend toward decreased accessibility, among promoters (defined as
regions ±3 kb to transcription start sites or TSS) we observed
significant increases in accessibility (p < 10^−72, Chi-squared)
(Fig. [153]4e), with 470 promoter peaks increasing in accessibility and
179 decreasing (Fig. [154]4f). Genomic repeat elements trended toward
decreased accessibility (80% decreasing), accounting for a global trend
toward decreasing accessibility (Fig. [155]4f). Among peaks with
increased accessibility, CpG islands, promoters and 5′ UTR were the top
enriched genomic features (Fig. [156]4g). Differentially accessible
peaks, including promoter peaks, were generally located proximal to
TSS, with 31.2% of all peaks located within 10 kb of a TSS
(Fig. [157]4h, i). We also performed ATAC-seq on EPCAM-purified cells
from LtfCre^0/+; (Gt)R26Pik3ca^*H1047R; Arid1a^fl/+ and LtfCre^0/+;
(Gt)R26Pik3ca^*H1047R; Arid1a^fl/V1068G, and observed enrichment for
differential accessibility among promoters (p < 10^−500) (Supplementary
Fig. [158]3h–p).
Fig. 4.
[159]Fig. 4
[160]Open in a new tab
ATAC-seq analysis of differentially accessible chromatin in LtfCre^0/+;
(Gt)R26Pik3ca^*H1047R; Arid1a^fl/fl endometrial epithelium. a ATAC-seq
read density heatmap from naive overlapping peaks of control and
LtfCre^0/+; (Gt)R26Pik3ca^*H1047R; Arid1a^fl/fl EPCAM-positive cells,
ranked by total intensity. Reads are centered on the middle of the
accessible peak ±3 kb. Control (N = 2, pooled groups of six mice) and
LtfCre^0/+; (Gt)R26Pik3ca^*H1047R; Arid1a^fl/fl (N = 2, single mice). b
Peak width distributions of control and LtfCre^0/+;
(Gt)R26Pik3ca^*H1047R; Arid1a^fl/fl ATAC-seq peaks, which are
significantly different (p < 10^−15, unpaired Mann–Whitney U,
two-tailed). c Volcano plot for differential accessibility of ATAC-seq
peaks between control and LtfCre^0/+; (Gt)R26Pik3ca^*H1047R;
Arid1a^fl/fl cells. Red points represent significant peaks
(FDR < 0.20). d Peak width distribution of differentially accessible
peaks. e Magnitude distribution of differentially accessible peaks
separated by total peaks (gray) and promoter peaks (red, within 3 kb of
TSS). f Detailed peak annotation of increasing and decreasing
differentially accessible regions for total, non-repetitive and
repetitive peaks based on genome annotation. g Enrichment for
significant genomic features among differentially accessible peaks,
ranked by p-value. Enrichment ratio is calculated by bp of feature in
ATAC peak set compared to background genome. h Histogram of all
differential ATAC peaks depicting distance to nearest TSS. Percent of
peaks found within + /−10, 30, or 100 kb of the TSS are shown. i
Histogram of differential ATAC promoter peaks depicting distance to
nearest TSS. j mSigDb Hallmark pathway enrichment of genes with
differentially accessible promoter peaks. k Differentially accessible
promoter peak clustering based on direction and magnitude of change in
gene expression and promoter accessibility. Black bars indicate
significant differential gene expression by RNA-seq (FDR < 0.05). l
Scatter plot depicting the relationship between direction and magnitude
of change in accessibility and gene expression for differential
promoter peaks. Accessibility and expression were significantly
correlated (r[s] = 0.26, p < 10^−9, Spearman). m mSigDb Hallmark
pathway enrichment of overlapping differentially accessible promoters
and differentially expressed genes
Among genes with differentially accessible promoter peaks, EMT appeared
as the top enriched pathway (Fig. [161]4j). We identified significant
overlap between differentially accessible promoters and differentially
expressed genes (p < 10^−8, hypergeometric enrichment) (Fig. [162]4k).
Chromatin accessibility was positively correlated with gene expression
(p < 10^−9, Spearman) (Fig. [163]4l). Among these genes, EMT again
appeared as a top affected pathway by enrichment analysis
(Fig. [164]4m). Altogether, these data demonstrate that endometrial
ARID1A loss and PI3K activation results in increased accessibility at
gene promoters and differential accessibility of EMT pathway genes.
ARID1A functionally binds gene promoters
To explore the role of ARID1A loss alone in the regulation of
endometrial epithelial chromatin accessibility, we utilized an
immortalized human endometrial epithelial cell line, 12Z^[165]40.
Transfection of 12Z cells with short-interfering RNAs (siRNAs)
targeting ARID1A (siARID1A) reduced ARID1A protein expression relative
to cells transfected with non-targeting control (siNONtg)
(Fig. [166]5a). Next, we performed ATAC-seq on siARID1A transfected 12Z
cells (Supplementary Fig. [167]5a–d). ARID1A loss led to a trend toward
decreasing chromatin accessibility genome-wide, while chromatin
accessibility was significantly increased at promoters (p < 10^−500,
Chi-squared) (Fig. [168]5b). These results recapitulate our findings in
vivo, suggesting differential chromatin changes in vivo are driven by
ARID1A loss alone.
Fig. 5.
[169]Fig. 5
[170]Open in a new tab
ARID1A binding is associated with accessibility and differential gene
expression driven by ARID1A loss in human endometrial epithelial cell
line. a Western blot of ARID1A expression in siRNA-treated 12Z cells.
β-Actin was used as endogenous control. b Annotation of differentially
accessible ATAC peaks (FDR < 0.05) from 12Z siARID1A, separated into
fractions by directionality and promoter vs. non-promoter. Significant
association (p < 10^−500, Chi-squared) between increasing accessibility
and promoter status. c Annotation of ARID1A ChIP peaks in wild-type 12Z
cells. d Peak width distribution of ChIP peaks. e Histogram of all ChIP
peaks depicting distance to nearest TSS. Percent of peaks found
within + /−10, 30, or 100 kb of the TSS are shown. f Histogram of ChIP
promoter peaks depicting distance to nearest TSS. g Enrichment for
significant genomic features among ChIP peaks, ranked by p-value.
Enrichment ratio is calculated by bp of feature in ChIP peak set
compared to background genome. h de novo Motif enrichment of ChIP peaks
genome-wide and at promoters. i mSigDb Hallmark pathway enrichment of
genes with ChIP promoter peaks. j Read density heatmap of ARID1A
ChIP-seq and ATAC-seq (control) at all gene promoters (N = 24,132),
ranked by signal intensity for ARID1A ChIP-seq. k Scatter plot
depicting correlation between ARID1A binding and chromatin
accessibility (r[s] = 0.312, p < 10^–15, Spearman). l Proportional
Euler diagram of overlap between ARID1A binding, decreasing and
increasing chromatin accessibility at promoters. m Enrichment for
LtfCre^0/+; (Gt)R26Pik3ca^*H1047R; Arid1a^fl/fl gene signature among
12Z siARID1A differentially expressed genes (p < 10^−5, hypergeometric
enrichment). n Enrichment of ARID1A binding at 12Z siARID1A
differentially expressed genes (p < 10^−208, hypergeometric
enrichment). o Fold-change in gene expression of siARID1A upregulated
genes, segregated based on ARID1A promoter-binding status (p = 0.002,
unpaired Mann–Whitney U, two-tailed). Box-and-whiskers plotted in the
style of Tukey without outliers. p mSigDb Hallmark pathway enrichment
of 12Z siARID1A differentially expressed genes (FDR < 0.0001). q
Example browser tracks for ARID1A binding profile. Signal is displayed
as log likelihood ratio (logLR). Single replicate signal is represented
in light green, overlapping signal is represented in dark green. Green
bars represent peaks called
In order to profile sites of genome-wide ARID1A occupancy, we performed
ARID1A ChIP-seq in 12Z cells. The specificity of the ARID1A ChIP-seq
antibody used was validated by co-immunoprecipitation (co-IP) and mass
spectrometry (Supplementary Fig. [171]5e, f). We identified 46,180
unique sites of ARID1A genome-wide occupancy (Fig. [172]5c). The
majority of ARID1A ChIP-seq peaks were less than 1000 bp in width
(Fig. [173]5d) and generally were proximal to TSS, with roughly
one-quarter of all peaks being within 10 kb of the TSS (Fig. [174]5e,
f). ARID1A binding was significantly enriched at promoters
(Fig. [175]5g). Among ARID1A-bound sites, we observed an enrichment of
the AP-1 motif, both genome-wide (p < 10^−8170) and at promoters
(p < 10^−800). ARID1A has been shown to regulate chromatin
accessibility at AP-1 motifs^[176]41,[177]42, and we also observed an
enrichment for the AP-1 motif at sites of differential accessibility in
vivo and in vitro (Supplementary Fig. [178]5g), suggesting ARID1A
regulation of chromatin at AP-1 sites.
ARID1A-bound promoters were enriched for EMT hallmark genes
(Fig. [179]5i). We also observed significant overlap between ARID1A
binding and sites of accessible chromatin, which were positively
correlated (p < 10^−15, Spearman) (Fig. [180]5j, k). Among
differentially accessible promoters, ARID1A was bound to 354 promoters,
which increased in accessibility, and 124 promoters, which decreased in
accessibility upon ARID1A loss (Fig. [181]5l).
To further explore the relationship between ARID1A binding and gene
expression, we performed RNA-seq on siNONtg and siARID1A treated 12Z
cells. Differentially expressed genes (FDR < 0.0001) were significantly
enriched for the LtfCre^0/+; (Gt)R26Pik3ca^*H1047R; Arid1a^fl/fl gene
signature (p < 10^−5, hypergeometric enrichment) (Fig. [182]5m). ARID1A
promoter binding was significantly enriched in differentially expressed
genes with ARID1A knockdown (p < 10^−208, hypergeometric enrichment)
(Fig. [183]5n). While ARID1A promoter binding was evenly distributed
among upregulated and downregulated genes (Supplementary Fig. [184]5h),
we observed a higher degree of gene upregulation following ARID1A loss
among genes with ARID1A binding at the promoter (p = 0.002, unpaired
Mann–Whitney U, two-tailed) (Fig. [185]5o). ARID1A bound, upregulated
genes are enriched for EMT pathways (Fig. [186]5p). ARID1A binding is
observed in the promoters of mesenchymal identity genes (Fig. [187]5q).
These data support a mechanistic role for ARID1A in the suppression of
mesenchymal gene transcription.
ARID1A loss promotes mesenchymal phenotype
To further interrogate the relationship between ARID1A and PIK3CA in
the regulation of the EMT pathway, we again utilized the 12Z cell line.
EMT is regulated by several transcription factors, including SNAI1
(Snail), SNAI2 and TWIST1 (Twist)^[188]38. Upon ARID1A knockdown by
siRNA (siARID1A), we observed upregulation of SNAI1, SNAI2, and TWIST1
protein expression (Fig. [189]6a). Transfection with PIK3CA^H1047R
expression plasmid (pPIK3CA^H1047R) led to AKT/mTOR pathway activation,
as indicated by phosphorylation of AKT at serine 473 (P-AKT Ser473)
(Fig. [190]6a). In cells transfected with both siARID1A and
pPIK3CA^H1047R, we observed decreased induction TWIST1 (Fig. [191]6a).
Expression of SNAI1 and SNAI2 was not affected by pPIK3CA^H1047R
(Fig. [192]6a). Moreover, pPIK3CA^H1047R induced CDH1 expression
(Fig. [193]6a) and partially rescued the CDH1 downregulation observed
in cells transfected with only siARID1A.
Fig. 6.
[194]Fig. 6
[195]Open in a new tab
PIK3CA^H1047R antagonizes ARID1A loss-induced mesenchymal phenotypes. a
Western blot of ARID1A, β-Actin, AKT, P-AKT, CDH1, SNAI1, SNAI2, and
TWIST1 following co-transfection of siNONtg and empty vector (control),
siARID1A and empty vector (siARID1A), siNONtg and pPIK3CA^H1047R
(PIK3CA^H1047R), or siARID1A and pPIK3CA^H1047R
(siARID1A/PIK3CA^H1047R). b Proportional Euler diagram displaying
differentially expressed genes (FDR < 0.0001) from siARID1A,
PIK3CA^H1047R, and siARID1A/PIK3CA^H1047R relative to control. c mSigDb
Hallmark pathway enrichment for siARID1A, PIK3CA^H1047R, and
siARID1A/PIK3CA^H1047R differentially expressed genes. d Enrichment for
LtfCre^0/+; (Gt)R26Pik3ca^*H1047R; Arid1a^fl/fl mouse signature
ortholog genes and Mak et al. pan-cancer gene signature within
differentially expressed genes from siARID1A, PIK3CA^H1047R, and
siARID1A/PIK3CA^H1047R relative to control. e, f Fold-change values of
experimental groups relative to control for genes in the Mak and Tong
pan-cancer EMT signature (e) and the Hallmark EMT signature (f),
separated based on direction of gene expression change in siARID1A.
Statistic represented is paired Mann–Whitney U (two-tailed).
Box-and-whiskers plotted in the style of Tukey without outliers. g
Intersection between siARID1A differentially expressed genes relative
to control and siARID1A/PIK3CA^H1047R relative to siARID1A. h Heat map
detailing relative expression of intersecting genes (N = 127) (Fig. 6g)
in control, siARID1A, PIK3CA^H1047R, and siARID1A/PIK3CA^H1047R, and
ARID1A promoter binding. These genes were enriched for ARID1A promoter
binding (p < 10^−18, hypergeometric enrichment). i Expression level of
intersect genes (Fig. 6g) in siARID1A, PIK3CA^H1047R, and
siARID1A/PIK3CA^H1047R relative to control. Statistic represented is
paired Mann–Whitney U (two-tailed). Box-and-whiskers plotted in the
style of Tukey without outliers. j mSigDb Hallmark pathway enrichment
for intersecting genes (N = 127) (Fig. 6g). k Changes in relative EMT
gene expression upon ARID1A loss and PIK3CA^H1047R overexpression as
measured by qRT-PCR. Data represents three biological replicates
We next performed RNA-seq on cells transfected with siARID1A,
pPIK3CA^H1047R, or both. We found that while ARID1A loss resulted in
differential gene expression of 2565 genes, PIK3CA^H1047R expression
resulted in differential gene expression of only 233 genes
(FDR < 0.0001) (Fig. [196]6b). Some genes differentially expressed by
PIK3CA^H1047R overlapped with siARID1A and siARID1A/PIK3CA^H1047R
samples, displaying unique patterns of gene expression (Supplementary
Fig. [197]6). Among Hallmark pathways, we observed siARID1A and
PIK3CA^H1047R convergence on the NFκB pathway, as previously described
in ovarian clear cell carcinoma^[198]43, and the EMT pathway
(Fig. [199]6c). Differentially expressed genes from siARID1A,
PIK3CA^H1047R, and siARID1A/PIK3CA^H1047R samples compared to controls
were enriched for the LtfCre^0/+; (Gt)R26Pik3ca^*H1047R; Arid1a^fl/fl
gene signature and the Mak and Tong pan-cancer EMT gene signature
(Fig. [200]6d). For genes found in the Mak and Tong signature and the
hallmark EMT pathway, we identified an antagonistic relationship
between siARID1A and PIK3CA^H1047R, such that gene expression changes
observed in siARID1A samples were reduced in siARID1A/PIK3CA^H1047R
samples (Fig. [201]6e, f).
To further explore the antagonistic relationship between ARID1A loss
and PIK3CA^H1047R, we identified a unique group of genes at the
intersection between differentially expressed genes in siARID1A
relative to control and siARID1A/PIK3CA^H1047R relative to siARID1A
(Fig. [202]6g). These 127 genes represent genes, which are
differentially expressed by siARID1A, and further altered by the
addition of PIK3CA^H1047R. Of these genes, 47.2% were bound by ARID1A
at the promoter in wild-type 12Z cells (p < 10^−18, hypergeometric
enrichment) (Fig. [203]6h). We observed significant upregulation of
these genes in siARID1A samples, and downregulation in siARID1A/
PIK3CA^H1047R (Fig. [204]6h, i). These genes were enriched for the
hallmark EMT pathway, which was the most significant result
(Fig. [205]6j). The differential gene expression of EMT genes upon
ARID1A loss was confirmed by quantitative reverse transcriptase
(qRT)-PCR (Fig. [206]6k). These data provide further evidence that
ARID1A loss induces a mesenchymal phenotype, which is antagonized by
the PIK3CA^H1047R mutation, resulting in a partial EMT phenotype.
ARID1A loss and PIK3CA^H1047R promote invasive phenotypes
Partial EMT is associated with invasive phenotypes^[207]38, and EMT
pathways play key roles in EC disease progression by promoting the
invasion of epithelial cells into the myometrium^[208]44. To
distinguish between the effect of ARID1A loss or PIK3CA^H1047R on
invasive phenotypes, we co-transfected 12Z cells with a PIK3CA^H1047R
expression plasmid and lentivirus expressing ARID1A short-hairpin RNAs
(shRNAs) (shARID1A) (Fig. [209]7a). ARID1A knockdown induced migratory
and invasive phenotypes in 12Z cells, and co-transfection with
pPIK3CA^H1047R significantly enhanced migration and invasion
(Fig. [210]7b, c). Cells treated with shARID1A displayed increased
expression of F-actin (Fig. [211]7c). These results suggest that the
co-mutation of ARID1A and PIK3CA in the endometrial epithelium promotes
an invasive phenotype.
Fig. 7.
[212]Fig. 7
[213]Open in a new tab
ARID1A loss and PIK3CA^H1047R promote myometrial invasion in vivo and
migration in vitro. a Western blot of ARID1A, β-Actin, AKT, P-AKT,
following co-transfection of shNONtg and empty vector (control),
shARID1A and empty vector (shARID1A), shNONtg and pPIK3CA^H1047R
(PIK3CA^H1047R) or shARID1A and pPIK3CA^H1047R
(shARID1A/PIK3CA^H1047R). b Invasion assay of 12Z cells with ARID1A
loss and PIK3CA^H1047R overexpression. Representative images of calcein
AM-stained cells are and total invaded cell counts are shown (scale
bar = 500 μm). Data represents four biological replicates (mean ± s.d;
*p < 0.05, **p < 0.01, ****p < 0.0001, unpaired t-test, two-tailed). c
Migration assay of 12Z cells with ARID1A loss and PIK3CA^H1047R
overexpression. Upper images are representative of cells 24 h following
removal of insert (scale bar = 500 μm). Lower images are maximum
intensity confocal projections of cells stained with fluorescent
phalloidin to label with F-actin (scale bar = 50 μm). Average Migration
represents the average difference distance across each migration front
from 0 to 24 h. Migrating cell counts represent number of cells in
migration area after 24 h. Data represents three biological replicates
(mean ± s.d; *p < 0.05, **p < 0.01, ***p < 0.001, ****p < 0.0001,
unpaired t-test, two-tailed). d Myometrial invasion observed in
LtfCre^0/+; Arid1a^fl/fl, and LtfCre^0/+; (Gt)R26Pik3ca^*H1047R;
Arid1a^fl/fl. H&E staining and IHC for KRT8 at 3.33–6.66 × (scale
bar = 300–600 μm, as stated on figure) and x20 (scale bar = 100 μm)
magnification, with x20 magnifications representing portion panel to
the right surrounded by yellow box. White arrows indicate invasive
endometrial epithelium. Endo, endometrium; Myo, myometrium. e Images of
maximum intensity confocal projections of control and LtfCre^0/+;
(Gt)R26Pik3ca^*H1047R; Arid1a^fl/fl endometrium sections stained with
α-smooth muscle actin (α-SMA) (red), KRT8 (green) and counter-stained
with DAPI (blue) (N ≥ 3). White arrows indicate invasive endometrial
epithelium (scale bar = 50 or 10 μm, as stated on figure). f Diagram
representation of EMT-induced invasive endometrial epithelium following
ARID1A loss and PIK3CA^H1047R mutation
In vivo, we observed a requirement for both ARID1A loss and PI3K
activation for invasive phenotypes. In LtfCre^0/+;
(Gt)R26Pik3ca^*H1047R; Arid1a^fl/+ and LtfCre^0/+;
(Gt)R26Pik3ca^*H1047R; Arid1a^fl/fl mice, we observed invasion of
endometrial epithelium into the myometrium (Fig. [214]7d).
KRT8-positive epithelial cells migrated outside of the endometrium,
invading α-smooth muscle actin (α-SMA)-positive myometrial cells and
formed tumors (Fig. [215]7e). Invading epithelial cells contained a
narrow leading edge and strand-like morphology, suggesting a collective
migration of cells^[216]45. Some invasive sites formed
well-differentiated adenomas (Fig. [217]7d), while others were poorly
differentiated clusters of tumor cells (Fig. [218]7d, e). Invasive
KRT8-positive epithelial glands were observed in direct contact with
myometrial cells, often appearing as strands of epithelial cells
trailing through the myometrial layers (Fig. [219]7d). These results
suggest that ARID1A loss and PIK3CA^H1047R expression in the
endometrial epithelium results in a partial EMT phenotype, promoting
lesion formation and myometrial invasion (Fig. [220]7f).
Discussion
In this study, we found that ARID1A functions as a haploinsufficient
tumor suppressor in the endometrial epithelium. LtfCre^0/+;
(Gt)R26Pik3ca^*H1047R; Arid1a^fl/+ is sufficient to drive tumorigenesis
and is nearly identical to LtfCre^0/+; (Gt)R26Pik3ca^*H1047R;
Arid1a^fl/fl with respect to tumor burden, survival, gene expression,
and chromatin accessibility changes. This is consistent with the
spectrum of single-hit ARID1A mutations observed in EC, in which a
majority of patients have only a single ARID1A nonsense mutation.
Previous studies suggested ARID1A functions as a haploinsufficient
tumor suppressor^[221]31 in ovarian^[222]10,[223]11, breast^[224]46,
gastric^[225]47, and liver cancer^[226]48. ARID1A expression or
mutation may not predict disease status, as single-hit mutations or
epigenetic silencing may be sufficient for ARID1A-dependent changes in
gene expression or transformation. Additionally, heterozygous loss of
ARID1A may promote metastasis at late stages of the tumor progression,
as observed in liver cancer^[227]49. ARID1A levels may be regulated
throughout the menstrual cycle and mediate dissociation of decidua from
the uterus. In this case, ARID1A heterozygosity may suffice for
oncogenesis during points of low ARID1A expression, which may account
for the ARID1A genetic differences observed between the present mouse
model and epithelial ovarian cancer models^[228]29,[229]30. This would
explain the high ARID1A mutation rates in EC.
Previously, Raab et al.^[230]50 identified ARID1A binding
preferentially at promoters in HepG2 liver cancer cells. In the present
study, we show ARID1A enrichment at promoters, which was significantly
correlated with chromatin accessibility. We observed increased
accessibility at promoters upon ARID1A loss in human endometrial
epithelial cells and, in vivo, in sorted mouse endometrial epithelial
cells. Among direct ARID1A target genes, we also observed significant
correlations between increasing promoter accessibility and increasing
transcription of mesenchymal genes upon ARID1A loss. In addition, we
observed greater activation of ARID1A target genes following ARID1A
loss, as compared to ARID1A non-target genes. These data suggest
ARID1A-containing SWI/SNF complexes maintain endometrial epithelial
cell identity by repressing genes required for transdifferentiation of
epithelial cells into mesenchyme. ARID1A may promote endometrial
plasticity by limiting the differentiation capacity of the epithelial
cells. Repressive nucleosome positioning by ARID1A-containing SWI/SNF
complexes may provide a barrier to transcriptional activation, as has
been observed at the HIV LTR^[231]51.
The data presented here demonstrate a cell-autonomous role for ARID1A
in the preservation of endometrial epithelial cell identity and EMT
regulation. In addition, we show LtfCre^0/+; (Gt)R26Pik3ca^*H1047R;
Arid1a^fl/fl cells gain VIM and ICAM-1 and invade the myometrium, but
retain CDH1, EPCAM and KRT8 expression, suggesting an incomplete EMT
phenotype^[232]38. VIM expression is upregulated in epithelial tumors
of uterine corpus origin, but not epithelial tumors of ovarian
origin^[233]52. ICAM-1 is expressed in migratory EC^[234]53, and is
linked to increased peritoneal adhesion in endometriosis^[235]54. VIM
and ICAM-1 may serve as markers of ARID1A-negative tumors of
endometrial origin.
Partial or incomplete EMT is associated with invasive phenotypes in
various cancers^[236]38,[237]45. In EC, EMT is thought to play a role
in myometrial invasion^[238]44. In this study, we found that ARID1A
loss and PI3K activation in endometrial epithelium leads to enhanced
migration and invasion in vitro and myometrial invasion in vivo,
reflecting the myometrial invasion phenotypes observed clinically.
Myometrial invasion in EC correlates with distal metastases, disease
recurrence, and adenomyosis^[239]55–[240]57. EC patients with gene
expression signatures most similar to LtfCre^0/+;
(Gt)R26Pik3ca^*H1047R; Arid1a^fl/fl had greater tumor invasion and
higher tumor grade. The collective migration of mutant endometrial
epithelium undergoing partial EMT may enhance the invasive properties
of EC, permitting myometrial invasion.
The retention of some epithelial characteristics upon PIK3CA^H1047R
expression may facilitate the establishment of epithelial
tumors^[241]58. Epithelial transdifferentiation is a proposed mechanism
by which normal epithelia convert into abnormal epithelia without
undergoing an mesenchymal cell intermediate^[242]59. PIK3CA mutation is
an early event in atypical hyperplasia^[243]24, whereas loss of ARID1A
immunoreactivity correlates with malignant transformation in
endometrial cancer^[244]15. A recent study identified PIK3CA as being
commonly mutated in endometrial glands, often without transformation,
suggesting PIK3CA mutation as an early event, with ARID1A mutation
coming later in the progression of endometriosis^[245]17. ARID1A
mutations have previously been implicated in invasion during
metastasis^[246]49,[247]60–[248]62. In the LtfCre^0/+;
(Gt)R26Pik3ca^*H1047R; Arid1a^fl/fl endometrial epithelium, PI3K
activation may partially suppress the full acquisition of mesenchymal
phenotypes upon ARID1A loss, resulting in an abnormal epithelial state
with invasive properties. PI3K activation may also allow cells to
bypass the endometrial epithelial cell apoptosis observed in
LtfCre^0/+; Arid1a^fl/fl mice. This may be another reason why ARID1A
mutations are commonly observed alongside activating PI3K mutations in
neoplasms originating from the endometrial epithelium.
The partial EMT phenotype may increase the invasive potential of the
endometrium. The expression of EMT factors is increased at the
myoinvasive front of ECs^[249]44, suggesting collective migration
rather than single cell metastasis^[250]63. Endometriotic lesions
retain CDH1 expression^[251]64, suggesting collective migration rather
than metastasis via a single cell^[252]63. Within primary tumors,
adjacent cells may differentiate into different intermediate stages
along the EMT-spectrum due to differing stimulus within the tumor
microenvironment, including surrounding stroma^[253]38. Invasive,
mesenchymal-like cells may lead the way for cohorts of epithelial cells
with which they retain some cell-cell junctions^[254]63. Upon arrival
to metastatic sites, lack of stromal signals present at the site of
origin may allow for epithelial gland formation^[255]58. This may
explain the formation of endometrial glands outside of the endometrium
derived from cells with mesenchymal-like invasiveness.
EC survival rates are high if the disease is detected at an early stage
when the tumors are still confined to the endometrium. Myometrial
invasion or tumor dissemination to other sites in the body correlates
with poor survival. The notion that collective epithelial invasion
promotes EC metastasis may lead to therapeutic options for patients
with disseminated disease. The identification of pathways involved in
the collective invasion may lead to the development of anti-metastatic
drugs.
Methods
Mice
All mice were maintained on an outbred genetic background using CD-1
mice (Charles River). (Gt)R26Pik3ca^*H1047R and LtfCre
(Tg(Ltf-iCre)14Mmul) alleles were purchased from The Jackson Laboratory
and identified by PCR using published methods^[256]32,[257]33.
Arid1a^fl and Arid1a^V1068G alleles were distinguished by
PCR^[258]20,[259]30. For detection of Arid1a^V1068G allele, PCR product
was treated with HincII at 37 °C for 1 h. Genotyping primers are listed
in Supplementary Table [260]1. Uncropped genotyping gels can be found
in Supplementary Fig. [261]7. Endpoints were vaginal bleeding, severe
abdominal distension, and signs of severe illness, such as dehydration,
hunching, jaundice, ruffled fur, signs of infection, or
non-responsiveness. Sample sizes within each genotype were chosen based
on the proportions of animals with vaginal bleeding between each
experimental group or a Kaplan–Meyer log-rank test for survival
differences. For weight measurements, uteri were collected at time of
sacrifice and placed immediately into neutral-buffered formalin at
4 °C. After 24 h, tissues were washed with phosphate-buffered saline
(PBS) and 50% EtOH, placed in 70% EtOH, and then weighed. Mice were
housed at the Van Andel Research Institute Animal Facility and the
Michigan State University Grand Rapids Research Center in accordance
with protocols approved by Michigan State University.
Cell lines
12Z immortalized human endometrial epithelial cells were provided by
the laboratory of Asgi Fazleabas^[262]40. 12Z cells were maintained in
Dulbecco's Modified Eagle Media (DMEM)/F12 media supplemented with 10%
fetal bovine serum (FBS), 1% L-glutamine and 1% penicillin/streptomycin
(P/S). Lenti-X^TM 293T (Clontech, Cat# 632180, CVCL_0063) cells were
maintained in DMEM + 110 mg/L Sodium Pyruvate (Gibco) supplemented with
10% FBS, 1% l-glutamine, 1% P/S. Cell line validation for the 12Z cell
line was performed by IDEXX BioResearch: the 12Z cell line has a unique
profile not found in the current public databases. The 12Z and Lenti-X
293T cell lines tested negative for mycoplasma contamination. Testing
was performed using the Mycoplasma PCR Detection Kit (Applied
Biological Materials). No commonly misidentified cell lines were used
in this study.
Histology and immunohistochemistry
For indirect immunohistochemistry (IHC), 10% neutral-buffered formalin
(NBF)-fixed paraffin sections were processed for heat-based antigen
unmasking in 10 mM sodium citrate [pH 6.0]. Sections were incubated
with antibodies at the following dilutions: 1:200 ARID1A (D2A8U)
(12354, Cell Signaling); 1:400 Phospho-S6 (4585, Cell Signaling); 1:100
KRT8 (TROMA1, DHSB); 1:100 EPCAM (G8.8-s, DHSB); 1:400 PGR (SAB5500165,
Sigma). TROMA-I antibody was deposited to the DSHB by Brulet,
P./Kemler, R. (DSHB Hybridoma Product TROMA-I). EPCAM antibody (G8.8)
was deposited to the DSHB by Farr, A.G. (DSHB Hybridoma Product G8.8).
Antibody details are listed in Supplementary Table [263]2. The
following Biotin-conjugated secondary antibodies were used: donkey
anti-rabbit IgG (711-065-152, Jackson Immuno-research Lab) and donkey
anti-rat IgG (#705-065-153, Jackson Immuno-research Lab). Secondary
antibodies were detected using VECTASTAIN Elite ABC HRP Kit (Vector).
Sections for IHC were lightly counter-stained with Hematoxylin QS or
Methyl Green (Vector Labs). Routine Hematoxylin and Eosin (H&E)
staining of sections was performed by the Van Andel Research Institute
(VARI) Histology and Pathology Core. A VARI animal pathologist reviewed
histological tumor assessments.
Immunofluorescence
For indirect immunofluorescence, tissues were fixed in 4%
paraformaldehyde. Frozen samples were sectioned at 10 μm on a CM3050 S
cryostat (Leica) and collected on white frosted, positive charged
ultra-clear microscope slides (Denville). Frozen slides were post-fixed
with 2% PFA/1 PBS, and permeabilized with 0.3% TX100 in PBS, and
treated with 100 mM glycine/1x PBS [pH 7.3]. Primary antibodies were
applied to slides at the following dilutions: 1:200 ARID1A (D2A8U)
(12354, Cell Signaling); 1:100 KRT8 (TROMA1, DHSB); 1:100 EPCAM
(G8.8-s, DHSB); 1:50 ZO-1 (61–7300, ThermoFisher); 1:200 CDH1 (3195,
Cell Signaling); 1:100 CLDN10 (38–8400, ThermoFisher); 1:100 VIM (5741,
Cell Signaling); 1:400 PGR (SAB5500165, Sigma); 1:200 ERα (ab32063,
abcam); 1:2000 SMA (Sigma, C618); 1:100 SNAI2 (9585, Cell Signaling);
1:40 ICAM-1 (AF796-SP, R&D Systems). Secondary antibodies used were:
1:500 donkey anti-rabbit IgG, alexa fluor 555-conjugated antibody
(#A-31572, ThermoFisher); 1:500 goat anti-rabbit IgG, alexa fluor
555-conjugated antibody (#A-21428, ThermoFisher); 1:500 goat anti-rat
IgG, alexa fluor 647-conjugated antibody (A-21247, ThermoFisher); 1:250
donkey anti-rat IgG, alexa fluor 647-conjugated antibody (712-605-153,
Jackson Immuno-Research Lab); 1:250 donkey anti-goat fluor
488-conjugated antibody (705-545-147, Jackson Immuno-Research Lab).
Phalloidin-iFluor 594 (1:1000, abcam) was used to stain F-actin.
Auto-fluorescence was quenched using the TrueVIEW Auto-fluorescence
Quenching Kit (Vector Laboratories). ProLong Gold Antifade Reagent with
DAPI (8961, Cell Signaling) was used for DAPI staining.
Microscopy and imaging
Confocal images were taken on a Nikon Eclipse Ti inverted microscope
using a Nikon C2 + confocal microscope laser scanner. Confocal
immunofluorescent images are representative maximum intensity
projections.
Cell sorting
Mouse uteri were surgically removed and minced using scissors. Tissues
were digested using the MACS Multi Tissue Dissociation Kit II (Miltenyi
Biotec) for 80 min at 37 °C. Digested tissues were strained through a
40 μm nylon mesh (ThermoFisher). The Red Cell Lysis Buffer (Miltenyi
Biotec) was used to remove red blood cells. Dead cells removed using
the MACS Dead Cell Removal Kit (Miltenyi Biotec), and EPCAM-positive
cells were positively selected and purified using a PE-conjugated EPCAM
antibody and anti-PE MicroBeads (Miltenyi Biotec), per the
manufacturers’ instructions. A BD Accuri C6 flow cytometer (BD
Biosciences) was used to confirm purity of EPCAM-positive population.
RNA isolation and qRT-PCR
The Arcturus PicoPure RNA Isolation Kit (ThemoFisher), including an
on-column DNA digestion using the RNAse-free DNAse set (Qiagen), was
used to purify RNA from in vivo EPCAM-sorted endometrial epithelial
cells. To confirm loss of ARID1A transcript in EPCAM-positive
LtfCre^0/+; (Gt)R26Pik3ca^*H1047R; Arid1a^fl/fl cells, complementary
DNA (cDNA) was synthesized from RNA, and qRT-PCR was performed using
Ssofast PCR master mix (Biorad) using previously described
primers^[264]20 and the Applied Biosystems ViiA7 real-time PCR system.
ARID1A expression was normalized to GAPDH. For in vitro experiments,
RNA samples were collected 72 h post siRNA transfection using the
Quick-RNA Miniprep Kit (Zymo Research). cDNA was synthesized from RNA,
and qRT-PCR was performed using PowerUp SYBR Green Master Mix
(ThermoFisher) and the Applied Biosystems ViiA7 real-time PCR system.
Primer pairs for human genes are described in Supplementary
Table [265]1.
RNA-seq
Libraries were prepared by the Van Andel Genomics Core from 100 ng of
total RNA for mouse samples, and Lexogen SIRV-set2 RNAs (Lexogen GmbH,
Vienna Austria) were spiked into RNA prior to library preparation at a
concentration of 1% by mass. For human samples, 500 ng of total RNA
material was used as input, with no spike in. For all samples,
libraries were generated using the KAPA Stranded mRNA-Seq Kit (v4.16)
(Kapa Biosystems, Wilmington, MA USA). RNA was sheared to 250–300 bp
and reverse transcribed. Prior to PCR amplification, cDNA fragments
were ligated to Bio Scientific NEXTflex Adapters (Bio Scientific,
Austin, TX, USA). Quality and quantity of the finished libraries were
assessed using a combination of Agilent DNA High Sensitivity chip
(Agilent Technologies, Inc.), QuantiFluor dsDNA System (Promega Corp.,
Madison, WI, USA), and Kapa Illumina Library Quantification qPCR assays
(Kapa Biosystems). All libraries were pooled equimolarly, and single
end sequencing to a minimum depth of 30 M reads per library was
performed using an Illumina NextSeq 500 sequencer using a 75 bp
sequencing kit (v2) (Illumina Inc., San Diego, CA, USA). Base calling
was done by Illumina NextSeq Control Software (NCS) v2.0 and output of
NCS was demultiplexed and converted to FastQ format with Illumina
Bcl2fastq v1.9.0.
RNA-seq analysis
Raw 75 bp reads were trimmed with cutadapt^[266]65 and Trim Galore!
([267]http://www.bioinformatics.babraham.ac.uk/projects/trim_galore/)
followed by quality control analysis via FastQC^[268]66. Trimmed mouse
reads were aligned to mm10 genome assembly and indexed to
GENCODE^[269]67 vM16 GFF3 annotation via STAR^[270]68 aligner with flag
‘–quantMode GeneCounts’ for feature counting, and human reads were
aligned to GRCh38.p12 and indexed to GENCODE v28. For mouse libraries,
Lexogen SIRVome was independently aligned and quantified for
qualitative assessment of library concordance. Output gene count files
were constructed into an experimental read count matrix in R. Low count
genes were filtered (1 count per sample on average) prior to
DESeq2^[271]69,[272]70 count normalization and subsequent differential
expression analysis. Calculated differential expression probabilities
were corrected for multiple testing by independent hypothesis weighting
(IHW)^[273]71 for downstream analysis. Differentially expressed gene
thresholds were set at FDR < 0.05 for mouse data and FDR < 0.0001 for
human data. All reported instances of log[2](fold-change) data from
RNA-seq are adjusted by DESeq2 original shrinkage estimator except for
TCGA-UCEC comparisons and statistical comparisons between log[2](FC)
values, which use non-adjusted values. Principal component analysis was
calculated using DESeq2 from top 500 genes by variance across samples.
RNA-seq heatmaps were generated using scaled regularized-logarithm
(rlog) counts for visualization, or relative to controls by subtracting
mean rlog counts. LtfCre^0/+; (Gt)R26^Pik3ca*H1047R; Arid1a^fl/fl
signature genes were defined by FDR < 10^−5 and |log[2](FC)| > 1.
ATAC-seq
Libraries were prepared following previously described
methods^[274]39,[275]72,[276]73. Mouse endometrial cells were isolated
using methods described above. For purified mouse endometrial
epithelium and 12Z cells, between 25,000 and 50,000 cells were
resuspended in cold lysis buffer (10 mM Tris-HCL [pH 7.4], 10 mM NaCl,
3 mM MgCl[2], 0.1% NP-40) and centrifuged at 500 × g, 4 °C for 10 min
to isolate nuclei. Nuclei were treated with Tn5 Transposase for 30 min
at 37 °C using the Nextera DNA Library Prep Kit (Illumina). DNA was
isolated using the Qiagen MinElute Reaction Cleanup Kit. Libraries were
amplified using barcoded primers for 1–8 cycles as described^[277]39.
Libraries were purified using Kapa Pure Beads to remove primer dimers
and >1000 bp fragments. Libraries were sequenced by the Van Andel
Genomics Core. Quality and quantity of the finished libraries were
assessed using a combination of Agilent DNA High Sensitivity chip
(Agilent Technologies, Inc.), QuantiFluor dsDNA System (Promega Corp.,
Madison, WI, USA), and Kapa Illumina Library Quantification qPCR assays
(Kapa Biosystems). All libraries were pooled equimolarly, and paired
end sequencing to a minimum depth of 20 M reads per library was
performed using an Illumina NextSeq 500 sequencer using a 150 bp
sequencing kit (v2) (Illumina Inc., San Diego, CA, USA). Base calling
was done by Illumina NextSeq Control Software (NCS) v2.0 and output of
NCS was demultiplexed and converted to FastQ format with Illumina
Bcl2fastq v1.9.0.
ATAC-seq analysis
Libraries were combined across flow cells and trimmed with cutadapt and
Trim Galore! followed by quality control analysis via FastQC. Trimmed
reads were aligned to mm10 mouse reference genome via Bowtie2^[278]74
with flags ‘–very-sensitive’ and ‘-X 1000’ in concordance with the
library size-selection step, and, similarly, human reads were aligned
to GRCh38.p12 using the same parameters^[279]75. Reads were then sorted
and indexed with samtools^[280]76. Mitochondrial reads were then
discarded from BAMs, using Harvard ATAC-seq module removeChrom script
([281]https://github.com/harvardinformatics/ATAC-seq), and subsequently
filtered for only properly paired reads by samtools view -f 3. At this
step, working library complexity was estimated by
ATACseqQC::estimateLibComplexity^[282]77,[283]78. To compensate for
differing library complexities within an experimental design, we
normalized by randomly subsampling libraries to a calculated fraction
of the original library, as estimated by the bootstrap interpolation,
via samtools view with flag ‘-s’ to achieve normalized library sizes.
After subsampling libraries to lowest complexity, PCR duplicates were
removed with Picard MarkDuplicates
([284]http://broadinstitute.github.io/picard/), and reads were finally
name-sorted prior to conversion to BEDPE format with bedtools^[285]79
bamtobed with flag ‘-bedpe’. BEDPE coordinates were then shifted 4 and
5 bp to correct for TN5 transposase integration^[286]39, and the
standard BEDPE files were re-written to a minimal BEDPE format, as
defined by MACS2 manual, through an awk script. MACS2^[287]80 was used
to call broad peaks from final minimal BEDPE fragment coordinates with
FDR < 0.05 threshold and no control input, and the resulting peaks were
repeat-masked by blacklist filtering^[288]81. A naive overlap peak set,
as defined by ENCODE, was constructed for each biological condition by
combining replicates and calling broad peaks on pooled BEDPE files
followed by intersectBed to select for peaks of at least 50% overlap
with each biological replicate.
Differential accessibility was calculated by firstly defining a more
relaxed consensus peak set
[MATH: p=(∩j=1nej)∪(∩j=1ncj) :MATH]
for any partial intersect where e[1], …, e[n] are MACS2 peak sets from
biological replicates of the experimental condition, and c[1], …, c[n]
are peak sets from control biological replicates. This consensus peak
set was used in csaw^[289]82 as coordinates for counting reads within
specified windows, with additional parameters set to restrict windows
to standard chromosomes and non-blacklisted regions. Windows >1
kilobase in width were filtered along with low read-abundance windows
(logCPM < −3). In order to compensate for differing efficiencies of
reactions between libraries, a non-linear loess-based normalization
approach was employed to remove trended biases. This method was
empirically determined to elicit the most conservative results as
opposed to other approaches to window count normalization. csaw uses
edgeR^[290]83 quasi-likelihood functionality to calculate differential
accessibility, for which FDR thresholds were used to determine final
differential peak sets (FDR < 0.20 mouse data; FDR < 0.05 human data).
Finally, proximal windows within 500 bp were merged, and the most
significant window statistic was used to represent the merged window.
Significant differentially accessible genomic regions were annotated by
HOMER^[291]84 with a modification to cis-promoter classification as
within 3000 bp of a canonical gene TSS, which remains consistent
throughout all reported analyses. HOMER de novo motif enrichment and
genome ontology was performed on all significant differentially
accessible genomic regions. Common differential mouse ATAC/RNA genes
were selected by the presence of a differentially accessible promoter
ATAC peak (FDR < 0.20) and RNA-seq differential expression
(FDR < 0.05).
Analysis of TCGA-UCEC data
ARID1A alteration incidence analysis was calculated using the TCGA
Pan-Can UCEC^[292]22 cohort (N = 509) retrieved from
cBioPortal^[293]85. All molecular data for subsequent analyses was
pulled from the 28th January, 2016 release of Broad GDAC Firehose
(10.7908/C11G0KM9). For molecular comparisons, patients were considered
ARID1A^mut if they had somatic alterations (excluding missense and
synonymous mutations) and ARID1A^wt if no alterations were detected at
the ARID1A locus. RNASeqV2 RSEM^[294]86 normalized gene counts were
quantile normalized prior to filtering low-count genes (one count per
sample on average) and fitting linear models via limma^[295]87 for
differential expression analysis in subsets of patients. Moderated
statistics were calculated by empirical Bayes moderation via
limma::eBayes with arguments ‘trend = TRUE’ and ‘robust = TRUE’, and
probabilities were adjusted for multiple testing by FDR. Additional
metrics for clinical staging and tumor invasion were acquired from the
GDC^[296]88 TCGA-UCEC dataset (N = 605) in UCSC Xena^[297]28. Broad
GSEA^[298]89 for mSigDb v6.2 Hallmark pathways^[299]90 was performed on
ortholog-converted DESeq2 normalized counts from generated mouse data
and RNASeqV2 RSEM normalized counts from TCGA-UCEC data. Broad
ssGSEA^[300]91 was also performed on RNASeqV2 RSEM normalized counts
from TCGA-UCEC data. Orthologs of the mouse gene signature established
herein were used to define UCEC endometrioid patients in
ssGSEA-enriched or unenriched quartiles, which reflect mouse model
transcriptome.
Bioinformatics and statistics
The 77 gene Pan-Cancer EMT signature was extracted from Supplementary
Table [301]S2 of Mak et al.^[302]37. Various ClusterProfiler^[303]92
functions were used to calculate and visualize pathway enrichment from
a list of gene symbols or Entrez^[304]93 IDs with respective gene
universes. biomaRt^[305]94,[306]95 was used for all gene nomenclature
and ortholog conversions. ggplot2^[307]96 was used for various plotting
applications. ComplexHeatmap^[308]97 was used for hierarchical
clustering by Euclidean distance and visualization. eulerr was used to
produce proportional Euler diagrams^[309]98. The cumulative
hypergeometric distribution was used for enrichment tests performed
throughout this manuscript. The statistical computing language R was
used for many applications throughout this manuscript^[310]99. HOMER
was used to compute integer read counts at loci of interest for tag
density heatmaps and scatter plots. TxDb.Hsapiens.UCSC.hg38.knownGene
was used to generate promoter regions for all standard hg38
genes^[311]100.
Transfection of 12Z cells with siRNA and plasmid DNA
12Z cells were seeded at a density of 40,000 cells/mL in DMEM/F12 media
supplemented with 10% FBS and 1% l-glutamine. The following day, cells
were transfected with 50 pmol/mL of siRNA (Dharmacon, ON-TARGETplus
Non-targeting Pool and human ARID1A #8289 SMARTpool) using the RNAiMax
(ThermoFisher) lipofectamine reagent according to the manufacturer’s
instructions at a ratio of 1:1 volume:volume in OptiMEM (Gibco). After
24 h, the media was replaced. ATAC samples were collected after 48 h.
For plasmid co-transfection experiments, 24 h after siRNA transfection,
cells were transfected with 500 ng pBabe vector containing
PIK3CA^H1047R (pPIK3CA^H1047R) or pBabe empty vector using the FuGene
HD transfection reagent (Promega) according to the manufacturers’
instructions at a ratio of 2:1 volume:mass, and media was replaced
after 4 h. The pPIK3CA^H1047R was a gift from Jean Zhao (Addgene
plasmid 12524)^[312]101. The following day, media was replaced with
DMEM/F12 media supplemented with 0.5% FBS, 1% P/S, and 1% l-glutamine.
Cells were collected 72 h post siRNA transfection using the Quick-RNA
Miniprep Kit (Zymo Research) for RNA or RIPA buffer (Cell Signaling)
for protein.
Generation of lentiviral shRNA particles
Lentiviral particles expressing shRNAs were produced in 293T cells
according to the manufacturers’ instructions. Briefly, Lenti-X^TM 293T
cells were transfected with lentiviral packaging mix (Sigma) and
MISSION pKLO.1 plasmid containing non-targeting shRNA (shNONtg) or
pooled ARID1A shRNAs (shARID1A) (Sigma) using polyethylenimine (PEI) in
DMEM + 4.5 g/L d-Glucose, 110 mg/L Sodium Pyruvate, 10% FBS, 1%
l-glutamine. After 4 h, media was replaced with DMEM/F12, 10% FBS, 1%
L-glutamine, 1% P/S. Viral particles were collected after 48 and 96 h,
and viral titers were calculated using the qPCR Lentiviral Titration
Kit (ABM).
Migration assay
12Z cells were seeded into 35 mm dishes containing four-well culture
inserts at a density of 4000 cells per well. After 24 h, cells were
transfected with 125 ng pBabe vector or pPIK3CA^H1047R using the FuGene
HD as described above. After 4 h, cells were treated with lentiviral
particles expressing shNONtg or shARID1A at a multiplicity of infection
of 100. After 24 h, the media was replaced. At 48 h post transfection,
media was replaced with serum-free DMEM/F12 containing 1% l-glutamine
and 1% P/S. After 16 h of serum deprivation, culture inserts were
removed and serum-free media was added. At 0 and 24 h, images were
taken using a Nikon Eclipse Ti microscope. Distances between migration
fronts were measured using NIS Elements Advanced Research software at
16 different points 100 μm apart. Migration distance was calculated by
subtracting the average distance across migration fronts at 24 h from
the average distance at 0 h. Cells counts were conducted within a
1500 μm by 700 μm window surrounding the migration area.
Invasion assay
12Z cells were seeded in six-well dishes at a density of 50,000 cells
per well. After 24 h, cells were transfected with pPIK3CA or empty
vector as described above. After 4 h, cells were treated with
lentiviral particles expressing shNONtg or shARID1A at a multiplicity
of infection of 100. Media was replaced after 24 h. At 48 h post
transfection, cells were trypsinized, and 100 μL of cell mixture
containing 30,000 cells and 0.3 mg/mL Matrigel was seeded into
transwell plates (8 μm pore polycarbonate membrane, Corning) pre-coated
with 100 μL of 0.3 mg/mL Matrigel. After 1 h, serum-free DMEM/F12 1%
P/S, 1% l-glutamine media was added to the top chamber and DMEM/F12, 5%
FBS, 1% P/S, 1% l-glutamine was added to the bottom chamber. After
16 h, transwell units were transferred to plates containing 4 μg/mL
calcein AM in DMEM/F12. After 1 h, media was aspirated from the top
chamber and unmigrated cells were removed with a cotton swab. Images
were collected using a Nikon Eclipse Ti microscope in five
non-overlapping fields per well. ImageJ software (National Institutes
of Health) was used to quantify cells based on size and intensity.
Western blotting
Protein lysates were quantified using the Micro BCA Protein Assay Kit
(ThermoFisher) and a FlexSystem3 plate reader. Protein lysates were run
on a 4–15% gradient sodium dodecyl sulfate polyacrylamide gel
electrophoresis (SDS-PAGE) gel (BioRad) and transferred to PVDF
membrane using the TransBlot Turbo system (BioRad). Primary antibodies
were used at the following dilutions: 1:1000 ARID1A (D2A8U) (12354,
Cell Signaling); 1:1000 Akt (4691, Cell Signaling); 1:1000 β-Actin
(8457, Cell Signaling); E-Cadherin (3195, Cell Signaling); 1:2000
Phospho-Akt (Ser473) (4060, Cell Signaling); 1:1000 Slug (9585, Cell
Signaling); 1:1000 Snail (3879, Cell Signaling); 1:1000 Twist1 (T6451,
Sigma); 1:100 ARID1B (sc-32762, Santa Cruz); 1:1000 Brg1 (ab110641,
Abcam); 1:1000 BRM (11966, Cell Signaling); 1:100 ARID1A (PSG3)
(sc-32761, Santa Cruz). Horseradish peroxidase (HRP) conjugated
secondary antibodies (Cell Signaling) were used at a dilution of
1:2000. Clarity Western ECL Substrate (BioRad) was used for protein
band visualization, and western blot exposures were captured using the
ChemiDoc XRS + imaging system (BioRad). Uncropped western blot images
can be found in Supplementary Fig. [313]7.
Chromatin immunoprecipitation
Wild-type 12Z cells were treated with 1% formaldehyde in DMEM/F12 media
for 10 min at room temperature. Formaldehyde was quenched by the
addition of 0.125 M Glycine and incubation for 5 min at room
temperature, followed by wash with PBS. In all, 1 × 10^7 crosslinked
cells were used per IP. Chromatin from crosslinked cells was
fractionated by digestion with micrococcal nuclease using the
SimpleChIP Enzymatic Chromatin IP Kit (Cell Signaling) as per the
manufacturers’ instructions, followed by 30 s of sonication. IPs were
performed using the SimpleChIP Enzymatic Chromatin IP Kit per the
manufacturers’ instructions with 1:100 anti-ARID1A (D2A8U) (12354, Cell
Signaling). Crosslinks were reversed with 0.4 mg/mL Proteinase K
(ThermoFisher) and 0.2 M NaCl at 65 °C for 2 h. DNA was purified using
the ChIP DNA Clean & Concentrator Kit (Zymo).
Chromatin immunoprecipitation sequencing (ChIP-seq)
Libraries for input and IP samples were prepared by the Van Andel
Genomics Core from 10 ng of input material and IP material using the
KAPA Hyper Prep Kit (v5.16) (Kapa Biosystems, Wilmington, MA USA).
Prior to PCR amplification, end repaired and A-tailed DNA fragments
were ligated to Bioo Scientific NEXTflex Adapters (Bioo Scientific,
Austin, TX, USA). Quality and quantity of the finished libraries were
assessed using a combination of Agilent DNA High Sensitivity chip
(Agilent Technologies, Inc.), QuantiFluor dsDNA System (Promega Corp.,
Madison, WI, USA), and Kapa Illumina Library Quantification qPCR assays
(Kapa Biosystems). Individually indexed libraries were pooled and
75 bp, single-end sequencing was performed on an Illumina NextSeq 500
sequencer using 75 cycle HO sequencing kits (v2) (Illumina Inc., San
Diego, CA, USA), with all libraries run across two flow cells to return
a minimum read depth of 80 M reads per input library and 40 M read per
IP library. Base calling was done by Illumina NextSeq Control Software
(NCS) v2.0 and output of NCS was demultiplexed and converted to FastQ
format with Illumina Bcl2fastq v1.9.0.
ChIP-seq analysis
Technical replicate libraries were combined across flow cells and
trimmed with cutadapt and Trim Galore! followed by quality control
analysis via FastQC. Trimmed reads were aligned to GRCh38.p12 reference
genome via Bowtie2^[314]74 with flag ‘–very-sensitive’. Reads were then
sorted and indexed with samtools^[315]76. PCR duplicates were removed
with Picard MarkDuplicates
([316]http://broadinstitute.github.io/picard/), and again sorted and
indexed. MACS2^[317]80 was used to call broad peaks with FDR < 0.05
threshold on each ChIP replicate against the input control, and the
resulting peaks were repeat-masked by blacklist filtering^[318]81. A
naive overlap peak set, as defined by ENCODE, was constructed by
combining replicates and calling broad peaks on pooled BAM files
followed by intersectBed to select for peaks of at least 50% overlap
with each biological replicate. Naive overlapping ChIP peaks were
annotated by HOMER, and de novo motif enrichment and genome ontology
were performed on genome-wide and promoter (within 3 kb of a TSS) peak
sets. Overlapping genes between ChIP/ATAC and ChIP/ATAC/RNA were
selected by the presence of a significant ChIP peak and differentially
accessible promoter ATAC peak (FDR < 0.05) located in the same promoter
region (within 3 kb of TSS).
Co-immunoprecipitation (co-IP)
Small-scale nuclear extracts and co-IPs from wild-type 12Z cells were
performed^[319]20. Briefly, Protein A or Protein G Dynabeads
(Invitrogen) were conjugated with anti-ARID1A (D2A8U) (12354, Cell
Signaling) anti-ARID1A (PSG3) (sc-32761, Santa Cruz), or anti-ARID1B
(E9J4T) (92964, Cell Signaling) in PBS + 0.5% BSA overnight at 4 C.
Four-hundred micrograms of nuclear lysate was added to a final volume
of 1 mL IP buffer (20 mM HEPES [pH 7.9], 250 mM KCl, 10% glycerol,
0.2 mM EDTA, 0.1% Tween-20, 0.5 mM DTT, 0.5 mM PMSF), clarified by
high-speed centrifugation and added to antibody-conjugated beads
(D2A8U, 1:200; PSG3, 1:40; E9J4T, 1:200) and incubated overnight at
4 °C. IP samples were washed in a series of IP buffers with varying
salt concentrations as follows: 150 mM KCl, 300 mM KCl, 500 mM KCl,
300 mM KCl, 100 mM KCl. IP samples were washed a final time in 60 mM
KCl IP buffer in the absence of EDTA or Tween-20. Proteins were eluted
twice with 100 mM glycine pH 2.5 on ice and neutralized by the addition
of 1:10 (v:v) of 1 M Tris-HCl pH 8.0.
Co-IP followed by mass spectrometry
Nuclear lysates from wild-type 12Z cells were prepared as described in
the previous section. Protein A Dynabeads (Invitrogen) were conjugated
with 8.3 μg anti-ARID1A (D2A8U) (12354, Cell Signaling) or IgG (2729,
Cell Signaling) in PBS + 0.5% BSA + 0.01% Tween-20 overnight at 4 °C.
Antibody-bead conjugates were crosslinked in BS^[320]3 (ThermoFisher)
as described by the manufacturer protocol, and excess unlinked antibody
was removed by one wash of 0.11 M glycine followed by quenching with
Tris-HCl. 4.3 mg of nuclear lysate was added to a final volume of 14 mL
IP buffer (20 mM HEPES [pH 7.9], 150 mM KCl, 10% glycerol, 0.2 mM EDTA,
0.1% Tween-20, 0.5 mM DTT, 0.5 mM PMSF) and clarified by high-speed
centrifugation. Diluted nuclear lysate was added to
antibody-crosslinked beads and incubated overnight at 4 °C. IP samples
were washed in an IP buffer series with varying salt concentrations as
follows: twice with 150 mM KCl, three times with 300 mM KCl, twice with
100 mM KCl. IP samples were washed a final time in 60 mM KCl IP buffer
in the absence of EDTA or Tween-20. Proteins were eluted in 2x
Laemmli + 100 µM DTT at 70 °C for 10 min. Eluates were processed for
short-gel SDS-PAGE and mass spectrometry by the University of
Massachusetts Mass Spectrometry core.
Mass spectrometry analysis
All MS/MS samples were analyzed using Mascot (version 2.1.1.21, Matrix
Science, London, UK). Mascot was set-up to search UniProtKB Swiss-Prot
(Human) assuming the digestion enzyme as strict trypsin. Mascot was
searched with a fragment ion mass tolerance of 0.050 Da and a parent
ion tolerance of 10.0 PPM. Carbamidomethyl of cysteine was specified in
Mascot as a fixed modification. Gln- > pyro-Glu of glutamine and the
N-terminus, oxidation of methionine and acetyl of the N-terminus were
specified in Mascot as variable modifications. Scaffold (version 4.8.8,
Proteome Software Inc., Portland, OR) was used to validate MS/MS-based
peptide and protein identifications. Peptide identifications were
accepted if they could be established at >85.0% probability by the
Peptide Prophet algorithm^[321]102 with Scaffold delta-mass correction.
Protein identifications were accepted if they could be established at
greater than 99.0% probability and contained at least two identified
peptides. Protein probabilities were assigned by the Protein Prophet
algorithm^[322]103. Proteins that contained similar peptides and could
not be differentiated based on MS/MS analysis alone were grouped to
satisfy the principles of parsimony. Proteins sharing significant
peptide evidence were grouped into clusters.
Reporting summary
Further information on research design is available in the [323]Nature
Research Reporting Summary linked to this article.
Supplementary information
[324]Supplementary Information^ (1.5MB, pdf)
[325]Reporting Summary^ (84.1KB, pdf)
Acknowledgements