Abstract
Growing evidence implies a link between DNA methylation and tumor
immunity/immunotherapy. However, the global influence of DNA
methylation on the characteristics of the tumor microenvironment and
the efficacy of immunotherapy remains to be clarified. In this study,
we systematically evaluated the DNA methylation regulator patterns and
tumor microenvironment characteristics of 1,619 gastric cancer patients
by clustering the gene expression of 20 DNA methylation regulators.
Three gastric cancer subtypes that had different DNA methylation
modification patterns and distinct tumor microenvironment
characteristics were recognized. Then, a DNA methylation score (DMS)
was constructed to evaluate DNA methylation modification individually.
High DMS was characterized by immune activation status, increased tumor
mutation burden, and tumor neoantigens, with a favorable prognosis.
Conversely, activation of the stroma and absence of immune cell
infiltration were observed in the low DMS group, with relatively poor
survival. High DMS was also certified to be correlated with enhanced
efficacy of immunotherapy in four immune checkpoint blocking treatment
cohorts. In conclusion, the characterization of DNA methylation
modification patterns may help to enhance our recognition of the tumor
immune microenvironment of gastric cancer and guide more personalized
immunotherapy strategies in the future.
Keywords: DNA methylation, tumor microenvironment, immunotherapy,
biomarker, gastric cancer
Graphical abstract
graphic file with name fx1.jpg
[43]Open in a new tab
__________________________________________________________________
Qiu and colleagues identified three distinct immune subtypes in GC from
the perspective of DNA methylation and constructed an individual DNA
methylation score (DMS). DMS is a valuable tool for the prediction of
survival, clinicopathological characteristics, and the efficacy of
immunotherapy and might guide more personalized immunotherapy
strategies.
Introduction
As one of the best characterized epigenetic modifications, DNA
methylation has been reported to be related to numerous biological
processes.[44]^1^,[45]^2 5mC, which means that DNA methylation occurs
at the fifth carbon atom of the cytosine residues within CpG
dinucleotides, represents the major form of DNA methylation
modification in mammals.[46]^3
Gastric cancer (GC) is the fourth most common cancer and the second
leading cause of tumor-related deaths worldwide.[47]^4 Although great
efforts have been devoted to the treatment of GC, effectively
individualized therapeutic strategies remain to be
explored.[48]^5^,[49]^6 Several genes that play essential roles in the
initiation and progression of GC were reported to be regulated by
hypermethylation or hypomethylation of the promoter. For instance,
Higashimori et al.[50]^7 reported that FOXF2 was preferentially
methylated in GC, which could suppress GC through the
FOXF2/IRF2BPL/β-catenin signaling axis. In addition, approximately 9%
of GC is found to have Epstein-Barr virus (EBV) infection.[51]^8 The
major epigenetic change in EBV-positive GC is aberrant CpG island
hypermethylation in the promoter regions of several tumor suppressor
genes, which are involved in cell cycle regulation, apoptosis, cell
adhesion, metastases, and DNA repair pathways.[52]^9
Immune checkpoint blocking therapy has been demonstrated to improve
survival across multiple tumor types, including GC.[53]10, [54]11,
[55]12, [56]13, [57]14, [58]15, [59]16, [60]17 However, the overall
efficiency was less than 20%. The tumor mutation burden (TMB), PD-L1
expression, microsatellite instability (MSI) status, and EBV infection
were reported as potential biomarkers that favor PD-1 blockade-based
immunotherapy.[61]^18 Growing evidence implies a link between DNA
methylation and tumor immunity/immunotherapy.[62]19, [63]20, [64]21 For
example, Xu and colleagues[65]^22 found that the interferon
(IFN)-γ/JAK/STAT/TET2 signaling pathway is involved in tumor immunity,
and stimulating TET2 activity could enhance anti-PD-L1 efficacy in
solid tumors.[66]^23 We have recently reported TET1 mutation as a
potential biomarker for immune checkpoint blocking therapy across
multiple cancers. However, to date, the global influence of all DNA
methylation regulators on the immune microenvironment and the efficacy
of immunotherapy remains unknown.
In this study, we integrated 1,619 patients from eight independent GC
cohorts to identify DNA methylation regulator modification patterns by
unsupervised clustering of the gene expression of 20 DNA methylation
regulators. We chose to investigate the expression of genes that
regulate DNA methylation rather than DNA methylation itself, since the
function of DNA methylation can vary based on genomic context. Three
distinct DNA methylation regulator patterns with different immune
microenvironment characteristics were recognized and found to be
consistent with the three classical immune phenotypes: immune-inflamed,
immune-excluded, and immune desert. Importantly, a DNA methylation
score (DMS) system was constructed to evaluate the DNA methylation
status individually. Our results indicated that DMS may serve as an
alternative biomarker for survival and efficacy of immunotherapy.
Results
Multi-omics landscape of DNA methylation regulators in GC
After systematic review of published articles about DNA methylation, a
total of 20 DNA methylation regulators were collected and enrolled in
our analysis, including 3 writers (DNMT1, DNMT3A, DNMT3B), 3 erasers
(TET1, TET2, TET3), and 14 readers (MBD1, MBD2, MBD3, MBD4, ZBTB33,
ZBTB38, ZBTB4, UHRF1, UHRF2, MECP2, UNG, TDG, NTHL1, SMUG1), as shown
in [67]Figure 1A.
Figure 1.
[68]Figure 1
[69]Open in a new tab
Multi-omics landscape of DNA methylation regulators in GC
(A) Overview of the 20 DNA methylation regulators and their major
biological functions. (B) The mutation frequency of 20 DNA methylation
regulators in TCGA STAD cohort. Each column of the figure represents an
individual patient. The upper bar plot represents TMB. The number on
the right shows the mutation frequency of each regulator. The right bar
plot indicates the proportion of each variant type. The lower bar
represents the sample annotations. (C) The CNV frequency of DNA
methylation regulators in TCGA STAD cohort. Gain, red; loss, blue. (D)
The protein-protein interaction network among DNA methylation
regulators. The size of the node represents the number of proteins
interacting with this modifier. (E) Boxplot shows the expression of 20
DNA methylation regulators between tumor and normal tissues in TCGA
STAD cohort. ∗p < 0.05, ∗∗p < 0.01, ∗∗∗p < 0.001, ∗∗∗∗p < 0.0001.
The overall mutation rate of all regulators is relatively low in the GC
genome ([70]Figure 1B). Among all 20 modifiers, the eraser TET1, whose
mutation has been reported to be a favorable prognostic marker in
patients receiving immunotherapy,[71]^22 and the reader ZBTB38 had the
highest mutation rates (5%), while two readers, TDG and UNG, exhibited
extremely low mutation rates in GC patients (0%) ([72]Figure 1B).
Co-occurrence mutation patterns were found in several regulators, such
as TET1 and DNMT1 or TET3 and DNMT3B, even though they might have
opposite biological functions. However, no obvious mutation-exclusive
phenomenon was found among these regulators ([73]Figure S1B). For copy
number variation (CNV), MECP2, DNMT3B, and ZBTB38 showed a relatively
high frequency of amplification, while MBD1 and MBD2 were mainly copy
number deletions ([74]Figure 1C). The protein-protein interaction (PPI)
network analyzed by STRING depicted widespread protein interactions
among these modifiers ([75]Figure 1D). As shown in [76]Figure 1E, most
regulators except ZBTB4 showed relatively higher RNA expression in
tumors than in normal gastric tissues, indicating that they might play
crucial roles in the initiation and progression of GC. Furthermore,
most regulators presented obvious differential expression among
distinct The Cancer Genome Atlas (TCGA) molecular subtypes or CpG
island methylator phenotype (CIMP) subtypes ([77]Figures S1C and S1D).
Also, some regulators individually or in combination were significantly
associated with specific molecular subtypes or CIMP subtypes
([78]Figures S1C and S1D), which indicated that they also contribute to
the heterogeneity of GC. The results were also validated in the Asian
Cancer Research Group (ACRG) cohort ([79]Figure S1E). In addition, we
speculated about the potential regulation of these regulators by
promoter DNA methylation. We found a relatively strong negative
correlation between the mean methylation of promoter sites and RNA
expression for TET1 and TDG ([80]Figures S1F and S1G). In conclusion,
the above results revealed that crosstalk among these DNA methylation
regulators might play crucial roles in the initiation, progression, and
the heterogeneity of GC.
Prognosis and immune characteristics of DNA methylation regulators
To further clarify the role of DNA methylation regulators in GC, we
collected 1,619 patients with survival information from eight
independent cohorts ([81]Table S1), which is the largest GC cohort to
our knowledge.
The probable molecular functions of these DNA methylation regulators
were first investigated using 10 classical oncogenic pathways. Almost
all regulators had a positive correlation with the cell cycle pathway
([82]Figure S1H), which has been reported to be frequently dysregulated
in cancer.[83]^24 MBD1, ZBTB38, and ZBTB4 had a strong correlation with
the transforming growth factor (TGF)-β pathway, indicating the
potential functions of these readers ([84]Figure S1H). Further analysis
of RNA expression revealed a relatively strong positive correlation
among DNMT1, UHRF1, and UNG ([85]Figure 2A; [86]Table S2). The
co-expression phenomenon of these genes might indicate a functional
correlation, which needs further validation. The forest plot of a
univariate Cox regression depicted the prognostic value of 20 DNA
methylation modifiers ([87]Figure S1I). The interactions and prognostic
significance of these modifiers were further visualized in the network
plot ([88]Figure 2A).
Figure 2.
[89]Figure 2
[90]Open in a new tab
Prognosis and immune characteristics of DNA methylation regulators
(A) Correlations and prognosis of DNA methylation regulators in GC
patients. The red line represents a positive correlation with p <
0.0001, and the blue line represents a negative correlation with p <
0.0001. The size of the node represents the p value of the log-rank
test. Green points represent favorable factors for OS. Black points
represent risk factors for OS. (B) Correlation heatmap between 20 DNA
methylation regulators and immune cells in the gathered GC cohort.
Orange indicates positive correlation; blue indicates negative
correlation. ∗p < 0.05, ∗∗p < 0.01. (C) GSEA analysis indicated that
six immune/inflammation-related pathways were enriched in the MBD2-high
expression group in the gathered GC cohort. The top of the figure
represents the enrichment score of the pathways. The bottom of the
figure represents the ranked differential expression gene list between
the high and low MBD expression groups. (D) Boxplot of MBD2 expression
between the MSI and no-MSI groups in the ACRG cohort. (E) Boxplot of
MBD2 expression between the MSI and no-MSI groups in TCGA STAD cohort.
(F) Overall survival analysis of high (n = 32) and low (n = 59) MBD2
expression groups in the PRJEB23709 immunotherapy cohort. Log-rank
test, p = 0.026. (G) Boxplot of relative MBD2 expression for distinct
clinical response groups. Kruskal-Wallis test, p = 0.016. (H) Overall
survival analysis of high (n = 159) and low (n = 139) MBD2 expression
groups in the UC_Atezo immunotherapy cohort. Log-rank test, p = 0.0042.
DNA methylation has been reported to play significant roles in the
immune system and tumor microenvironment.[91]^22^,[92]^23 Therefore, we
also investigated the relationship between DNA methylation regulators
and tumor immunology. The expression of MBD2, TET2, and DNMT1 was
positively correlated with most immune cells ([93]Figure 2B), which
might explain their favorable prognostic value. It was also reported
that patients with MSI status or EBV-positive phenotype usually had
better immune checkpoint blockade treatment efficacy.[94]^25 The
expression of DNMT1, MBD2, UHRF1, UNG, and TET2 was significantly
upregulated in the MSI group in both the ACRG and TCGA cohorts
([95]Figures S2A and S2B). The expression of DNMT1, DNMT3A, MBD2, MBD3,
MBD4, NTHL1, and ZBTB4 was clearly high in the EBV-positive group
([96]Figure S2C).
Considering the relatively higher correlation between MBD2 and
activated CD8 T cells, and its favorable prognostic value, we further
thoroughly analyzed the role of MBD2 in tumor immunology/immunotherapy.
First, the MBD2 high-expression group had relatively higher immune cell
infiltration abundance, especially for activated CD8^+ T cells,
activated B cells, and activated CD4^+ T cells ([97]Figure S2D).
Second, gene set enrichment analysis (GSEA) analysis indicated that
several immune or inflammation-related pathways, such as the T cell
receptor signaling pathway, B cell receptor signaling pathway, antigen
processing and presentation pathway, PD-L1 expression and PD-1
checkpoint pathway, chemokine signaling pathway, and natural killer
(NK) cell-mediated cytotoxicity pathway, were significantly enriched in
the MBD2 high-expression group ([98]Figure 2C). Third, MBD2 had a
relatively higher expression level not only in the MSI group, for both
the ACRG ([99]Figure 2D) and TCGA STAD ([100]Figure 2E) cohorts, but
also in the EBV-positive group ([101]Figure S2C). Owing to the
important role of MBD2 in tumor immunology, we also investigated
whether the expression of MBD2 could help predict the efficacy of
immune checkpoint blockade treatment. Survival analysis indicated that
MBD2 high expression patients had better prognosis in the PRJEB23709
cohort ([102]Figure 2F) and UC_Atezo cohort ([103]Figure 2G).
Furthermore, the high expression of MBD2 represented a better clinical
response ([104]Figure 2H).
Above all, these results revealed the intimate relationship between DNA
methylation regulators and the tumor microenvironment. The reader MBD2
was positively correlated with the infiltration of immune cells and
might be a favorable biomarker for prognosis and the efficacy of
immunotherapy in GC.
DNA methylation modification patterns in the gathered GC cohort
As DNA methylation regulators might contribute to the heterogeneity of
GC and they also were associated closely with the tumor
microenvironment, to further recognize new probable DNA methylation
regulator patterns, unsupervised clustering was conducted based on the
expression of 20 DNA methylation regulators in the gathered GC cohort.
As shown in [105]Figure S3A, three clusters could achieve the best
clustering efficacy. Accordingly, patients were classified into DNA
methylation modification pattern A (n = 510), pattern B (n = 749), and
pattern C (n = 360). DNA methylation regulator pattern A was
characterized by high expression of ZBTB4, ZBTB38, MBD1, UHRF2, and
TET2; DNA methylation regulator pattern B showed relatively high
expression of UHRF1, DNMT1, UNG, and NTHL1; DNA methylation regulator
pattern C showed relatively high expression of ZBTB33, TET3, TET1, TDG,
DNMT3A, DNMT3B, and MBD3 ([106]Figures 3A, [107]S3B, and S3C).
Three-dimensional principal component analysis (3D-PCA) for the
expression of 20 DNA methylation modifiers to distinguish three DNA
methylation regulator patterns showed that three groups were obviously
separated, indicating that they were well distinguished based on the
expression of 20 DNA methylation regulators ([108]Figure 3B). Survival
analysis indicated that DNA methylation regulator pattern B was
correlated with a notably favorable prognosis ([109]Figure 3C) in
accordance with its highest activated CD8 T cell, activated CD4 T cell,
and activated dendritic cell infiltration ([110]Figure 3D; [111]Table
S3).
Figure 3.
[112]Figure 3
[113]Open in a new tab
DNA methylation modification patterns in the gathered GC cohort
(A) The table shows DNA methylation regulators enriched in different
DNA methylation regulator patterns. (B) Three-dimensional principal
component analysis (3D-PCA) for the expression of 20 DNA methylation
regulators to distinguish three DNA methylation regulator patterns in
1,619 patients. (C) Survival analysis of three DNA methylation
regulator patterns in 1,619 patients from eight cohorts, including 510
cases of DNA methylation regulator pattern A, 749 cases of DNA
methylation regulator pattern B, and 360 cases of DNA methylation
regulator pattern C. Log-rank test, p < 0.0001. (D) Boxplot of relative
immune cell abundance for three DNA methylation modification patterns
in 1,619 patients. ∗∗p < 0.01; ∗∗∗∗p < 0.0001. (E) GSVA analysis of
relatively activated hallmark gene sets among three DNA methylation
modification patterns. Red represents activated pathways, and blue
represents inhibited pathways. The GC cohorts and DNA methylation
regulator patterns were used as sample annotations.
Then, we explored the differences in signaling pathways among the three
DNA methylation regulator patterns. DNA methylation regulator pattern B
was enriched in immune/inflammation-related pathways, such as the IFN-γ
pathway, interleukin (IL)-6/JAK/STAT3 pathway, inflammatory response
pathway, and complement pathway ([114]Figure 3E; [115]Table S4),
indicating an immune/inflammation activation status in pattern B. DNA
methylation regulator pattern A represented a stroma activation
phenotype, with many enriched pathways, such as the
epithelial-mesenchymal transition (EMT) pathway, hypoxia pathway, TGF-β
pathway, and angiogenesis pathway ([116]Figure 3E).
Different clinical and transcriptome characteristics of the three DNA
methylation regulator patterns in the ACRG cohort
To further explore the potential DNA methylation regulator patterns in
GC, we focused our attention on the ACRG cohort, which had the most
comprehensive clinical information. Unsupervised clustering of the gene
expression of 20 DNA methylation regulators in the ACRG cohort also
identified three DNA methylation-related patterns ([117]Figures 4A,
[118]S4A, and S4D). The composition of DNA methylation regulators in
the three patterns was almost similar to that of the gathered cohort
([119]Figure S4E).
Figure 4.
[120]Figure 4
[121]Open in a new tab
Different clinical and transcriptome characteristics of the three DNA
methylation regulator patterns in the ACRG cohort
(A) Unsupervised clustering of 20 DNA methylation regulators in the
ACRG cohort. Red represents high expression, and blue represents low
expression. The DNA methylation regulator patterns, ACRG subtypes, MSI
status, stage, histology subtype, age, and survival status were used as
sample annotations. (B) Survival analysis of three DNA methylation
regulator patterns in the ACRG cohort, including 100 cases of DNA
methylation regulator pattern A, 132 cases of DNA methylation regulator
pattern B, and 68 cases of DNA methylation regulator pattern C.
Log-rank test, p < 0.0001. (C) Boxplot of several immune signatures for
three DNA methylation regulator patterns in the ACRG cohort. ∗∗∗∗p <
0.0001. (D) Stacked bar plot of ACRG molecular subtype for three DNA
methylation regulator patterns. MSI subtype, light blue; EMT subtype,
blue; MSS/TP53^+ subtype, red; MSS/TP53^− subtype, orange. (E) The
heatmap shows the mean differences of immune-related gene expression in
the three DNA methylation regulator patterns. The red square indicates
upregulation, while the blue square indicates downregulation.
Stimulatory molecular, yellow; inhibitory molecular, dark; MHC
molecular, gray. (F) Gene Ontology (GO) analysis depicted the enriched
pathways of DNA methylation-related genes. The color of the bar plot
represents the number of genes enriched in this pathway.
PCA in the ACRG cohort also indicated that patients were well
distinguished based on the expression of 20 DNA methylation regulators
([122]Figure S4F). Survival analysis revealed that DNA methylation
regulator pattern B showed a significantly more favorable prognosis
than did patterns A and C ([123]Figure 4B, B versus A, p = 0.00016, B
versus C, p = 0.00232), in accordance with its highest activated CD8
T cell, activated CD4 T cell, and activated dendritic cell infiltration
([124]Figure S4G). However, DNA methylation regulator pattern A did not
exhibit a survival advantage over pattern C ([125]Figure 4B, p =
0.70553), even though pattern A also had obvious CD8 T cell and B cell
infiltration ([126]Figure S4G). Previous studies reported that TGF-β
could restrain the antitumor immune response by restricting T cell
infiltration.[127]^26 We speculated that the stroma surrounding the
tumor, which inhibited the infiltration of immune cells, caused poor
prognosis of DNA methylation pattern A. Therefore, we identified
several immune infiltration-related signatures from Mariathasan
et al.’s[128]^26 study ([129]Table S5) and analyzed their association
with DNA methylation regulator patterns. The results confirmed that
pattern B was enriched in immune activation signatures such as CD8 T
effector, co-stimulation antigen-presenting cell (APC), immune
checkpoint, and type I IFN responses. Pattern A represented a stroma
activation phenotype, with many signatures, such as angiogenesis,
pan-F-TBRS, and EMT1–3 ([130]Figure 4C). In line with the
characteristics of immune cell infiltration and immune signatures, many
stimulatory immunomodulators or immune checkpoint molecules were
generally unregulated in DNA methylation regulator pattern B,
indicating a relatively hot tumor immune microenvironment
([131]Figure 4D). Moreover, based on the molecular subtypes of the ACRG
cohort, we found that most patients of the MSI subtype were classified
into pattern B, while most patients of the EMT subtype belonged to
pattern A, which further confirmed the conclusion ([132]Figures 4E and
[133]S4H).
The above results revealed that GC patients could be classified into
three immune phenotype groups based on the expression of 20 DNA
methylation modifiers. In the three groups, pattern B represented an
immune-inflamed phenotype characterized by a relatively hot immune
microenvironment and a high proportion of immune cell infiltration.
Pattern A represented an immune-excluded phenotype characterized by
stroma activation and immune cells surrounding the tumor. Pattern C
represented an immune-desert phenotype characterized by little immune
cell infiltration and immune repression status.
The clinical and transcriptomic characteristics of DNA methylation-related
gene clusters and the construction of DMS
To further explore the heterogeneity of different DNA methylation
regulator patterns, we recognized the differentially expressed genes
(DEGs) among groups. A total of 859 DNA methylation regulator
pattern-related genes were identified ([134]Figure S4I). Gene Ontology
(GO) analysis showed that the pathways were enriched in immune and DNA
methylation-related events ([135]Figure 4F), indicating that the
different clinical and transcriptomic characteristics among DNA
methylation regulator patterns might result from these differential DNA
methylation signature genes. Subsequently, a univariate Cox regression
analysis certified that 265 genes had prognostic value ([136]Table S6).
Unsupervised clustering analysis based on the expression of these 265
genes also divided GC patients into three clusters, which we called DNA
methylation gene clusters ([137]Figures 5A and [138]S5A–S5E). The
clinical analysis showed that patients in gene cluster A tended to have
an MSI status, earlier staging, and a better histology subtype
([139]Figure 5A). Additionally, survival analysis indicated that gene
cluster A had a better prognosis ([140]Figure 5B). In line with the
clinical characteristics, patients in gene cluster A presented higher
CD8 T effector, co-stimulation of APC, and type I IFN response
signatures ([141]Figure S5F) as well as a relatively hot immune
microenvironment ([142]Figure 5C). Patients of gene cluster C had
higher scores of angiogenesis, pan-F-TBRS, and EMT1-3 signatures
([143]Figure S5F). The result indicated that gene cluster C presented
an immune-excluded phenotype as DNA methylation pattern A. Above all,
these results reinforced the proposal that there were indeed three
different immune phenotype groups in GC, which represented different
clinical and immune characteristics.
Figure 5.
[144]Figure 5
[145]Open in a new tab
Clinical and transcriptomic characteristics of the DNA methylation gene
cluster and the construction of DMS
(A) Unsupervised clustering of 265 DNA methylation-related genes in the
ACRG cohort. The top clinical annotation included DNA methylation
regulator patterns, ACRG molecular subtypes, MSI status, stage,
histology, age, and survival status. Red represents high expression,
and blue represents low expression. (B) Survival analysis of three DNA
methylation-related gene clusters, including 108 cases in gene cluster
A, 117 cases in gene cluster B, and 75 cases in gene cluster C.
Log-rank test, p = 0.00013. (C) Heatmap showing the mean differences of
immune-related gene expression in three DNA methylation gene clusters.
The red square indicates upregulation, while the blue square indicates
downregulation. Stimulatory molecular, yellow; inhibitory molecular,
dark; MHC molecular, gray. (D) Boxplot of DMS for three DNA methylation
regulator patterns. The upper and lower ends of the boxes represent the
interquartile range of values. The lines in the boxes represent the
median value, and black dots show outliers. Kruskal-Wallis test, p <
0.0001. (E) Boxplot of DMS for three DNA methylation gene clusters.
Kruskal-Wallis test, p < 0.0001. (F) Boxplot of DMS for four ACRG
molecular subtypes. Kruskal-Wallis test, p < 0.0001. (G) Sankey diagram
depicting the relationship of DNA methylation regulator pattern, ACRG
molecular subtype, gene cluster, and DMS group.
To evaluate DNA methylation status individually, we further constructed
a risk score system based on 265 DNA methylation-related signature
genes, that is, the DMS. From this perspective, DNA methylation
regulator pattern B showed the highest median DMS, while DNA
methylation regulator pattern A showed the lowest median DMS
([146]Figure 5D). For DNA methylation gene clusters, cluster A showed
the highest median DMS, while cluster C showed the lowest median DMS
([147]Figure 5E). Among the four ACRG subtypes, the MSI subtype tended
to have the highest DMS, while the EMT subtype tended to have the
lowest DMS ([148]Figure 5F). With a median cutoff value of 0.0429, we
divided the patients into DMS-high and DMS-low groups. The relationship
of the DNA methylation regulator pattern, ACRG molecular subtype, gene
cluster, and DMS group is summarized in a Sankey diagram
([149]Figure 5G; [150]Table S7).
Prognostic value of DMS and validation of GC subtypes in TCGA cohort
Next, we explored the prognostic impact of DMS in GC patients. Survival
analysis indicated that the DMS-high group had prolonged survival time
in the ACRG cohort ([151]Figure 6A, p < 0.0001), which was further
validated in TCGA STAD cohort ([152]Figure 6B, p = 0.00039). The
multivariate Cox regression model confirmed that DMS was an independent
prognostic biomarker for evaluating patient survival in both the ACRG
and TCGA cohorts ([153]Figures S6A and S6B).
Figure 6.
[154]Figure 6
[155]Open in a new tab
Prognostic value of DMS and validation of GC subtypes in TCGA cohort
(A) Survival analysis of the high (n = 150) and low (n = 150) DMS
groups in the ACRG cohort based on median value. Log-rank test, p <
0.0001. (B) Survival analysis of the high (n = 177) and low (n = 193)
DMS groups in TCGA STAD cohort. Log-rank test, p = 0.00039. (C) Bar
chart depicting the relationship of DMS and various clinical
characteristics in the ACRG cohort. The ACRG subtypes, MSI status,
stage, histology subtype, age, and survival status were used as sample
annotations. (D) Bar chart depicting the relationship of DMS and
various clinical characteristics in TCGA cohort. TCGA subtypes, MSI
status, CIMP status, stage, histology subtype, age, and survival status
were used as sample annotations. (E) Boxplot of DMS for three
MethClusters in TCGA STAD cohort. Kruskal-Wallis test, p < 0.0001. (F)
Alluvial diagram depicting the relationship of DNA methylation
regulator pattern, TCGA molecular subtype, CIMP status, MethCluster,
and DMS group.
Additionally, we also conducted multi-cohort validation of the
prognostic value of DMS. In accord with the results in the ACRG and
TCGA STAD cohorts, the DMS-high group had prolonged overall survival
(OS) in the GEO: [156]GSE15459 (p = 0.023), [157]GSE26899 (p = 0.037),
[158]GSE26901 (p = 0.0032), and [159]GSE84437 (p = 0.014) cohorts, as
well as the gathered eight GEO GC cohorts (p < 0.0001) ([160]Figures
S6C–S6G). Additionally, DMS was also correlated with disease-free
survival (DFS), with DMS-high group patients having prolonged DFS (p <
0.0001) ([161]Figure S6H). In the three cohorts with recurrence
information, DMS was also correlated with recurrence-free survival
(RFS) in the GEO: [162]GSE26253 (p = 0.0019), [163]GSE26899 (p = 0.04),
and [164]GSE26901 (p = 0.005) cohorts ([165]Figures S6I–S6K).
In addition, DMS could also contribute to evaluating some clinical
characteristics. The DMS-high group tended to have more MSI status,
earlier staging, and better histology subtypes in the ACRG cohort
([166]Figure 6C). It has been reported that MSI status and EBV-positive
phenotype might be related to better immunotherapy efficacy.[167]^25 Of
the four TCGA STAD molecular subtypes, the MSI and EBV-positive groups
showed higher DMS scores ([168]Figures 6D and [169]S7A). The results of
the Seung GC cohort were also in accord with the results of TCGA cohort
([170]Figure S7B).
The above results revealed that DMS was a significant prognostic
biomarker that could effectively predict OS, DFS, RFS, and some
clinical characteristics of GC patients and offer potential clinical
application value.
To further investigate the relationship between DNA methylation
regulator patterns, DMS, and the methylation status of GC, we first
adopted four CIMP subtypes that were annotated by Liu et al.[171]^27
Results showed that patients with CIMP status tended to have high DMS
([172]Figure 6D). Additionally, we also downloaded TCGA STAD 450K data.
To our surprise, unsupervised clustering based on the 2,000 most
variable transcription start site (TSS) CpG sites also classified
patients into three groups ([173]Figure S7C), which we called
MethCluster. Survival analysis showed that MethCluster C2 had the best
prognosis, with the highest DMS ([174]Figures 6E and [175]S7D). The
proportion of immune cells and immune signature analysis showed that
MethCluster C2 had relatively high activated CD8 T cell and CD4 T cell
abundance and relatively high CD8 effector T cell, cytolytic activity,
and antigen presentation pathway scores ([176]Figures S7E and S7F). In
addition, MethCluster C2 showed abundant stimulatory or major
histocompatibility complex (MHC) molecule expression ([177]Figure S7G),
which represented a relatively hot immune microenvironment and might
explain its better prognosis. The relationship of DNA methylation
regulator patterns, TCGA subtype, CIMP status, MethCluster, and DMS was
visualized in an alluvial diagram ([178]Figure 6F). Above all, the
results of the DNA methylome further validated that there were three
DNA methylation-related molecular subtypes in GC, which revealed the
consistency of clustering between the transcriptome and DNA methylome.
Then, we also analyzed the correlation of intrinsic immune escape
mechanisms with DMS in TCGA STAD patients. The results showed that TMB
and tumor neoantigen burden (TNB) were relatively higher in the
DMS-high group, indicating the relatively high immunogenicity (Figures
S8A and S8B). While homologous recombination deficiency (HRD),
cancer-testis antigen (CTA), aneuploidy score, intratumor heterogeneity
(ITH), and stroma fraction were relatively higher in the DMS-low group
([179]Figures S8C–S8G), indicating a higher degree of malignancy and
lower immune cell infiltration.
Predictive value of DMS in immunotherapy
Immunotherapy has manifested improved survival in the treatment of
multiple tumor types, and it is urgent to identify patients who will
benefit most. Thus, we further explored whether DMS could predict the
efficacy of immunotherapy using four immune checkpoint blockade
treatment cohorts.
In the Seung CC cohort, which was the only public transcriptome dataset
in GC immunotherapy cohorts to our knowledge, patients in the response
group tended to have higher DMS ([180]Figures 7A, 7B, and [181]S9A). It
was reported that patients in MSI-H and EBV-positive groups showed
higher response rates.[182]^28 In line with this, these patients also
exhibited higher DMS ([183]Figure 7C). In addition, the mesenchymal
subtype has been demonstrated to be a negative predictor of response to
immunotherapy in GC.[184]^28 The mesenchymal subtype showed relatively
low DMS in this cohort ([185]Figure S9B). Moreover, we also found a
positive correlation between DMS and the expression of PD-L1
([186]Figure S9C), whose high expression indicated better efficacy of
immunotherapy. To our regret, the survival data of the patients in this
cohort were not accessible. Thus, three pan-cancer immunotherapy
cohorts were further adopted to evaluate the predictive value of DMS.
In the UC_Atezo cohort, with urothelial cancer patients treated with
atezolizumab, the high DMS group had prolonged OS (p = 0.038)
([187]Figure 7D). At the same time, the partial response (PR) and
complete response (CR) groups had relatively higher DMS (Kruskal-Wallis
test, p = 0.0014) ([188]Figures 7E, [189]S9D, and S9E). We also
evaluated the prognostic prediction efficiency of the combination of
TMB and DMS. The results showed that the DMS-high and TMB-high groups
had the best overall survival compared with the other groups
([190]Figure 7F). The immune phenotypes in this cohort have been
certified by experiments, and so we further investigated the
correlation of DMS with three classic immune phenotypes. As shown in
[191]Figure 7G, the patients with an immune-inflamed phenotype had the
highest DMS, which further confirmed our analysis above. The same
results were also found in the PRJEB23709 cohort and David Liu cohort.
The DMS-high group exhibited better prognosis and higher clinical
response rates ([192]Figures 7H–7K and [193]S9F–S9H).
Figure 7.
[194]Figure 7
[195]Open in a new tab
The role of DMS in four immune checkpoint treatment cohorts
(A) Boxplot of DMS for distinct clinical response groups in the Seung
GC immunotherapy cohort. No response, blue; response, red. Wilcoxon
test, p = 0.00017. (B) Waterfall plot of DMS for distinct clinical
response groups in the Seung GC immunotherapy cohort. (C) Boxplot of
DMS for different TCGA subtypes in the Seung GC immunotherapy cohort.
The MSI-H and EBV groups showed higher DMS. Kruskal-Wallis test, p =
0.0024. (D) Survival analysis of the high (n = 204) and low (n = 144)
DMS groups in the UC_Atezo immunotherapy cohort. Log-rank test, p =
0.038. (E) Boxplot of DMS for distinct clinical response groups. The PR
and CR groups had relatively higher DMS. Kruskal-Wallis test, p =
0.0014. (F) Survival analysis of distinct groups stratified by both TMB
and DMS. (G) Boxplot of DMS for three established immune phenotypes in
the UC_Atezo immunotherapy cohort. Kruskal-Wallis test, p < 0.001. (H)
Survival analysis of the high (n = 54) and low (n = 37) DMS groups in
the PRJEB23709 immunotherapy cohort. Log-rank test, p < 0.0001. (I)
Stacked bar plot depicting different fractions of clinical response
patients in the high and low DMS groups in the PRJEB23709 immunotherapy
cohort. PR/CR, red; PD/SD, blue. (J) Survival analysis of the high (n =
21) and low (n = 100) DMS groups in the David Liu immunotherapy cohort.
Log-rank test, p = 0.028. (K) Boxplot of DMS for distinct clinical
response groups in the David Liu immunotherapy cohort. PD/SD, blue;
CR/PR, red. Wilcoxon test, p = 0.022. (L) The multiple bar plots for
the AUC values of the 11 ICT response signatures.
Importantly, in order to further evaluate the predictive performance of
the DMS in GC, we compared the DMS with 10 previously published gene
expression immune-related signatures studied in response to immune
checkpoint inhibitors (ICIs).[196]29, [197]30, [198]31, [199]32,
[200]33, [201]34, [202]35 The results showed that the DMS was the best
signature for predicting response to immunotherapy with an area under
the curve (AUC) value of 0.843 ([203]Figures 7L and [204]S9I).
Above all, the results of these four immunotherapy cohorts solidly
certified that DMS had the ability to efficiently predict the efficacy
of immunotherapy and might achieve better predictive value when
combined with TMB.
Discussion
In recent years, growing evidence has revealed that DNA methylation
plays crucial roles in the regulation of antitumor immunity and the
response to immunotherapy.[205]21, [206]22, [207]23 However, the global
profiling of DNA methylation regulator patterns and its impact on the
immune microenvironment of GC remains to be clarified. Identifying the
relationship between different DNA methylation modification patterns
and immune cell infiltration in GC is beneficial to deepen our
understanding of the tumor immune microenvironment and to guide more
effective immunotherapy strategies.
In this study, we first recognized three DNA methylation regulator
patterns that have distinct immune characteristics by unsupervised
clustering of the gene expression of DNA methylation regulators. The
results were consistent with three classical immune phenotypes:
immune-inflamed, immune-excluded, and immune desert.[208]^26 DNA
methylation regulator pattern B was characterized by activation of
multiple immune/inflammation-related pathways and a higher proportion
of CD8^+ effector cell and APC infiltration. In addition, many
stimulatory immunomodulators were generally upregulated, indicating
that DNA methylation regulator pattern B had a hot immune
microenvironment. In line with this, patients with pattern B had the
best prognosis. Contrary to pattern B, DNA methylation regulator
pattern C had little immune cell infiltration and extremely low
expression of stimulatory immunomodulators or MHC molecules, which
revealed a cold microenvironment. Although DNA methylation regulator
pattern A also had a high proportion of CD8^+ effector cell and other
immune cell infiltration, the activation of the stroma and TGF-β
pathway hindered the penetration of immune cells into the parenchyma of
the tumor, which presented an immune-excluded phenotype. Therefore, it
was not surprising that DNA methylation regulator pattern A has poorer
prognosis than pattern B.
Furthermore, DEGs among these three DNA methylation regulator patterns
were considered DNA methylation-related signature genes and might be
directly or indirectly regulated by DNA methylation events. Similar to
the results of the DNA methylation regulator patterns, three DNA
methylation gene clusters that correlated with immune or stromal
activation were recognized based on these DNA methylation signature
genes. These results demonstrated that there were indeed three distinct
immune subtypes in GC.
Considering the heterogeneity of DNA methylation modification
individually, there was a need to quantify the DNA methylation
modification profiles of a single tumor. Thus, a DNA
methylation-related scoring system called DMS was further constructed
and validated in multiple GC datasets. The immune-inflamed subtype has
higher DMS, which was further validated in a cohort whose immune
phenotype was already determined by experiments.[209]^26 Multivariate
analysis also indicated that DMS was an independent prognostic factor
in GC. Patients with EBV-positive or MSI status that tended to benefit
from immunotherapy[210]^25 had relatively high DMS. In addition, DMS
also showed excellent predictive value in multiple immunotherapy
cohorts.
We also conducted unsupervised clustering using promoter DNA
methylation sites of GC in TCGA. To our surprise, patients were also
divided into three clusters. Patients of MethCluster C2 had a high
proportion of activated CD4 and CD8 T cell infiltration and a hot
immune microenvironment, in line with better prognosis and higher DMS.
These results revealed a consistency between the transcriptome and DNA
methylome and further validated that there were indeed three distinct
immune subtypes in GC.
In view of the clinical significance of our study, we constructed a DMS
that can evaluate the DNA methylation profiles individually. The
predictive value of DMS for survival was validated in multiple GC
cohorts. Additionally, DMS could also be used for assessing the
clinicopathological features of patients, such as MSI status, EBV
status, histology subtype, molecular subtypes, clinical stages, and
TMB, among others. In addition, DMS had the ability to predict the
efficacy of PD-1/PD-L1 immune checkpoint blockade therapy, which was
validated in a GC immunotherapy cohort and three pan-cancer
immunotherapy cohorts. Moreover, when combined with TMB, the widely
accepted immunotherapy biomarker DMS revealed better predictive
performance. Therefore, DMS might be an excellent biomarker for
predicting the efficacy of immune checkpoint therapy and might promote
personalized GC immunotherapy in the future.
There were also some limitations to our study. First, the survival data
of the GC immunotherapy cohort were not accessible. The predictive
performance of DMS for GC needs to be further certified in the future.
Second, only the median cutoff of DMS in the ACRG cohort was used to
classify the GC, and the optimal cutoff value of the DMS might be
needed to better stratify the GC patients.
In conclusion, we identified three distinct immune subtypes in GC from
the perspective of DNA methylation and constructed an individual DNA
methylation profile scoring system. DMS is a valuable tool for the
prediction of survival, clinicopathological characteristics, and the
efficacy of immunotherapy and might help to promote personalized GC
immunotherapy in the future.
Materials and methods
Collection of GC datasets and preprocessing
The workflow of our work is shown in [211]Figure S1A. Publicly
available data from the GEO and TCGA databases were used in this study.
Patients without survival information were removed from further
evaluation. In total, we gathered eight GC cohorts (ACRG, GEO:
[212]GSE15459, [213]GSE34942, [214]GSE57303, [215]GSE84437,
[216]GSE26899, and [217]GSE26901, and TCGA) including 1,619 patients
for further analysis.
For TCGA STAD cohort, RNA sequencing data (fragments per kilobase of
transcript per million mapped reads [FPKM] values) were downloaded via
the R package TCGAbiolinks.[218]^36 Then, FPKM values were transformed
into transcripts per kilobase million (TPM) values that were more
similar to those resulting from microarrays. Somatic mutation (SNPs and
small INDELs) and methylation 450K data were downloaded from the
University of California Santa Cruz (UCSC) Xena browser
([219]https://xenabrowser.net). Clinical data were collected from (1)
the corresponding GEO dataset metadata, and (2) the supplemental files
of relevant articles.[220]^27^,[221]37, [222]38, [223]39 Batch effects
resulting from non-biological technical biases were corrected using the
ComBat algorithm of the sva package. All baseline information of
eligible GC datasets is summarized in [224]Table S1.
Collection of immune-related data
The immune-related features and genes of TCGA GC patients are available
at [225]https://gdc.cancer.gov/about-data/publications/panimmune. Four
immune checkpoint blockade treatment cohorts with available expression
and clinical information were used in our study: (1) Seung GC cohort,
metastatic GC with pembrolizumab treatment; (2) David Liu cohort,
metastatic melanoma with nivolumab or pembrolizumab treatment; (3)
UC_Atezo cohort, urothelial cancer with atezolizumab treatment; and (4)
PRJEB23709 cohort, melanoma with ipilimumab + nivolumab/pembrolizumab
or nivolumab/pembrolizumab treatment.
Ten previously published immune related signatures studied in response
to ICI were collected to compare with our established signature DMS
([226]Table S8).
Crosstalk among DNA methylation regulators
The protein-protein interactions among DNA regulators were identified
based on the STRING[227]^40 interaction database and were further
visualized by Cytoscape.[228]^41 The size of a node indicates the
modifier numbers interacting with it.
Unsupervised clustering for DNA methylation genes
Unsupervised clustering methods were used to identify different DNA
methylation modification patterns and classify patients for further
analysis. A total of 20 regulators that were collected from published
articles were extracted from eight integrated GEO big datasets or the
ACRG cohort to identify different DNA modification patterns mediated by
DNA methylation modifiers. A consensus clustering algorithm was
performed using the R package ConsensuClusterPlus[229]^42 and was
repeated 1,000 times in order to ensure the stability of clustering.
Gene set variation analysis (GSVA) and single-sample GSEA (ssGSEA)
The R package GSVA[230]^43 was used to quantify the activity of
biological pathways. Immune gene signatures were collected from
previously published works.[231]^26^,[232]^38^,[233]^44 All hallmark
gene sets were downloaded from the Molecular Signature Database
(MSigDB) to compare differences between DNA modification patterns. The
10 most common oncogenic hallmarks were obtained from the supplementary
table of Sanchez-Vega et al.’s[234]^45 work. The ssGSEA algorithm in
the R package GSVA was used to estimate the relative abundance of each
immune cell in GC. The gene sets defining each immune cell type were
downloaded from the study of Charoentong[235]^46 ([236]Table S9).
DEGs among DNA patterns
DEGs among different DNA modification patterns were determined using
the R package limma.[237]^47 The significance criterion for DEGs was
set as an adjusted p value of <0.01.
Functional and pathway enrichment analysis
GO analysis was performed to identify enriched GO terms using the
R package clusterProfiler[238]^48 with a cutoff of p value of <0.05 and
an adjusted p value of <0.2. To identify the most related pathways of
DNA modifiers, the gseKEGG function of the R package clusterProfiler
was used. The DEGs list was estimated between groups with high and low
expression of this gene and ordered according to the fold change.
Generation of the DMS
First, the prognostic analysis was performed for each gene in the 859
DEGs using a univariate Cox regression model. A total of 265 genes with
significant prognosis were extracted for further analysis. Then, the
expression of these genes was transformed into a Z score. PCA was
conducted to construct a DNA methylation relevant score, which we
called DMS. Both PC1 and PC2 were selected to serve as signature
scores. This method offered the advantage of focusing the score on the
set with the largest block of well-correlated (or anticorrelated) genes
in the set, while downweighting contributions from genes that do not
track with other set members:[239]^38^,[240]^49
[MATH: DMS=∑PC1i + PC2i, :MATH]
where i is the expression of DNA methylation regulator pattern-related
signature genes.
Statistical analysis
The normality of the variables was tested using the Shapiro-Wilk
normality test.[241]^50 For comparisons of two normally distributed
groups, statistical analysis was performed by unpaired t tests, and for
nonnormally distributed variables, statistical analysis was analyzed by
a Wilcoxon rank-sum test. For comparisons of three groups,
Kruskal-Wallis tests or one-way ANOVA were used as nonparametric or
parametric methods, respectively. Correlation coefficients were
computed by Spearman and distance correlation analyses. The best cutoff
values of each cohort were evaluated using the surv-cutpoint function
in the survminer package. The survival curves for the prognostic
analysis were conducted via the Kaplan-Meier method, and log-rank tests
were utilized to judge differences between groups. The univariate Cox
regression model was utilized to calculate the hazard ratios (HRs) for
DNA regulators and DNA methylation regulator pattern-related genes. All
statistical p values were two-sided, with p <0.05 considered as
statistically significant. All statistical analyses were conducted
using R 3.6.1 ([242]https://www.r-project.org/).
Acknowledgments