Abstract
Background: DNA methylation played essential roles in regulating gene
expression. The impact of DNA methylation status on the occurrence and
development of cancers has been well demonstrated. However, little is
known about its prognostic role in breast cancer (BC).
Materials: The Illumina Human Methylation450 array (450k array) data of
BC was downloaded from the UCSC xena database. Transcriptomic data of
BC was downloaded from the Cancer Genome Atlas (TCGA) database.
Firstly, we used univariate and multivariate Cox regression analysis to
screen out independent prognostic CpGs, and then we identified
methylation-associated prognosis subgroups by consensus clustering.
Next, a methylation prognostic model was developed using multivariate
Cox analysis and was validated with the Illumina Human Methylation27
array (27k array) dataset of BC. We then screened out differentially
expressed genes (DEGs) between methylation high-risk and low-risk
groups and constructed a methylation-based gene prognostic signature.
Further, we validated the gene signature with three subgroups of the
TCGA-BRCA dataset and an external dataset [41]GSE146558 from the Gene
Expression Omnibus (GEO) database.
Results: We established a methylation prognostic signature and a
methylation-based gene prognostic signature, and there was a close
positive correlation between them. The gene prognostic signature
involved six genes: IRF2, KCNJ11, ZDHHC9, LRP11, PCMT1, and TMEM70. We
verified their expression in mRNA and protein levels in BC. Both
methylation and methylation-based gene prognostic signatures showed
good prognostic stratification ability. The AUC values of 3-years,
5-years overall survival (OS) were 0.737, 0.744 in the methylation
signature and 0.725, 0.715 in the gene signature, respectively. In the
validation groups, high-risk patients were confirmed to have poorer OS.
The AUC values of 3 years were 0.757, 0.735, 0.733 in the three
subgroups of TCGA dataset and 0.635 in [42]GSE146558 dataset.
Conclusion: This study revealed the DNA methylation landscape and
established promising methylation and methylation-based gene prognostic
signatures that could serve as potential prognostic biomarkers and
therapeutic targets.
Keywords: DNA methylation, gene expression, breast cancer, prognostic
signature, the cancer genome atlas (TCGA)
Introduction
Breast cancer (BC) is the most prevalent cancer and the leading cause
of cancer mortality in women worldwide ([43]Freddie et al., 2020). In
the United States, it is estimated that around 13% of women will suffer
from BC in their lifetime ([44]DeSantis et al., 2019). In recent years,
the mortality of BC has gradually declined and the 5-years OS rate has
reaches 90% attributed to early detection and improved treatment
([45]Allemani et al., 2018).
Breast cancers are categorized into ER+, ER+/HER2-, HER2+ and
Triple-negative subtypes based on the expression of estrogen receptor
(ER), progesterone receptor (PR) and human epidermal growth factor
receptor 2 (HER2). Similarly, gene expression analysis of these
receptors further recognizes four subsets of BC: luminal A, luminal B,
HER2-enriched (HER2-E) and Basal-like ([46]Parker et al., 2009). These
classification systems not only help predict the prognosis of BC
patients, but also guide treatment choices. Conventional therapies such
as surgery, radiotherapy, and chemotherapy form the basis of BC
treatment. In addition, endocrine therapy for hormone receptor-positive
BC and anti-HER2 treatment for HER2 expressing BC have greatly improved
the prognosis of patients. Unfortunately, triple negative breast cancer
(TNBC) still lacks effective therapeutic targets. Recent studies
demonstrated that poly ADP-ribose polymerase 1 (PARP1) inhibitors and
immune checkpoint inhibitors (ICIs) showed potential effect in TNBC
([47]Mittendorf et al., 2020), ([48]Schmid et al., 2020). Despite the
great achievements in treatment, about 25–40% of BC patients will
develop metastases ([49]Siegel et al., 20172017). Among them, bone
metastases are the most common, and approximately 75% of late-stage BC
patients are diagnosed with bone metastases ([50]Tulotta and Ottewell,
2018). Moreover, 5–20% of BC patients would have brain metastases
([51]Achrol et al., 2019). Once the patient develops metastasis, the
prognosis is poor, with the median OS of only 1–2 years ([52]Redig and
McAllister, 2013), ([53]Martin et al., 2017). Therefore, it is urgent
to find potential prognosis-related biomarkers to accurately predict
the prognosis of BC patients.
Epigenetics events such as DNA methylation, histone modifications,
chromatin remodeling, and non-coding RNAs play essential roles in the
regulation of gene expression and actively participate in the
development and progression of cancers. DNA methylation, which affects
gene expression without changing the DNA sequence, is the most common
epigenetic modification ([54]Nakao, 2001), ([55]Strahl and Allis,
2000). Stefansson et al. demonstrated that abnormal methylation of CpG
islands in the promoter regions might activate proto-oncogenes or
silence tumor suppressor genes, thereby contributing to the occurrence
and development of tumors ([56]Stefansson and Esteller, 2013).
Accumulating evidences showed that decreased levels of genome-wide
methylation were a critical sign of early cancers and were related to
cancer grade and metastasis ([57]Yang and Schwartz, 2011), ([58]Ding et
al., 2019). Indeed, DNA methylation was associated with most
malignancies including bladder cancer ([59]Chen et al., 2020a), lung
cancer ([60]Liang et al., 2019), and gastrointestinal tumors ([61]Woo
et al., 2018), ([62]Huang et al., 2018).
Emerging studies have revealed the important roles of DNA methylation
in BC ([63]Pasculli et al., 2018; [64]Kresovich et al., 2019; [65]Xu et
al., 2020). For instance, distinct DNA methylation patterns and
associated gene expression profiles were found in different molecular
subtypes of BC. SFRP1, a tumor suppressor gene, was down-regulated by
hypermethylation in ER + breast cancer, leading poor prognosis
([66]Stefansson et al., 2015), ([67]Park et al., 2012). Other genes
such as BRCA1, CDH1, and PTEN, were also abnormally methylated in BC.
These events could serve as potential therapeutic and prognostic
biomarkers ([68]FitzGerald et al., 1998; [69]Pharoah et al., 2001;
[70]King et al., 2003; [71]Walsh et al., 2006; [72]Suijkerbuijk et al.,
2008; [73]Luo et al., 2016). However, the prognostic role of DNA
methylation in BC remains incompletely demonstrated.
In this study, we used bioinformatics methods to determine the
prognostic role of DNA methylation and constructed
methylation-associated prognostic signatures for BC. This study will
help unveil the significance of DNA methylation in BC and might help
discover novel prognostic biomarkers.
Materials and Methods
Data Acquisition and Processing
RNA-seq data in fragments per kilobase of transcript per million mapped
reads (FPKM) form and clinical information of BC were downloaded from
the Cancer Genome Atlas (TCGA: [74]https://portal.gdc.cancer.gov/)
database. Illumina Human Methylation450 BeadChip array (450k array) and
Illumina Human Methylation27 BeadChip array (27k array) data of TCGA
database were downloaded from UCSC xena ([75]https://xenabrowser.net/)
([76]Goldman et al., 2020). DNA Methylation levels were evaluated by
the β value, which ranged from 0 to 1 (0 means unmethylated and 1 means
fully methylated). Probes with over 70% of missing values and probes
located at chromosomes X and Y were removed. The missing values of the
remaining probes were imputed using the k-nearest neighbours (knn)
imputation algorithm of the impute R package. Since DNA methylation in
promoter regions would strongly influence gene expression, we focused
on the methylation probes in promoter regions defined as 2.0 kb
upstream to 0.5 kb downstream from transcription start sites (TSS).
Batch effects were removed by the ComBat algorithm of the sva R package
([77]Leek et al., 2012). Ultimately, 560 patients including methylation
data (from 450k array) and corresponding clinical data, 986 patients
(from TCGA database) containing both gene expression data and
corresponding clinical data were used for mainly analysis. And we
obtained 557 overlapping patients (from the above two datasets) with
complete gene expression data, methylation data and clinical data.
Moreover, RNA-seq data and clinical information of 106 samples from
[78]GSE146558 were downloaded from NCBI (GEO:
[79]https://www.ncbi.nlm.nih.gov/) as external validation dataset. The
mRNA expression profile from GEO dataset was normalized by the Robust
Multichip Average (RMA) algorithm with background adjustment, quantile
normalization, and final summarization. The workflow of our study was
illustrated in [80]Figure 1.
FIGURE 1.
[81]FIGURE 1
[82]Open in a new tab
The workflow for this study. DEGs, differentially expressed genes.
Independent Prognostic CpG Loci Screening
Univariate Cox regression analysis was performed to screen out the
prognosis-related CpGs of the 560 BC patients. In our study, we used
the OS as clinical parameter of prognosis. Next, all these CpGs were
subjected to multivariate Cox regression analysis, with age,
pathological stage, and TNM stage as covariates to identify independent
prognostic CpGs.
GO and KEGG Analysis
The Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes
(KEGG) ([83]http://www.genome.jp/kegg/) ([84]Kanehisa et al., 2017)
analysis were performed using the R ClusterProfiler package ([85]Yu et
al., 2012). A p value < 0.05 was set as the cut-off value for both GO
and KEGG analyses in our study.
Consensus Clustering and Evaluation of CpG-Related Subtypes
Consensus clustering ([86]Wilkerson and Hayes, 2010) was performed to
determine subgroups of different methylation characteristics of the 560
BC patients based on the independent prognostic CpG loci using the
ConsensusClusterPlus package of R. The criteria to determine the number
of clusters were: ([87]Freddie et al., 2020) The consistency within the
cluster was relatively high; ([88]DeSantis et al., 2019) There was no
significant increase in the area under the CDF curve; ([89]Allemani et
al., 2018) The relative change in area under CDF curve tended to be
stable. We then generated a consensus matrix to better visualize and
help determine the number of clusters.
Construction and Validation of the Methylation Prognostic Model
Differentially methylated independent prognostic CpG loci between the
different prognosis clusters were screened out using Wilcoxon test. The
filtering conditions were false discovery rate (FDR) < 0.05 and | log2
(fold change) | >0.585. On the basis of these differentially methylated
CpG loci, a methylation prognostic model of 560 BC patients was
constructed using multivariate Cox analysis. The formula of the risk
score was as follows:
[MATH: Risk score=∑coefiβi :MATH]
where coef[i] was the multivariate Cox regression coefficient, and βi
was the corresponding methylation β value. According to the median risk
score ([90]Chen et al., 2020b), ([91]Shen et al., 2019), patients were
divided into methylation low-risk (n = 280) and high-risk (n = 280)
groups. Survival curves were employed to compare the OS of the two
groups. Risk curve was plotted to visualize the relationship of the
risk score, survival status and the methylated levels of the six
signature CpG loci. Univariate and multivariate Cox regression analysis
were performed to explore whether the risk score could be an
independent predictor of OS. The sensitivity and specificity of the
methylation prognostic model were evaluated by calculating the area
under the curve (AUC) of the receiver operating characteristic (ROC)
curve.
Construction and Evaluation of Gene Prognostic Model
Differentially expressed genes (DEGs) were identified from methylation
high-risk and low-risk groups (of 557 BC patients’ dataset) with the
filter conditions of adjusted p < 0.05 and | log2 (fold change) | >1.
And then we further extracted the above-mentioned DEGs from the
transcriptomic profiles of 986 BC patients’ dataset for subsequent
analysis. Kaplan-Meier analysis and univariate Cox regression analysis
were employed to investigate prognosis-related DEGs of the 986 samples.
Similarly, a gene prognostic signature was constructed by multivariate
Cox regression analysis. And the risk formula was:
[MATH:
Risk score=∑i=1<
/mn>n(ex
pressio<
/mi>ni∗coefi)
mrow> :MATH]
Patients were also categorized into low-risk (n = 493) and high-risk (n
= 493) groups based on the median risk score. Kaplan-Meier survival
curve, risk curve, ROC curve, univariate and multivariate Cox
regression analysis were also used for evaluating and validating the
prognostic signature. Besides, two subgroups from 986 BC patients, the
557 patients’ dataset and [92]GSE146558 dataset were used to validate
the prognostic value of the gene signature.
Antitumor Drug Sensitivity Analysis
CellMiner ([93]https://discover.nci.nih.gov/cellminer/home.do) is a
robust, user-friendly online database that integrates drug sensitivity
and genomic data ([94]Reinhold et al., 2017), ([95]Wang et al., 2016).
Anti-tumor activity data obtained from NCI-60 tumor cell line panel of
the developmental therapeutics program (DTP) and RNA-seq data for the
60 cell lines of the NCI DTP drug screen were downloaded from this
website. Subsequently, correlation between the sensitivity of
anti-tumor drugs and the signature genes was analyzed.
Statistical Analysis
R 3.6.3 (version 3.6.3,
[96]https://pan.baidu.com/s/1sufVf2lmoj9GYG_j5_fJKQ) was used for
statistical analysis and plotting. Consensus clustering was performed
using the ConsensusClusterPlus package of R; COX regression analysis
was performed with the coxph function in survival package of R
([97]Zhang, 2016); Kaplan-Meier curve was plotted using the survival
and survminer packages of R; Pheatmap was plotted using the pheatmap
package of R; The forest plots were plotted by the forestplot package
of R; ROC curve was plotted by the survival ROC package of R. GO and
KEGG analyses were performed using the ClusterProfiler package of R.
Mann-Whitney test was used to estimate the statistical significance of
two groups of skewed distributed continuous variables, and
Kruskal-Wallis test was used to evaluate the statistical significance
of multiple groups of skewed distributed continuous variables (with
Bonferroni correction for pairwise comparisons among multiple groups).
All tests were two-sided and for all statistical tests, p < 0.05 was
considered to be statistically significant unless otherwise specified.
Results
Screening of Prognosis-Related CpG Loci in Breast Cancer
In our study, 450k array dataset was defined as train group and 27k
array dataset was defined as test group ([98]Table 1). Firstly, 144 CpG
loci with p < 0.001 by univariate Cox analysis were screened out and
identified as prognosis-related CpG loci. Using age, pathological
staging and TNM staging as covariates, 66 CpG loci (49 CpG loci
associated with favorable prognosis, 17 CpG loci associated with poor
prognosis) with p < 0.001 by multivariate Cox analysis were further
selected and used as the methylation classification features
([99]Figure 2).
TABLE 1.
General clinical characteristics of 560 BC patients.
Parameters Methylation train group Methylation test group
(n = 560) (n = 278)
Age — —
≤65 years 438 (78.3) 191 (68.7)
>65 years 122 (21.7) 87 (31.3)
Pathologic stage[100] ^a — —
I/II 418 (74.6) 215 (77.3)
III/IV 142 (25.4) 57 (20.5)
unknow 0 6 (2.2)
T stage[101] ^a — —
T1/2 479 (85.5) 238 (85.6)
T3/4 81 (14.5) 40 (14.4)
N stage[102] ^a — —
N0 254 (45.4) 143 (51.4)
N1/3 306 (54.6) 128 (46.1)
unknow 0 7 (2.5)
[103]Open in a new tab
^a
Staging according to Seventh Edition AJCC, Guidelines (Edge SB, Byrd
DR, Compton CC, Fritz AG, Greene FL, Trotti A, eds. AJCC, Cancer
Staging Manual. Seventh ed New York, NY: springer; 2010).
FIGURE 2.
[104]FIGURE 2
[105]Open in a new tab
Significance and hazard ratio values of 66 independent
prognosis-related CpG loci obtained from multivariate Cox regression
analysis.
Identification of DNA Methylation-Based Prognosis Subgroups
Then, the 560 patients were categorized into clusters of different
methylation characteristics with consensus clustering based on the
methylation of the 66 independent prognostic CpG loci. When the
patients were assigned to 5 categories, the consistency within the
clusters was high, the area under the cumulative distribution function
(CDF) curve began to stabilize, and the relative change in area under
CDF curve tended to be stable ([106]Figures 3A,B). A consensus matrix
representing the consensus for k = 5 also displayed a well-defined
5-block structure ([107]Figure 3C). Accordingly, the optimal number of
clusters was determined to be 5.
FIGURE 3.
[108]FIGURE 3
[109]Open in a new tab
The methylated levels and prognosis of consensus clustering subgroups
of breast cancer. (A,B) Consensus clustering cumulative distribution
function for k = 2–9. (C) Consensus clustering matrix for k = 5. (D)
Heatmap of the differentially methylated levels of the five subgroups.
(E) Methylated levels of the 66 independent prognosis-related CpG loci
among the five clusters. (F) Survival analysis of the five subgroups.
Subsequently, we conducted a subgroup analysis for the 5 clusters.
Firstly, we compared the methylation levels of these 66 independent
prognostic methylation loci among the five clusters. As illustrated in
[110]Figures 3D,E, Cluster 5 had the lowest methylation levels,
followed by Clusters 2 and 4, Clusters 1 and 3. And the methylated
difference between Cluster 5 and each of the remaining clusters was
statistically significant. To explore the prognostic significance of
the five clusters, Kaplan-Meier survival analysis was performed. We
found that the prognosis was statistically significantly different
among the 5 clusters, where Cluster 1 and Cluster 3 had the best OS,
while Cluster 5 had the worst ([111]Figure 3F).
Construction and Evaluation of Methylation Prognosis Model
The five clusters were significantly prognosis-associated, and
therefore were used to identify potential prognostic biomarkers. Given
that Cluster 5 had the lowest methylation level and the worst OS, it
was reasonable to be selected as the reference cluster. Next, 20
differentially methylated independent prognostic CpG loci were
identified from Cluster 5 and the rest clusters ([112]Table 2).
Ultimately, a methylation prognostic model was constructed which
included six CpG loci (cg00945507, cg05406101, cg10092957, cg13060154,
cg14992108, cg18678121) determined by multivariate Cox analysis
([113]Table 3). Kaplan-Meier analysis showed that cg00945507,
cg05406101, cg10092957, cg14992108, cg18678121 were associated with
improved survival, and cg13060154 was associated with poor survival
([114]Figure 4).
TABLE 2.
Characteristics of the differential prognosis-related CpGs by wilcoxon
rank-sum test (cluster 5 vs. the rest clusters).
CpGs Log2 FC p value FDR
cg00945507 −1.078388031 6.76E-07 1.35E-06
cg02022375 −1.043706338 1.58E-12 1.30E-11
cg02630888 −0.851143204 1.04E-12 1.15E-11
cg05406101 −1.143837139 1.23E-15 8.12E-14
cg06646021 −0.804998375 5.45E-10 1.89E-09
cg10092957 −1.057286596 1.11E-07 2.44E-07
cg11072113 −0.594310951 1.86E-11 1.02E-10
cg13060154 0.6679634 0.030369072 0.04090528
cg14992108 −0.673767411 2.43E-10 1.07E-09
cg15798153 −0.959416408 2.90E-14 9.57E-13
cg16466334 −0.985246418 4.22E-12 3.09E-11
cg16522484 −0.696605561 9.22E-14 2.03E-12
cg18678121 −0.729884099 1.14E-07 2.44E-07
cg19094438 −1.478722276 5.50E-12 3.63E-11
cg21032583 −1.145936985 2.48E-13 3.28E-12
cg22197830 −1.050980669 1.22E-12 1.15E-11
cg24194539 1.144078012 0.010277521 0.014432263
cg26065841 −0.62551186 2.25E-13 3.28E-12
cg26147480 −0.695572224 1.37E-10 6.47E-10
cg27653134 −0.904508613 3.80E-11 1.93E-10
[115]Open in a new tab
TABLE 3.
Formula of Methylation prognostic model.
CpG loci Coef HR
cg00945507 −4.01607 0.018024
cg05406101 1.117929 3.058513
cg10092957 −1.78259 0.168202
cg13060154 1.655685 5.236664
cg14992108 −1.28305 0.27719
cg18678121 −1.1362 0.321038
[116]Open in a new tab
FIGURE 4.
[117]FIGURE 4
[118]Open in a new tab
The survival difference of hypermethylation and hypomethylation of the
six signature CpG loci.
Then we explored the mechanisms by which these signature CpG loci might
act on BC. The six CpG loci, cg00945507, cg05406101, cg10092957,
cg14992108, cg18678121, and cg13060154, were located at gene promoter
regions of SEC61G, RWDD2B, NCCRP1, SNTB1, SEC61A2, DAB2IP,
respectively. We firstly analyzed the correlation of these CpG loci and
their corresponding target genes. The methylation of cg10092957,
cg05406101, cg18678121, cg00945507 were moderately negatively
correlated with the expression of their target genes. Whereas
cg13060154 was weakly positively correlated with its corresponding gene
([119]Supplementary Figures S1A–F). Consistent with the above results,
the increased expression of NCCRP1, RWDD2B, SEC61A2, and SEC61G was
associated with the decreased β values of cg10092957, cg05406101,
cg18678121, and cg00945507, respectively. To the opposite of them,
DAB2IP had higher expression in the presence of hypermethylated
cg13060154 ([120]Supplementary Figures S1G–L). However, there was no
relationship between the cg14992108 and its target gene SNTB1. In
addition, we took advantage of TCGA Wanderer, an interactive viewer
exploring DNA methylation and gene expression data in human cancer
([121]http://maplab.imppc.org/wanderer/) ([122]Díez-Villanueva et al.,
2015), to explore the methylated difference of the six CpG loci between
breast cancer and normal tissues. We found that the methylation levels
of cg05406101, cg18678121, cg14992108 were higher in normal tissues.
However, the methylation levels of cg13060154 and cg10092957 were
higher in breast cancer ([123]Supplementary Figures S1M–R). We also
explored the prognostic roles of the six CpG loci in breast cancer
through the public database MethSurv ([124]Modhukur et al., 2018)
([125]https://biit.cs.ut.ee/methsurv/). High methylation levels of
cg00945507, cg05406101, cg10092957, cg14992108, cg18678121 were
associated with favorable prognosis. On the contrary, high methylation
level of cg13060154 was associated with poor survival
([126]Supplementary Figures S1S–X).
On the basis of multivariate Cox regression, we developed the following
risk model:
Risk score = −4.016 × cg00945507 + 1.117 × cg05406101−1.78 × cg10092957
+ 1.655 × cg13060154−1.283 × cg14992108−1.136 × cg18678121 ([127]Table
3).
According to the formula, we computed the risk score of each BC patient
in the train group (n = 560) and assigned them into high-risk (n = 280)
and low-risk groups (n = 280) with reference to the median risk score.
The methylation levels of the six CpG loci between the high-risk group
and low-risk group were showed in [128]Supplementary Figures S2A.
Kaplan-Meier curve indicated that the high-risk group had significantly
poorer OS than the low-risk group ([129]Figure 5A). To confirm the
methylation risk score could effectively predict the BC patients’
prognosis, we plotted the ROC curve. Notably, we observed that the risk
score had the highest prediction performance of prognosis compared with
the conventional clinical features, with the 3-years and 5-years AUC
values being 0.739 and 0.744 ([130]Figures 5B,C). The relationship of
methylation risk score, survival status and methylation levels of the
six signature CpG loci was shown in [131]Figures 5D–F. Univariate Cox
analysis indicated that age, stage, T stage, N stage and risk score
were significantly associated with OS. However, when they were
introduced into multivariate Cox analysis, only age [hazard ratio,
1.026 (95% CI, 1.008–1.045), p = 0.005], and risk score [hazard ratio,
2.823 (95% CI, 2.131–3.741), p < 0.001] remained as independent
prognostic predictors ([132]Figures 5G,H). Similar prognostic
significance was observed in the validation cohort (27k array dataset),
with AUC values of 0.603 and 0.657 for 3 and 5 years, respectively
([133]Supplementary Figures S2B–D).
FIGURE 5.
[134]FIGURE 5
[135]Open in a new tab
Methylation prognostic model assessment in 560 breast cancer samples.
(A) Survival analysis between the high-risk and low-risk groups. (B,C)
The time-dependent receiver operating characteristic (ROC) curves at 3
and 5 years. (D) The risk score distribution. (E) Survival status
scatter plots. (F) Heatmap of the six signature CpG loci. (G)
Univariate Cox regression analysis. (H) Multivariate Cox regression
analysis.
Identification of Differentially Expressed Genes From the Methylation
High-Risk and Low-Risk Groups
Using the thresholds of adjusted p < 0.05 and | log2 (fold change) |
>1, a total of 413 differentially expressed genes (DEGs) between
methylation high-risk and low-risk groups were obtained. The volcano
and heatmap visually displayed the DEGs ([136]Figures 6A,B). To further
investigate the biological characteristics of the DEGs, function and
pathway annotations were performed. GO analysis indicated that these
genes were involved in the regulation of cell cycle processes and
mitotic cell cycle phase transition. KEGG analysis showed that these
DEGs were mainly enriched in p53 and TGF-β signaling pathways
([137]Figures 6C–F).
FIGURE 6.
[138]FIGURE 6
[139]Open in a new tab
(A) Volcano plot of differentially expressed genes (DEGs); (B) Heatmap
of the DEGs; (C–F) GO and KEGG enrichment analysis for DEGs. KEGG
pathway (C), BP (D), CC (E), and MF (F).
Construction and Evaluation of Gene Prognostic Signature
To explore the prognostic values of the identified 413 DEGs, we
extracted the expression of these DEGs from the transcriptomic profiles
of 986 BC patients in the TCGA-BRCA database for subsequent analysis.
Firstly, 50 DEGs significantly related with OS were selected by
Kaplan-Meier analysis (p < 0.05). All of these prognostic genes were
subjected to univariate Cox regression analysis and 13 of them with p <
0.05 were further identified as prognosis-associated genes
([140]Supplementary Table S1). Ultimately, six prognosis-associated
genes, namely IRF2, KCNJ11, ZDHHC9, LRP11, PCMT1, and TMEM70, were
included in developing a gene prognostic signature by multivariate Cox
regression analysis. Among them, four genes (ZDHHC9, LRP11, PCMT1,
TMEM70) were related with poor survival and two genes (IRF2, KCNJ11)
were associated with good survival ([141]Supplementary Figures S3). The
risk formula was as follows:
Risk score = −(0.06128 × IRF2) − (0.0342 × KCNJ11) + (0.0228 × ZDHHC9)
+ (0.01257 × LRP11) + (0.01082 × PCMT1) + (0.02917 × TMEM70).
Gene expression analysis of the six signature genes in the Oncomine
database ([142]https://www.oncomine.org) revealed that ZDHHC9, LRP11,
PCMT1, TMEM70 were highly expressed in breast cancer, and IRF2, KCNJ11
were highly expressed in normal tissues ([143]Figure 7A). Then we
explored the protein levels of these six genes between breast cancer
and normal tissues in the Human Protein Atlas database ([144]Uhlen et
al., 2010) (HPA:
[145]https://www.proteinatlas.org/humanproteome/pathology). In
accordance with the gene expression levels, the protein levels of
ZDHHC9, PCMT1, TMEM70 were significantly higher in breast cancer, and
the protein levels of IRF2, KCNJ11 were higher in normal tissues
([146]Figure 7B). Moreover, we further checked the prognostic values of
our six genes in the public database TCGA portal (version 1.0)
([147]http://tcgaportal.org/TCGA/Breast_TCGA_BRCA/process.php), and we
found that ZDHHC9, LRP11, PCMT1, TMEM70 were associated with poor
prognosis, while IRF2 and KCNJ11 were related with good prognosis of BC
patients ([148]Figure 7C).
FIGURE 7.
[149]FIGURE 7
[150]Open in a new tab
Validation of the six signature genes. (A) The expression of the six
signature genes in breast cancer and normal tissue in Oncomine
database. (B) The protein expression levels of the six prognostic genes
in Human Protein Atlas database. (C) Survival analysis of the six
prognostic genes based on TCGA portal.
Association of Methylation Prognostic Model and Methylation-Based Gene
Prognostic Model
Perhaps not surprisingly, a positive and significant correlation was
observed between the two prognostic signatures, which was mainly
reflected at the following aspects: firstly, a moderate correlation was
found between the six signature CpG loci and six signature genes.
Specifically, the expression of IFR2 was positively related with the
methylation of cg00945507, cg05406101, cg14992108, and cg18678121,
above of which were all good prognostic factors, while IFR2 expression
was negatively related with the methylation of poor prognostic CpG
locus: cg13060154; And KCNJ11 expression was positively correlated with
the methylation of favorable prognostic CpG loci: cg05406101 and
cg10092957, and negatively related with the methylation of cg13060154;
To the opposite of these two genes, ZDHHC9 and TMEM70 expression were
negatively related with the methylation of cg05406101 and cg18678121,
and positively associated with the methylation of cg13060154;
Similarly, the expression of PCMT1 was negatively correlated with the
methylation of cg05406101, cg14992108, cg18678121; And LRP11 expression
was negatively related with the methylation of cg00945507, cg05406101,
cg18678121 ([151]Supplementary Figures S4A, [152]Supplementary Figures
S4I).
Subsequently, we examined the correlation between the six CpG loci and
the gene risk score, and we observed that except for cg13060154 having
the trend being positively correlated with the risk score, cg05406101
(R2 = −0.34, p < 0.001), cg18678121 (R2 = −0.28, p < 0.001), cg14992108
(R2 = −0.26, p < 0.001), cg00945507 (R2 = −0.26, p < 0.001), and
cg10092957 (R2 = −0.11, p < 0.01) were all negatively correlated with
the risk score ([153]Supplementary Figures S4B–G, [154]Supplementary
Figures S4I ). Besides, we also explored the relationship between
methylation risk score and gene risk score. Interestingly, we found
that these two established risk scores were positively correlated with
each other (R2 = 0.34, p < 0.001) ([155]Supplementary Figures S4H,I).
Evaluation and Validation of the Gene Prognostic Signature
The expression of the six signature genes between the gene high-risk
and low-risk groups was shown in [156]Supplementary Figures S5A.
Kaplan-Meier analysis showed that survival probability in the low-risk
group was higher ([157]Figure 8A). The AUC values of 3-years and
5-years OS were 0.725 and 0.715 ([158]Figures 8B,C). The risk score
distribution, the survival status, and the expression of the six genes
of 986 BC patients were visualized in [159]Figures 8D–F. Univariate and
multivariate Cox regression analyses indicated that the risk score was
associated with OS and could be an independent prognostic predictor,
with univariate hazard ratio, 1.187 (95% CI, 1.082–1.302, p < 0.001),
multivariate hazard ratio, 1.199 (95% CI, 1.094–1.314), p < 0.001,
respectively ([160]Figures 8G,H).
FIGURE 8.
[161]FIGURE 8
[162]Open in a new tab
Gene prognostic signature assessment of 986 breast cancer samples. (A)
Survival analysis between the high-risk and low-risk groups. (B,C) The
time-dependent receiver operating characteristic (ROC) curves at 3 and
5 years. (D) The risk score distribution. (E) Survival status scatter
plot. (F) Heatmap of the six prognostic genes. (G) Univariate Cox
regression analysis. (H) Multivariate Cox regression analysis.
To confirm the prognostic value of the six-gene signature, we tested it
with the validation subgroup comprising of 557 BC patients. The results
were consistent with our previous findings. Specifically, the OS in the
high-risk group was poorer ([163]Supplementary Figures S5B), and the
AUC values of 3 and 5 years were 0.735 and 0.696 ([164]Supplementary
Figures S5C). Univariate and multivariate COX regression analysis also
showed that the risk score was an independent prognostic predictor of
BC ([165]Supplementary Figures S5D,E). Subsequently, the 986 BC
patients’ dataset was randomly assigned into two test subgroups [test
group one (n = 494) and test group two (n = 492)] which were with
balanced baseline characteristics ([166]Table 4), and both of them were
used for validating the gene signature. The two subgroups could also
distinguish the favorable OS patients from the poor OS patients
([167]Supplementary Figures S5F,G). AUC values of 1 year, 3 years,
5 years were 0.711, 0.757, 0.721 in test group one, and 0.864, 0.733,
0.702 in test group two, respectively ([168]Supplementary Figures
S5I,J). Moreover, the external dataset [169]GSE146558 further confirmed
our gene prognostic signature. In high-risk group of [170]GSE146558
dataset, patients were with poorer OS ([171]Supplementary Figures S5H).
And AUC value of 3 years was 0.634 ([172]Supplementary Figures S5K).
TABLE 4.
Clinical characteristics of the two validation subgroups from the
dataset of 986 samples.
Covariates Type Total Test group one Test group two p Value
Gender Female 986 (100%) 492 (100%) 494 (100%) 0.9492
Age ≤65 716 (72.62%) 358 (72.76%) 358 (72.47%) 0.9742
— >65 270 (27.38%) 134 (27.24%) 136 (27.53%) —
Pathologic stage[173] ^a Stage I-II 727 (73.73%) 355 (72.15%) 372
(75.3%) 0.1261
— Stage III-IV 239 (24.24%) 131 (26.63%) 108 (21.86%) —
— Unknow 20 (2.03%) 6 (1.22%) 14 (2.83%) —
T stage[174] ^a T1-2 829 (84.08%) 404 (82.11%) 425 (86.03%) 0.1253
— T3-4 154 (15.62%) 86 (17.48%) 68 (13.77%) —
— unknow 3 (0.3%) 2 (0.41%) 1 (0.2%) —
M stage[175] ^a M0 811 (82.25%) 406 (82.52%) 405 (81.98%) 0.2694
— M1 20 (2.03%) 7 (1.42%) 13 (2.63%) —
— Unknow 155 (15.72%) 79 (16.06%) 76 (15.38%) —
N stage[176] ^a N0 451 (45.74%) 215 (43.7%) 236 (47.77%) 0.1875
— N1-3 518 (52.54%) 270 (54.88%) 248 (50.2%) —
— Unknow 17 (1.72%) 7 (1.42%) 10 (2.02%) —
[177]Open in a new tab
^a
Staging according to Seventh Edition AJCC, Guidelines (Edge SB, Byrd
DR, Compton CC, Fritz AG, Greene FL, Trotti A, eds. AJCC, Cancer
Staging Manual. Seventh ed New York, NY: springer; 2010).
In addition, we confirmed the prognostic value of the six-gene
prognostic signature in the subgroups of BC patients presented with
different clinical features (age (<65 and≥65), T staging (T1-2 and
T3-4), N staging [N0 and N1-3) and stage (stage I-II and stage III-IV)]
([178]Supplementary Figure S6).
Considering the important roles of BRCA1, BRCA2, CDH1, PTEN, TP53,
PIK3CA in BC, we also evaluated these gene expression between gene
high-risk and low-risk groups, and observed that the expression of
oncogenes such as BRCA1, BRCA2 and CDH1 were significantly higher in
high-risk group. On the other hand, the expression of the tumor
suppressor gene PTEN was significantly higher in low-risk group
([179]Figure 9).
FIGURE 9.
[180]FIGURE 9
[181]Open in a new tab
The differential expression of breast cancer-associated genes in gene
high-risk and low-risk groups.
Gene Set Enrichment Analysis
To investigate potential functions and signaling pathways related to
the six-prognosis signature, we performed Gene Set Enrichment Analysis
(GSEA: [182]http://www.gsea-msigdb.org/gsea/index.jsp). Notably, we
found that more tumor-related GO terms and KEGG pathways were
associated with low-risk group ([183]Figure 10A). In detail, low-risk
group was mainly associated with the function of regulating epithelial
and endothelial cell migration, and high-risk group was related with
nuclear chromosome condensing, protein folding. Pathway enrichment
analysis indicated that JAK/STAT signaling pathway, cell adhesion
molecule signaling pathway, VEGF signaling pathway, and MAPK signaling
pathway were active in the low-risk group. On the other hand, P53
signaling pathway was active in the high-risk group ([184]Figure 10B).
FIGURE 10.
[185]FIGURE 10
[186]Open in a new tab
GSEA for the gene prognostic signature. (A) The significant enrichment
of the top 5 tumor-related GO terms in high-risk group and low-risk
group. (B) The significant enrichment of the top 5 tumor-related
pathways in high-risk group and low-risk group.
Correlation Analysis of the Six Signature Genes and the Sensitivity of
Anti-Tumor Drugs
Correlation analysis between the expression of the six prognosis genes
and the sensitivity of anti-tumor drugs was performed based on the
CellMiner database ([187]https://discover.nci.nih.gov/cellminer/), and
the results indicated that our signature genes were moderately
correlated with the response of some common anti-tumor drugs such as
PARP inhibitor (Olaparlib), chemotherapy drugs (Fluorouracil,
Decitabine, Oxaliplatin), which might imply potential value in
anti-tumor therapy ([188]Figure 11).
FIGURE 11.
[189]FIGURE 11
[190]Open in a new tab
Correlation of the expression of the six prognostic genes and the
sensitivity of anti-tumor drugs.
Discussion
With the advent of next-generation sequencing, genome-wide DNA
methylation profile analysis has become possible. Multiple studies have
suggested that DNA methylation plays an important role in early
detection, improved molecular classification, prognosis prediction of
BC. Moreover, numerous studies have demonstrated that DNA methylation
could regulate immune-related gene expression, thereby affecting the
response of anti-tumor immunotherapy and BC patients’ prognosis. For
examples, increasing researches have reported that the expression of
immune genes such as CD3D, CD6, and HLA-A was found to be negatively
correlated with DNA methylation, and was related with a better
prognosis in BC ([191]Győrffy et al., 2016). Potential targets for
immunotherapy are still being explored. Recent studies have shown that
immune cell infiltration might be a biomarker for immunotherapy.
Importantly, the methylation of immune genes could also highly
sensitively reflect the presence of tumor infiltrating lymphocytes.
Thus, DNA methylation profiles could be used to predict the proportion
of all kinds of immune cells in the tumor microenvironment
([192]Győrffy et al., 2016), ([193]Jeschke et al., 2015). Given the
important role of DNA methylation, it is not surprising that a better
understanding of the DNA methylation and the exploration of the
interaction mechanism between genes and methylation are crucial for BC
patients.
DNA methylation has a substantial impact on gene expression, and
affects the prognosis of different subtypes of BC patients (44). In
this study, we obtained six prognosis-related CpG loci, cg00945507,
cg05406101, cg10092957, cg14992108, cg18678121, cg13060154,
respectively targeting SEC61G, RWDD2B, NCCRP1, SNTB1, SEC61A2, DAB2IP
genes. SEC61G was found to be overexpressed in BC and might co-amplify
with epidermal growth factor receptor (EGFR) ([194]Reis-Filho et al.,
2006). Lu et al. reported that the expression of SEC61G in BC was
negatively correlated with its promoter methylation ([195]Lu et al.,
2021). In our research, similar trend could be found that the
expression of SEC61G was negatively related with the methylation of
cg00945507. Moreover, the methylation level of SEC61G was positively
correlated with the prognosis of patients with glioma ([196]Liu et al.,
2019a). Miwa et al. proved that NCCRP1 transcription was inhibited by
promoter hypermethylation in esophageal squamous cell carcinoma
([197]Miwa et al., 2017). And high expression of NCCRP1 in patients
with pancreatic cancer was associated with a poor prognosis ([198]Zuo
et al., 2020). In our study, we observed that the expression of NCCRP1
was negatively correlated with the methylation of cg10092957. DAB2IP
was a candidate tumor suppressor gene and its expression
down-regulation mechanism was mainly through the promoter
hypermethylation ([199]Qiu et al., 2007). Demethylation of DAB2IP gene
weakened the EMT process and suppressed hepatocellular carcinoma growth
([200]Liu et al., 2019b). However, we observed a weak positive
correlation between DAB2IP expression and cg13060154 methylation in our
study. Regrettably, studies on the correlation between the expression
of SEC61A2, RWDD2B, SNTB1 and DNA methylation in tumors were
insufficient.
On the bias of the six prognostic CpG loci, we developed a methylation
risk model that could accurately classify BC patients with different
death risk. Subsequently, we identified 413 DEGs from the methylation
high-risk and low-risk groups. Function enrichment analysis indicated
that these DEGs were related with cell cycle checkpoint, ubiquitin-like
protein ligase binding. KEGG pathway analysis showed these genes were
mainly enriched in p53 signaling pathway, and TGF-β signaling pathway.
The above functions and pathways were common and critical in tumor
proliferation, invasion, and metastasis. And then, we further extracted
the expression of the 413 DEGs from the transcriptomic profiles of 986
BC patients to evaluate the prognostic roles of these DEGs in TCGA-BRCA
dataset. The prognostic value of individual genes or gene signatures
has been extensively studied in cancers ([201]Parker et al., 2009).
Herein, we got a methylation-based gene prognostic signature using
multivariate Cox analysis.
The six signature genes were composed of IRF2, ZDHHC9, KCNJ11, LRP11,
PCMT1, and TMEM70. IRF2, a transcription factor in the interferon gamma
signal transduction pathway, was different expression in breast cancer
and normal tissues. Kriegsman et al. found that IRF2, which positively
regulated the MHC class I pathway and negatively regulated PD-L1
expression, had good implications for immunotherapy and prognosis of BC
([202]Kriegsman et al., 2019). ZDHHC9, one of risk genes of BC, was
found to participate in palmitoylating PD-L1 to keep its protein
stability, leading to immune escape. Inhibiting the ZDHHC9 expression
made breast cancer cells susceptible to T cell killing and inhibited
tumor growth. Thus, ZDHHC9 could be a biomarker of immunotherapy
response ([203]Yang et al., 2019). KCNJ11 played a key role in
glucose-stimulated insulin secretion ([204]Cook and Hales, 1984). It is
well established that diabetes is closely related to a variety of
tumors ([205]Giovannucci et al., 2010), and the mortality is higher
among women with longer diabetes duration in BC ([206]Lega et al.,
2018). Therefore, diabetes-related genes KCNJ11 may also be a potential
prognostic biomarker of BC. Yet, the relationship between KCNJ11 and
breast cancer has not been systematically reported. PCMT1 has gradually
been considered as a risk gene in tumors. Study demonstrated that BC
patients with higher PCMT1 expression had a poorer prognosis ([207]Dong
et al., 2021). Furthermore, the different expression of the six
signature genes in mRNA and protein levels was validated in public
databases.
The methylation-based gene signature could also distinguish BC patients
with a significantly increased risk from those with a decreased risk.
Moreover, correlation analysis showed that the methylation of the six
signature CpGs were closely correlated with the expression of the six
signature genes, and the established gene risk score was significantly
positively correlated with the methylation risk score.
Multigene analysis has been popularized to predict the response of
anti-tumor therapy and prognosis in BC. For instance, the EndoPredict
score (12-gene molecular signature) has been used to predict the
survival without distant recurrence up to 15 years after diagnosis.
Recently, the 12-gene MS has also been proven to predict the response
to neoadjuvant chemotherapy (NaCT) and neoendocrine therapy (NET) in
HR+, her2- BC patients, with AUC values being 0.736 for NaCT, 0.726 for
NET ([208]Dubsky et al., 2020). Other widely used multigene assays
involve Oncotype Dx, MammaPrint, and PAM50 which have been validated to
predict the treatment response, recurrence, prognosis in BC patients.
Among these multi-gene tests, MammaPrint has the best predictive
performance (AUC = 0.88), following by Oncotype Dx (AUC = 0.76), PAM50
risk of relapse based on subtype (ROR-S) (AUC = 0.68) and the PAM50
risk of relapse based on subtype and proliferation (ROR-P) (AUC = 0.55)
([209]Grimm and Mazurowski, 2020). Our study developed methylation and
methylation-based prognostic signatures, both of which had excellent
performance in predicting the prognosis of BC patients with 3-years,
5-years AUC values being 0.739, 0.744 for methylation signature, and
0.725, 0.715 for methylation-based gene signature. Three TCGA-BRCA
subgroups were used to validate the gene prognostic signature and all
of them showed powerful prediction effects with 3-years AUC values of
0.757, 0.735, 0.733, respectively. Moreover, the external dataset
[210]GSE146558 was also used to validate our gene prognostic signature.
Due to small sample size (n = 106) and inter-dataset heterogeneity, we
could not obtain a higher AUC value in the validation set, although the
3-years AUC value being 0.634 was still statistically significant.
Increasing researches reported that CDH1, BRCA2, and BRCA1 were
susceptibility genes for BC ([211]Petridis et al., 2019) and around
60–70% of women with BRCA1 or BRCA2 gene mutations would be suffered
with BC in her lifetime ([212]Antoniou et al., 2003). Besides, BRCA2
mutation carriers were more likely to develop brain metastases than
non-carriers ([213]Song et al., 2020). PTEN is a tumor suppressor gene
in BC, and researches proved that lack or decrease of PTEN expression
might be associated with poor prognosis in BC ([214]Luen et al., 2018).
Then we examined the expression alteration of these genes in gene
high-risk and low-risk groups to understand the contribution of the
gene signature to the carcinogenesis of BC. As expected, the expression
of proto-oncogenes BRCA1, BRCA2, and CDH1 was significantly higher in
high-risk group. Conversely, expression of tumor suppressor gene PTEN
was significantly higher in low-risk group.
There were some advantages of our study. First of all, we were the
first one to discuss the prognostic roles of CpG loci in breast cancer,
and constructed methylation-associated signatures. Secondly, these two
prognostic signatures were positively correlated with each other and
both of them could accurately discriminate breast cancer patients with
different death risk. Besides, three subgroups of TCGA dataset and an
external dataset [215]GSE146558 were verified the prognostic value of
our gene signature. Finally, the above results, together with risk gene
expression verification, GSEA, drug sensitivity analysis, might provide
novel treatment and prognosis biomarkers for breast cancer patients. We
believe with the advent of the era of precision medicine, clinical
trials could be designed using gene signature-based risk scores to
select the patients most likely to develop poor prognosis in which to
develop novel or more intensive postoperative therapies in future.
One major limitation for our study was that data in our study was
downloaded from the public databases, the mechanism of the six
signature genes and six signature CpGs affecting the occurrence and
development of breast cancer still needs to be further verified by vivo
and vitro experiments. Even, prospective clinical trials are needed to
check the prognostic values of these two signatures.
Conclusion
Taken together, we proposed two methylation-related prognostic
signatures. These two signatures were significantly positively
correlated with each other and both of them could predict the prognosis
of BC patients more accurately than traditional clinical predictors.
Importantly, the six key genes (IRF2, ZDHHC9, KCNJ11, LRP11, PCMT1,
TMEM70) of gene prognostic signature may act as potential prognostic
biomarkers and therapeutic targets.
Data Availability Statement
The original contributions presented in the study are included in the
article/[216]Supplementary Material, further inquiries can be directed
to the corresponding authors.
Ethics Statement
Ethical review and approval was not required for the study on human
participants in accordance with the local legislation and institutional
requirements. Written informed consent for participation was not
required for this study in accordance with the national legislation and
the institutional requirements.
Author Contributions
Conceptualization, CZ, QW, and YZ; methodology, CZ, QW, and YZ;
Software, CZ and QW; Validation, CZ, ZZ, SZ, and DL; formal analysis,
CZ, and QW; investigation, CZ, QW, and SZ; resources, CZ, QW, and DL;
data curation, CZ and QW.; Visualization, CZ, QW, and NY.;
Writing—Original Draft Preparation, CZ, QW, and YZ; Writing—Review and
Editing, CZ, QW, and YZ; Supervision, CZ, QW, and YZ; Project
Administration, CZ, QW and YZ; All authors have read and agreed to the
published version of the manuscript.
Funding
This study was supported by a grant from the Leading Discipline
Construction Project of Oncology of Zhongnan Hospital of Wuhan
University, a grant from the Science, Technology and Innovation Seed
Fund of Zhongnan Hospital of Wuhan University (Grant No. znpy2018123),
and a grant from the National Natural Science Foundation of China
(81472799).
Conflict of Interest
The authors declare that the research was conducted in the absence of
any commercial or financial relationships that could be construed as a
potential conflict of interest.
Publisher’s Note
All claims expressed in this article are solely those of the authors
and do not necessarily represent those of their affiliated
organizations, or those of the publisher, the editors and the
reviewers. Any product that may be evaluated in this article, or claim
that may be made by its manufacturer, is not guaranteed or endorsed by
the publisher.
Supplementary Material
The Supplementary Material for this article can be found online at:
[217]https://www.frontiersin.org/articles/10.3389/fgene.2021.742578/ful
l#supplementary-material
Supplementary Figure S1
(A–F) The correlation between the six signature CpGs and their target
genes. (A) DAB2IP, (B) NCCRP1, (C) RWDD2B, (D) SEC61A2, (E) SEC61G, (F)
SNTB1; (G–L) Differential expression of target genes in the
hypermethylation and hypomethylation of the six signature CpGs. (G)
DAB2IP, (H) NCCRP1, (I) RWDD2B, (J) SEC61A2, (K) SEC61G, (L) SNTB1;
(M–R) The methylated levels of the CpGs of the target genes in breast
cancer and normal tissue in MethSurv. (M) cg13060154, (N) cg14992108,
(O) cg18678121, (P) cg10092957, (Q) cg05406101, (R) cg00945507; (S–X)
The survival analysis of the high- and low-methylation levels of the
CpGs of target genes in MethSurv. (S) SEC61A2, (T) RWDD2B, (U) DAB2IP,
(V) SEC61G, (W) NCCRP1, (X) SNTB1.
Supplementary Figure S2
(A) The methylated levels of the six signature CpGs between methylation
high-risk and low-risk groups of 560 breast cancer samples. (B–D)
Methylation prognostic model assessment in the testing group of 278
breast cancer samples. (B) Survival analysis between the high-risk and
low-risk groups. (C,D) The time-dependent receiver operating
characteristic (ROC) curves at 3 and 5 years.
Supplementary Figure S3
Survival analysis of the six signature genes.
Supplementary Figure S4
(A–I) The association of the methylation prognostic model and the gene
prognostic signature. (A) The correlation of the six signature genes
and the six signature CpGs. (B–G) The correlation between the β value
of the six CpGs and the gene risk scores. (H) The correlation between
methylation risk scores and gene risk scores. (I) The correlation of
the methylation prognostic model and gene prognostic model.
Supplementary Figure S5
(A–I) (A) The expression of the 6 signature genes between gene
high-risk and low-risk groups of 986 breast cancer samples; (B–E)
Validation of the gene prognostic signature in a BRCA-TCGA subgroups
composed of 557 samples, (B) Survival analysis, (C) ROC curve, (D)
Univariate Cox regression analysis, (e) Multivariate Cox regression
analysis; (F–H) Survival analysis in the other two BRCA-TCGA subgroups
and the [218]GSE146558 dataset, (F) survival analysis in the subgroups
composed of 494 samples, (G) Survival analysis in the subgroups
composed of 492 samples, (H) Survival analysis in the [219]GSE146558
dataset; (I–K) ROC curves in the other two BRCA-TCGA subgroups and the
[220]GSE146558 dataset, (I) ROC curve in the subgroups composed of 494
samples, (J) ROC curve the subgroups composed of 492 samples, (K) ROC
curve in the [221]GSE146558 dataset.
Supplementary Figure S6
The overall survival analysis of breast cancer patients with different
clinical characteristics in gene high-risk and low-risk groups.
[222]Click here for additional data file.^ (38MB, zip)
References