Abstract
Ovarian cancer (OV) is a common type of carcinoma in females. Many
studies have reported that ferroptosis is associated with the prognosis
of OV patients. However, the mechanism by which this occurs is not well
understood. We utilized Genotype-Tissue Expression (GTEx) and The
Cancer Genome Atlas (TCGA) to identify ferroptosis-related genes in OV.
In the present study, we applied Cox regression analysis to select hub
genes and used the least absolute shrinkage and selection operator to
construct a prognosis prediction model with mRNA expression profiles
and clinical data from TCGA. A series of analyses for this signature
was performed in TCGA. We then verified the identified signature using
International Cancer Genome Consortium (ICGC) data. After a series of
analyses, we identified six hub genes (DNAJB6, RB1, VIMP/ SELENOS,
STEAP3, BACH1, and ALOX12) that were then used to construct a model
using a training data set. The model was then tested using a validation
data set and was found to have high sensitivity and specificity. The
identified ferroptosis-related hub genes might play a critical role in
the mechanism of OV development. The gene signature we identified may
be useful for future clinical applications.
Subject terms: Cancer, Biomarkers, Diseases, Oncology, Risk factors
Introduction
Ovarian cancer is a severe threat to the health of females and is one
of the gynecological cancers that can be fatal^[32]1. In some developed
countries, such as the United States, it has been the top 5 causes of
cancer-related death for females^[33]2. There are four main kinds of
ovarian cancers, accounting for almost all advanced stage cases:
ovarian serous cystadenocarcinoma (OV), peritoneal carcinoma,
carcinosarcoma, and mixed carcinoma^[34]3. OV is the most common in
clinical work and has been well-documented interpatiently^[35]4,
intertumorally^[36]5, and intratumorally^[37]6. However, the mechanism
is not well understood. The early symptoms of OV are latent, and the
early diagnostic methods of OV are not mature, which leads to the fact
that most OV is diagnosed at an advanced stage^[38]7. Due to the high
heterogeneity of OV, the prognostic prediction seems challenging.
Furthermore, there are huge differences between early-stage OV and
advanced-stage OV in the treatment efficacy and prognosis^[39]6.
Therefore, it is urgent to develop prognostic models.
Ferroptosis is a unique form of regulated cell death associated with
iron metabolism^[40]8. The lethal accumulation of lipid peroxidation
accelerates ferroptosis. Furthermore, Liang et al.^[41]9 reported that
ferroptosis is involved in cancer apoptosis, which provides a potential
therapeutic target for the treatment of malignancies.
Ferroptosis-related genes can be divided into the following three
groups: drivers, suppressors, and markers. Previous studies have
reported that ferroptosis plays a crucial role in OV tumor-initiating
cells in vivo^[42]10. Recent studies have revealed that some
ferroptosis-related genes, including TFR1, IL6, and SCD1^[43]11, are
correlated with OV development and apoptosis. However, there are still
no studies about the relationship between ferroptosis-related genes and
OV patient prognosis.
In this study, we used the mRNA expression data and clinical OV patient
data from TCGA and IGCG. Moreover, we identified ferroptosis-related
differentially expressed genes (DEGs) by comparing OV mRNA expression
and regular ovarian tissue expression from GTEx. Then, we constructed a
prognostic DEG signature with TCGA data and verified the multigene
signature in an IGCG Australian OV patient cohort. After constructing
the gene signature, we tested the model with Cox analysis to predict OV
patient prognosis. Finally, we performed GO (Gene Ontology) enrichment
analysis of the OV patient high-risk subgroup to explore the potential
ferroptosis-related gene-associated pathways in OV.
Methods
Data collection
The mRNA expression data and corresponding clinical information of 379
OV patients were downloaded from TCGA
([44]https://portal.gdc.cancer.gov/repository) on September 10, 2020.
The mRNA expression data and corresponding clinical data of 88 normal
ovarian tissues were downloaded from GTEx ([45]https://gtexportal.org/)
on September 10, 2020. We applied the normalization strategies offered
in the "limma" R packages
([46]https://bioconductor.org/packages/limma/)^[47]12 to the gene
expression of the two different databases. The RNA-seq profiles and
clinical data were acquired from the IGCG Australian OV patient cohort
([48]https://dcc.icgc.org/releases/current/Projects/OV-AU) on September
15, 2020. We strictly obeyed the guidelines of the three databases.
We screened the FerrDb database ([49]http://www.zhounan.org/ferrdb/)
and obtained 259 ferroptosis-related genes^[50]13. These genes were
grouped and are shown in Table [51]1.
Table 1.
Group of the ferroptosis-related genes.
Data set Gene count
Driver 108
Suppressor 69
Marker 111
[52]Open in a new tab
There 28 genes repeat count more than one group.
Construction and validation of a prognostic ferroptosis-related gene
signature
After data normalization, we used the "limma" R package^[53]12 to
identify the ferroptosis-related DEGs between OV in the TCGA cohort and
normal ovarian tissues in GTEx. We set a false discovery rate
(FDR) < 0.05 as a criterion. For primary screening, we performed a
univariate Cox analysis of overall survival for each
ferroptosis-related DEG. A p value < 0.05 was considered the cutoff for
the genes with prognostic values. To avoid overfitting in the gene
signature, we needed to simplify the signature as much as possible. The
least absolute shrinkage and selection operator (LASSO) regression
could offer a suitable choice for the selection of potential genes. We
carried out LASSO regression with the DEGs, which was subjected to
univariate Cox analysis using the R package "glmnet". The parameter λ
decided the complexity of the model. Λ was defined as the penalty
regularization, which was obtained at minimum partial likelihood
deviance. Each patient's risk score was based on the LASSO analysis
results and the expression of each gene. The risk score was calculated
using the following formula: riskScore =
[MATH:
esum(univariateCoxanalys
is∗correspondingcoef
ficient<
mo stretchy="false">) :MATH]
. According to the median value, the patients were divided into a
high-risk group and a low-risk group according to the signature they
scored. We performed principal component analysis (PCA) and
t-distributed stochastic neighbor embedding (tSNE) with the "stats" R
package for the two groups. Moreover, we drew a series of pictures to
visualize the differences between the two groups. Moreover, we
constructed Kaplan–Meier (KM) survival plots to evaluate the prognostic
value and receiver operating characteristic (ROC) curves with different
time cutoffs to evaluate the predictive efficiency.
Examination of the gene signature in another database
We applied the formula to the patients in the IGCG Australian OV
patient cohort who had a clear outcome indicator five years after
diagnosis. The patients were divided into high-risk and low-risk groups
based on the median risk score. PCA and tSNE were also carried out in
the next step. We performed similar tests to examine whether the model
could determine the significance of the prognostic prediction.
Functional enrichment and immune analysis
We performed Kyoto Encyclopedia of Genes and Genomes
(KEGG)^[54]14–[55]16 pathway enrichment analysis of data from TCGA and
IGCG. Through pathway enrichment analysis, we explored the potential
ferroptosis-related gene-associated pathways in OV development.
Moreover, we calculated the infiltration score of 16 kinds of immune
cells. We performed single-sample gene set enrichment analysis (ssGSEA)
for further study of the 13 immune-related pathways. ssGSEA was
finished with the help of the 'gsva' R package^[56]17.
Verification of the clinical values of the signature
First, we carried out univariate Cox analysis for the risk score and
other critical clinical factors to decide whether the risk score could
be an independent prognostic factor. We took a risk score as a novel
classification indicator for further research and compared several
characteristics between the high-risk and low-risk groups. The above
work was completed with the help of R software. A p value < 0.05
indicated that the difference was statistically significant.
Results
A total of 379 OV patients were from the TCGA-OV cohort, and 82 OV
patients were from the IGCG (OV-AUS) cohort. We downloaded the mRNA
expression data and clinical files from both databases. Some samples
were excluded for kinds of reasons that were not suitable for a further
research.
Identification of prognostic ferroptosis-related DEGs in TCGA cohort
DEGs from the comparison between OV and healthy ovaries accounted for
most of the ferroptosis-related genes (199/259). We performed
univariate Cox regression analysis for the DEGs to select prognostic
ferroptosis-related DEGs. Eighteen genes passed the univariate Cox
regression analysis with the cutoff of p < 0.05 (Fig. [57]1a). Six
genes (DNAJB6, RB1, VIMP/SELENOS, STEAP3, BACH1, and ALOX12) had a
consistent mRNA expression level in tumor samples and prognosis in the
univariate Cox analysis (Fig. [58]1b). The other 12 genes were excluded
from further study. First, we presented the correlation between the six
genes (Fig. [59]1d). Then, we constructed the six preserved genes and
associated gene network with the help of GeneMANIA
([60]http://genemania.org/)^[61]18 in Fig. [62]1c. We performed the KM
plot in the KM plotter
([63]https://kmplot.com/analysis/index.php?p=service&cancer=ovar)^[64]1
9. DNAJB6, RB1, STEAP3, BACH1, and ALOX12 were associated with
prognosis (supplementary figure [65]S1), while VIMP/SELENOS might be a
prognosis indicator in some cases (supplementary figure [66]S2).
Figure 1.
[67]Figure 1
[68]Open in a new tab
screening for the hub genes. (a) The heatmap of differently expressed
genes in the normal and tumor group (b) the univariate Cox analysis of
the genes equipped with a consistent tendency in mRNA expression in
tumor samples and prognosis (c) the network of hubgenes and related
genes with the function enrichment (d) the co-expression network
between hub-genes..
Construction of a six-gene signature in TCGA cohort
The more genes the signature included, the more complex the signature
was. We performed LASSO Cox regression analysis of the six genes. All
the genes were preserved after LASSO (Fig. [69]2b), and we constructed
a six-gene signature based on the ideal λ (Fig. [70]2a). The risk score
=
[MATH: e :MATH]
(0.347*expression level of RB1 + 0.368*expression level of
STEAP3 + 0.712*expression level of BACH1 + 0.401*expression level of
ALOX12-0.685*expression level of DNAJB6-0.421*expression level of VIMP/
SELENOS). According to the median value of the risk score, we divided
the patients into high-risk and low-risk groups (Fig. [71]3a). For
further work, we drew a Kaplan–Meier curve (Fig. [72]3b) and a survival
plot (Fig. [73]3c). Moreover, We listed some baseline information of
patients in high- and low-risk groups (Table [74]2). We found that OV
patients with high-risk scores had a higher death rate and a
significantly worse overall survival (OS) than those with low-risk
scores (supplementary figure [75]S3). As for progression-free survival
(PFS), representing the possible benefits for patients, patients in the
high-risk group still suffered more from disease progression and death
events (supplementary figure [76]S5). The ROC curves showed the
predictive value of the signature. The area under the curve (AUC)
reached 0.602 at three years and 0.710 at five years (Fig. [77]3d),
which indicated that the predictive performance of the signature worked
well. The PCA and t-SNE analyses indicated that the signature could
divide the OV patients into two groups (Fig. [78]3e,f). Further
analysis of TCGA data is shown in the [79]supplementary materials, the
high-risk group was also associated with advanced TNM stage.
Figure 2.
[80]Figure 2
[81]Open in a new tab
LASSO Cox regression analysis. (a) Partial likelihood deviance was
plotted versus log (lambda). The vertical dotted line indicates the
lambda value with the minimum error and the largest lambda value in
which deviance was within one standard error of the minimum. (b) LASSO
coefficient profiles of genes associated with survival of in patients
with Ovarian cancer. LASSO, least absolute shrinkage and selection
operator.
Figure 3.
[82]Figure 3
[83]Open in a new tab
Prognostic analysis of the six-gene signature model in the TCGA cohort.
(a) the distribution and median value of the risk scores in the TCGA
cohort. (b) Kaplan-Meier curves for the OS of patients in the high-risk
group and low-risk group in the TCGA cohort.. (c) the distributions of
OS status, OS and risk score in the TCGA cohort. (d) AUC of
time-dependent ROC curves verified the prognostic performance of the
risk score in the TCGA cohort (e) PCA plot of the TCGA cohort. (f)
t-SNE analysis of the TCGA cohort.
Table 2.
Baseline information of the patients in high- and low-risk groups.
High-risk Low-risk p value
Age Media 59 (30–87) 58 (40–87) 0.6961
Grade G1 1 0 0.294
G2 21 16
G3 145 156
G4 1 0
Stage Stage I 0 1 0.8215
Stage II 14 4
Stage III 127 146
Stage IV 32 23
Tumor residual < 10 mm 110 113 0.8092
10–20 mm 16 8
> 20 mm 30 36
Venous invasion Yes 32 27 0.2727
No 16 16
OS Events 107 87 0.0312
Without events 67 88
PFS Events 109 91 0.0444
Without events 65 84
[84]Open in a new tab
Bold values represents p<0.05 which means there exists significant
differences bewteen the two groups
Validation of the six-gene signature in the IGCG cohort
To ensure the robustness of the signature constructed in TCGA cohort,
we applied the same formula mentioned above (Fig. [85]4). The patients
who had a clear outcome in the five-year follow-up period were divided
into high-risk and low-risk groups based on the median riskScore in the
IGCG cohort (Fig. [86]4a). The PCA and t-SNE analyses were similar to
the result of the TCGA cohort in that the signature could separate the
two groups in opposite directions (Fig. [87]4e,f). Similarly, patients
in the low-risk subgroup shared a better prognosis (Fig. [88]4b,c). The
AUC of the six-gene signature in the IGCG was 0.648 at three years and
0.707 at five years (Fig. [89]4d). The test results mentioned above in
the IGCG cohort revealed that the signature passed the examination.
Figure 4.
[90]Figure 4
[91]Open in a new tab
Validation of the six-gene signature in the ICGC cohort. (a) the
distribution and median value of the risk scores in the ICGC cohort.
(b) Kaplan-Meier curves for the OS of patients in the high-risk group
and low-risk group in the ICGC cohort. (c) the distributions of OS
status, OS and risk score in the ICGC cohort. (d) AUC of time-dependent
ROC curves verified the prognostic performance of the risk score in the
ICGC cohort. (e) PCA plot of the ICGC cohort. (f) t-SNE analysis of the
ICGC cohort.
Independent prognostic and application values of the six-gene signature
We carried out Cox regression analysis among the key variables in the
clinical work and our signature to explore the application value
(Fig. [92]5). In the univariate Cox regression analysis, the risk score
was the only independent prognostic factor in TCGA (HR = 2.914 95% CI:
1.916 to 4.434, adj. p < 0.001) (Fig. [93]5a) and IGCG (HR = 1.323 95%
CI: 1.009 to 1.735, adj. p = 0.043) (Fig. [94]5b). For further
research, we performed multivariate Cox regression analysis for a
series of available variables, including age, grade, and risk score
(Fig. [95]5c). Moreover, we established a nomogram based on the
multivariate Cox regression analysis results to predict the probability
of OV patient survival at 3 and 5 years (Fig. [96]6a). We generated
calibration curves to evaluate the nomogram and obtained an ideal match
(Fig. [97]6b,c), which indicated that the model could be applied in the
clinic.
Figure 5.
[98]Figure 5
[99]Open in a new tab
Results of the univariate Cox regression analyses regarding OS in the
(a) TCGA derivation cohort and (b) the ICGC validation. (c)
multivariate Cox regression analyses regarding OS in the TCGA
derivation cohort.
Figure 6.
[100]Figure 6
[101]Open in a new tab
The application of the gene signature. (a) Nomogram to predict the
survival of patients with OV(b) Calibration curve of the nomogram in 3
years. (c) Calibration curve of the nomogram in 5 years.
Gene enrichment analysis in the OV cohort
To determine the pathways and functions correlated with the risk score,
GO (Gene Ontology) and KEGG analysis was carried out (Table [102]3).
The GO analysis results indicated that the enrichment was associated
with immunity, including "leukocyte cell–cell adhesion", "lymphocyte
differentiation" and "T cell differentiation" (Fig. [103]7). The KEGG
results were in Table [104]2^[105]20–[106]26. And we had already got
permission to use the KEGG software from the Kanehisa laboratory.
Table.3.
KEGG enrichment analysis of the hub genes.
ID Description Gene ratio p value p adjust
hsa04060 Cytokine-cytokine receptor interaction^[107]19 6/27 3.74E−05
0.003406
hsa04810 Regulation of actin cytoskeleton^[108]20 5/27 0.000626
0.027426
hsa04151 PI3K-Akt signaling pathway^[109]21 6/27 0.00092 0.027426
hsa04550 Signaling pathways regulating pluripotency of stem
cells^[110]22 4/27 0.001206 0.027426
hsa05140 Leishmaniasis^[111]23 3/27 0.002068 0.037644
hsa05410 Hypertrophic cardiomyopathy^[112]24 3/27 0.003227 0.048945
hsa05414 Dilated cardiomyopathy^[113]25 3/27 0.003873 0.050348
[114]Open in a new tab
Figure 7.
[115]Figure 7
[116]Open in a new tab
The GO (Gene Ontology) analysis for the differently expressed genes
between high-risk and low-risk subgroups..
To further explore the correlation between the signature and immune
status, we quantified the enrichment scores of common immune cell
subpopulations, related functions, or pathways with ssGSEA
(Fig. [117]8). A series of immune cells were significantly different
between the low-risk subgroup and high-risk subgroup in the TCGA cohort
(adj. p < 0.05), which led to differences in immune functions, such as
the type II IFN response and CCR (chemokine receptor). Treg and type II
IFN responses were validated in the IGCG cohort (adj. p < 0.05),
consistent with the GO analysis above.
Figure 8.
[118]Figure 8
[119]Open in a new tab
Comparison of the ssGSEA scores between different risk groups in the
TCGA cohort (a, b) and ICGC cohort (c, d). The scores of 16 immune
cells (a, c) and 13 immune-related functions (b, d) are displayed in
boxplots. CCR, cytokine-cytokine receptor. Adjusted P values were
showed as: ns, not significant; *, p < 0.05; **, p < 0.01; ***, p <
0.001.
Discussion
Our study systematically researched ferroptosis-related genes in OV
tumor tissues and their influence on prognosis. A novel six-gene
signature was based on LASSO regression analysis, and the signature was
validated with an external test set. We made full use of the value of
the signature in predicting OS by nomogram and the KM-plots showed that
our signature was a reliable factor for OV patients' prognosis both in
OS and disease progression. Functional analyses indicated that the
ferroptosis-related genes in OV were enriched in immune-related
pathways.
Although a few previous studies have reported that several genes are
associated with OV and some could be potential treatment targets, they
did not pay enough attention to the correlation between
ferroptosis-related genes and prognosis. To our surprise, most
ferroptosis-related genes were significantly expressed between OV and
healthy ovaries, which revealed that ferroptosis might play a key role
in OV and the possibility of constructing a predictive signature with
ferroptosis-related genes.
Six genes (DNAJB6, RB1, VIMP/ SELENOS, STEAP3, BACH1, and ALOX12) were
involved in the model; these genes could be roughly divided into three
groups: drivers (DNAJB6, BACH1 and ALOX12), suppressors (RB1) and
markers (VIMP/ SELENOS and STEAP3).
BTB and CNC homology 1 (BACH1) is a ubiquitously expressed
transcription factor. BACH1 plays a vital role in a series of pathways
and biological processes, including oxidative stress, heme oxidation,
the cell cycle, hematopoiesis, and immunity. Han et al.^[120]27
reported that high expression of BACH1 activates p-AKT and promotes
ovarian cancer growth as a transcriptional regulator both in vitro and
in vivo. Peng et al.^[121]28 and Rebbeck et al. ^[122]29 stated that
BACH1 was involved in the BRCA1 damage response related to an increased
risk of OV.
RB1 was reported to be related to defective DNA repair^[123]30. RB1
loss could lead to longer-term survival for OV patients. This verified
the result that high expression of RB1 indicated a poorer prognosis in
our research. Lin et al.^[124]31 declared ALOX12 related to RB1 in a
genome-wide map of humans. Chu et al.^[125]32 reported that ALOX12 is
required for p53-mediated tumor suppression through a distinct
ferroptosis pathway. The study challenged several research articles,
including ours, and concluded that ALOX12 facilitated the development
of tumors^[126]33–[127]35. Similar to RB1, ALOX12 functions in the
ferroptosis pathway through p53. The specific mechanism of RB1, ALOX12,
and p53 in ferroptosis needs more research.
A mammalian relative of DNAJ (DNAJB6) belongs to the 40 families of
heat shock proteins. Zhang et al.^[128]36 found an essential axis in OV
where DNAJB6 is located in a vital position in the functional axis that
is regulated by upstream BRCA1 and that regulated downstream KLF4. As
mentioned above, BACH1 could protect BRCA1, which indicated that BACH1
had indirect impacts on DNAJB6. The potential correlation between BACH1
and DNAJB6 revealed that the miR-9/BRCA1/DNAJB6/KLF4/AKT1 axis might
have a greater impact on the development of OV than we believed.
STEAP3 was closely related to iron homeostasis. Isobe et al.^[129]9
reported that STEAP3 maintains tumor growth under hypoferric
conditions. This gene joins the regulation of iron homeostasis and
inflammatory responses. Men et al.^[130]37 found that VIMP/ SELENOS was
associated with cell death and the cell cycle in insulinoma cells.
However, there have been few studies on these genes in OV. What roles
they play in OV cell development and apoptosis still needs to be
explored.
Although several research studies have declared that ferroptosis plays
a crucial role in development and apoptosis, the mechanisms remain
elusive. According to the DEGs between the high-risk and low-risk
groups, we performed GO enrichment analysis. To our surprise, several
immune-related pathways were enriched. The discovery revealed that
ferroptosis might exert an influence on OV through tumor immunity.
However, few modulations have been reported in the association between
ferroptosis and tumor immunity.
Interestingly, the T cells related to immune cells were significantly
different in the two subgroups. Moreover, the type II IFN response was
the only significantly different immune function in both databases. We
speculated that ferroptosis cells promoted the inhibition of the
activity of effector T cells by regulating Treg cells. Yin et
al.^[131]38 and Magdalena et al.^[132]39 declared that increased
macrophages and Treg cells in tumor tissue would result in a poorer
prognosis in OV patients, consistent with our results. Based on immune
gene enrichment, we need to pay more attention to abnormal T cell
behaviors in OV patients, and immune treatment deserves more research
in future work.
Limitations
The results were obtained from the data expression matrix. They were
not proved by in vivo and clinical studies. Moreover, there might be
several essential genes missed in multiple continuous processes.
Conclusion
Our research constructed a model based on six ferroptosis-related
genes. The signature performed well in predicting the prognosis of OV
patients both in the training set and test set. The risk score could be
an independent factor associated with OS. The potential mechanisms
between ferroptosis and tumor immunity in OV are still unclear, and T
cell-related immunity changes in OV deserve more investigation.
Supplementary Information
[133]Supplementary Information.^ (2.1MB, doc)
Acknowledgements