Abstract
Aberrant long noncoding RNA (lncRNA) expression and fatty acid
signaling dysfunction both contribute to hepatocellular carcinoma (HCC)
occurrence and development. However, the relationship and interaction
mechanism between lncRNAs and fatty acid signaling in HCC remain
unclear. Data regarding RNA expression and clinical outcomes for
patients with HCC were obtained from The Cancer Genome Atlas (TCGA),
HCCDB, and the Gene Expression Omnibus (GEO) databases. Hallmark
pathways were identified using the single-sample gene set enrichment
analysis (ssGSEA) method. ConsensusClusterPlus was used to establish a
consistency matrix for classifying samples into three subtypes. A risk
signature was established, and predictive values for key lncRNAs
related to prognosis were evaluated using Kaplan–Meier analysis and
receiver operating characteristic curves. The ESTIMATE algorithm,
MCP-Counter, and ssGSEA were used to evaluate the characteristics of
the tumor immune microenvironment. The CTRP2.0 and PRISM were used to
analyze drug sensitivity in HCC subtypes. We discovered seven
fatty-acid-associated lncRNAs with predictive prognostic capabilities,
including TRAF3IP2-AS1, SNHG10, [32]AL157392.2, LINC02641,
[33]AL357079.1, [34]AC046134.2, and A1BG-AS. Three subtypes were
obtained, which presented with differences in prognosis, clinical
information, mutation features, pathway traits, immune characteristics,
and drug sensitivity. The seven key lncRNAs identified in this study
might serve as promising biomarkers for predicting prognosis in
patients with HCC, and the three HCC subtypes classified according to
lncRNA expression profiles could improve HCC classification.
Subject terms: Cancer, Biomarkers
Introduction
Among malignant tumor types, liver cancer is one of the most
predominant causes of death, with a global incidence of fatality that
ranges from 600,000 to 800,000 per year^[35]1. Hepatocellular carcinoma
(HCC) is the most commonly encountered type of liver cancer, accounting
for approximately 80–90% of all liver cancer cases^[36]2. Various
therapeutic strategies have emerged for HCC, including traditional
surgery, radiotherapy, chemotherapy, vascular interventional
treatments, and immunotherapy^[37]3. However, the 5-year survival rate
among HCC patients remains unsatisfactory due to a tendency for
patients to be diagnosed after reaching an advanced stage^[38]4.
Therefore, further exploration of the underlying pathogenesis of HCC
and the identification of early diagnostic biomarkers are necessary
steps to improve the currently dismal prognosis.
Long noncoding RNAs (lncRNAs), which are described as RNAs longer than
200 nucleotides in length and do not encode protein product, have
recently attracted a great deal of attention^[39]5. Numerous studies
have revealed that lncRNAs participate in a variety of biological
processes by mediating gene expression^[40]6–[41]8. Abnormal lncRNA
expression impacts the initiation and development of various
diseases^[42]9–[43]11. The lncRNA PKMYT1AR has been shown to sponge
miR-485-5p, promoting PKMYT1 expression and supporting the maintenance
of cancer stem cells during non-small cell lung cancer^[44]12. In HCC
tissues, the lncRNA TSLNC8 expressed in lower levels than in normal
tissues, and the upregulation of TSLNC8 was shown to prevent
proliferation and metastasis in HCC^[45]13, suggesting its potential
for a prognostic predictor.
Increasing evidence has revealed that metabolic dysregulation serves as
a hallmark of tumorigenesis and progression in malignant
tumors^[46]14,[47]15. In particular, lipid metabolic reprogramming is
found to significantly alter in cancer cells. Fatty acid (FA)
accumulation plays crucial roles in membrane synthesis, energy storage,
and the generation of signaling molecules. Aberrant FA oxidation may be
significantly involved in the pathogenic mechanisms of multiple
cancers, including gastric cancer (GC)^[48]13 and breast cancer
(BC)^[49]16. The lncRNA AGAP2-AS1 is thought to modulate FA oxidation
to promote trastuzumab resistance in BC^[50]17,[51]18. In GC,
mesenchymal stem cells decreased the drug sensitivity of cancer cells
due to the effects on FA oxidation mediated by increased lncRNA
MACC1-AS1 expression^[52]19,[53]20. However, the associations between
lncRNAs and FA pathways in HCC remain uncharted.
In our study, gene expression profiles and clinical data for patients
with HCC were obtained from The Cancer Genome Atlas (TCGA), Gene
Expression Omnibus (GEO) and HCCDB databases. We used bioinformatics
tools to identify corresponding FA pathways and detected seven
FA-associated lncRNAs related to HCC prognosis. Samples from two HCC
cohorts were classified into three molecular subtypes based on the
expression patterns of these seven lncRNAs. We further explored the
biological characteristics and clinical significance of these newly
defined molecular subtypes. These findings contribute to the current
understanding of the relationships between lncRNAs and FA signaling in
HCC and provide guidance for significantly prolonging HCC prognosis.
Results
Identification of FA-associated lncRNAs
We adopted the ssGSEA method to calculate hallmark pathway scores for
samples obtained from the TCGA, HCCDB18, and [54]GSE14520 datasets,
followed by univariate Cox analyses to identify significant hallmark
pathways associated with prognosis. The log2 (HR) > 0 represents the
risk factor, and the log2 (HR) < 0 represents the protective factor.
FA-metabolism signaling in the TCGA, [55]GSE14520 datasets, and HCCDB18
showed a significant relationship with prognosis and served as a
protective factor in HCC (Fig. [56]1). The number of lncRNA in HCCDB18
as well as in the [57]GSE14520 dataset was less than 100, so we focused
on TCGA and [58]GSE76427 to identified corresponding lncRNA. Using the
methods described in the “[59]Materials and methods” section, 155
lncRNAs in the TCGA dataset and 663 lncRNAs in the [60]GSE76427
datasets were identified as FA-associated lncRNAs, indicating an
unsatisfactory consistency in the detection of lncRNAs associated with
FA activity between datasets obtained from different platforms. The
intersection of the TCGA and GSE764277 datasets contained 75 lncRNAs
(Supplementary Fig. [61]1A). There are numerous lncRNAs positively
enriched at the top of their respective ordered gene lists
(Fig. [62]2A–I), and we selected top nine lncRNAs to exhibit and
detected nine lnRNAs expression in L02 and Huh7 cell lines. Results
showed that the expression levels of [63]AC012499.1, DRAIC, LINC01625,
AF127577.4, [64]AC068631.1, LINC01124, and [65]AP003498.1 was highly
expressed in Huh7 cell compared to L02 cell (Fig. [66]2J).
Figure 1.
[67]Figure 1
[68]Open in a new tab
Hallmark pathways associated with prognosis (A) Univariate analysis of
hallmark pathways in the TCGA database. (B) Univariate analysis of
hallmark pathways in the [69]GSE14520 database. (C) Univariate analysis
of hallmark pathways in the HCCDB18 database.
Figure 2.
[70]Figure 2
[71]Open in a new tab
Validation of fatty acid-associated long noncoding RNAs (lncRNA)s.
(A–I) Graphical display of the gene set enrichment analysis (GSEA) for
top nine fatty acid-associated lncRNAs in fatty acid pathways
(TCGA-LIHC). (J) Relative expression levels of nine lncRNAs in Huh7 and
L02 cell lines.
Establishment of molecular types based on lncRNA expression
We used 743 lncRNAs from the combined TCGA and [72]GSE76427 datasets to
conduct univariate Cox regression analyses. Seven lncRNAs were
confirmed to significantly correlate with prognosis (Supplementary
Table [73]1), including TRAF3IP2-AS1, SNHG10, [74]AL157392.2,
LINC02641, [75]AL357079.1, [76]AC046134.2, and A1BG-AS. We clustered
liver cancer samples in the TCGA and [77]GSE76427 cohorts using
ConsensusClusterPlus and obtained three molecular subtypes according to
the cumulative distribution function (Fig. [78]3A,B). The prognostic
features of these three subtypes showed significantly different. In the
TCGA dataset, a better prognosis was associated with the C1 subtype
than with the other subtypes, and patients with the C3 subtype had the
shortest overall survival (OS) (Fig. [79]3C). A similar phenomenon was
observed in the [80]GSE76427 cohort (Supplementary Fig. [81]1D). These
results suggested that the three subtypes defined by FA-associated
lncRNA expression are consistent across diverse cohorts.
Figure 3.
[82]Figure 3
[83]Open in a new tab
The fatty acid-associated long noncoding RNA (lncRNA) subtypes in the
TCGA dataset and their clinical feature. (A) Cumulative distribution
function (CDF) curves for the TCGA cohort samples. (B) Heatmap of the
TCGA samples at consensus k = 3. (C) Prognostic overall survival (OS)
curves for the fatty acid-associated lncRNA subtypes in the TCGA
cohort. (D–F) Differences in the distributions of different clinical
features across the molecular subtypes.
Clinical and mutational characteristics of the molecular subtypes
In the TCGA dataset, we compared the distribution of different clinical
features across the three molecular subtypes, which revealed
significant differences in age, tumor grade, and T stage between
subtypes. Compared with C1 subtype, more patients older than 60 years
were observed in C3 subtype (Fig. [84]3D). In addition, we found
patients with C3 subtype might have worse tumor grade and T stage
(Fig. [85]3E,F). The above results implied poorer prognosis for
patients with C3 subtype. We examined the incidence of gene mutations
in each subtype and found that some gene mutations were correlated with
different prognostic outcomes for each subtype. The proportion of TP53
mutations was significantly higher for the C3 subtype, which was
associated with the worst prognosis, compared with the proportions in
the C1 and C2 subtypes. By contrast, the proportion of CTNNB1 mutants
was significantly lower in the C3 subtype than in C1 and C2
(Fig. [86]4A).
Figure 4.
[87]Figure 4
[88]Open in a new tab
Mutational features of the defined molecular subtypes and differences
in the hallmark signaling scores across three subtypes. (A)
Differential somatic mutation analysis across the three molecular
subtypes. (B) Boxplots showing tumor-related ssGSEA pathway scores in
the TCGA dataset.
Hallmark signaling scores for each subtype
We used ssGSEA to score the samples from the TCGA and [89]GSE76427
datasets and compared differences across subtypes. A total of 37 (74%)
of 50 possible hallmark pathways in the TCGA data set were found to
display significant differences across subtypes, including pathways
associated with hypoxia, tumor necrosis factor (TNF)-α signaling via
nuclear factor kappa B (NF-κB), and fatty-acid-metabolism signaling
(Fig. [90]4B). In the [91]GSE76427 dataset, 20 pathways with
significant differences between subtypes were identified out of 50
possible pathways, containing mototic-spindle, xenobiotic-metabolism,
oxidative-phosphorylation, fatty-acid-metabolism, bile-acid- metabolism
signaling (Supplementary Fig. [92]1E). We could recognize that
fatty-acid-metabolism pathway was hallmark signaling in three subtypes
and presented observable difference.
Immunological infiltration associated with each molecular subtype
To further clarify differences in the immune microenvironments
associated with molecular subtypes defined by the expression of
FA-associated lncRNAs, we evaluated the immune cell infiltration of
patients in two HCC cohorts based on gene expression levels. A variety
of immunocyte marker genes were identified in the literature^[93]21,
and ESTIMATE and MCP-Counter software were used to assess the
immunochemical environment. Differences in immune cell distributions
were observed among the molecular subtypes defined according to
FA-associated lncRNA expression. In TCGA cohorts, three subtypes in
Stromal score, Immune score and ESTIMATE score embodied remarkable
differences, and the C3 subtype, which was associated with poor
prognosis, displayed lower scores (Fig. [94]5A). The three molecular
subtypes displayed significant differences in part immune cell types,
such as T cells, CD8 T cell, Cytotoxic lymphocytes and B lineage cells
(Fig. [95]5B). Several immune-related pathways were found to be
downregulated in the C3 subtype, whereas the prognosis associated with
the immunophenotype associated with the C1 subtype was improved,
according to the ESTIMATE score (Fig. [96]5C). In [97]GSE76427 cohorts,
it was observed parallel findings. These results suggest that the
subtypes displayed stable and consistent molecular characteristics
(Supplementary Fig. [98]2).
Figure 5.
[99]Figure 5
[100]Open in a new tab
Immune features across the three subtypes in the TCGA datasets. (A)
Immune microenvironment scores across subtypes (TCGA). (B) Differential
infiltration of immune cells in three subtypes (TCGA). (C) Differences
in scores for immune-related pathways across the three subtypes.
These combined analyses showed that the C3 subtype was associated with
the worst prognosis and displayed the lowest immune score among all
three subtypes. To explore why the C3 subtype had the worst prognosis
and lowest immune scores, we assessed the distribution of immune
checkpoints in all three subtypes. In the TCGA dataset, 18 (38.30%) of
47 immune checkpoints was presented in significant differences between
subtypes, with most presenting high expression in the C3 subtype
associated with poor prognosis (Fig. [101]6A). By contrast, 15 (33.33%)
of the 45 immune checkpoints identified in the [102]GSE76427 dataset
were significantly different, most of which showed downregulated
expression in the C3 subtype (Supplementary Fig. [103]3A).
Figure 6.
[104]Figure 6
[105]Open in a new tab
Immunotherapy differences in molecular subtypes. (A) Differential
distribution of immune checkpoint expression across subtypes from the
TCGA. (B,C) Differences in chemokine and chemokine receptor expression
across subtypes in the TCGA. (D) Differences in TIDE and CAF scores and
survival for TCGA immunotherapy groups.
Chemokines play key roles in tumorigenesis and cancer development,
attracting multiple immune cells to the tumor microenvironment.
Chemokines are thought to assist T cell entry into tumors, influencing
the tumor response to the immune system and therapy. Thus, we analyzed
whether chemokines differentially expressed across these three
subtypes. We calculated differences in chemokine gene expression in the
TCGA cohort, as shown in Fig. [106]6C, which revealed that 18 of 41
chemokines (43.90%) were differentially expressed across subtypes.
Similarly, in the [107]GSE76427 cohort, we identified 10 (25%)
chemokines with significant differences across the three subtypes
(Supplementary Fig. [108]3C). These data suggest that the degree of
immune cell infiltration may vary across different metabolic subtypes,
which may contribute to observed differences in tumor progression and
immunotherapy efficacy. We also compared the expression of chemokine
receptor genes between the different metabolic subtypes. As shown in
Fig. [109]6B, in the TCGA data set, 5 of 18 (27.78%) chemokine receptor
genes showed significant differential expression across metabolic
subtypes. In the [110]GSE76427 dataset, 5 of 17 chemokine receptor
genes displayed significant differential expression across metabolic
subtypes (Supplementary Fig. [111]3B).
TIDE analysis
We used the TIDE to evaluate the potential clinical response to
immunotherapy in the three molecular subtypes defined using the TCGA
dataset. A higher TIDE predictive score indicated a higher probability
of immune escape, suggesting that patients were less likely to benefit
from immunotherapy. As shown in Fig. [112]6D, the TIDE score for the C3
subtype was significantly higher than the TIDE scores for the C1 and C2
subtypes in the TCGA cohort, suggesting a higher likelihood of immune
escape in the C3 subtype and a likely unsatisfactory response to
immunotherapy. Compared with the C3 subtype, the dysfunction scores for
the C1 and C2 subtypes were higher, whereas the exclusion scores of the
C1 and C2 subtypes were lower (Fig. [113]6D). We also observed similar
results when TIDE analysis was applied to the [114]GSE76427 dataset
(Supplementary Fig. [115]3D).
Functional enrichment analysis of the molecular subtypes
In exploration of clinical feature, we could find that significant
difference from C3 and C1 subtype were testified. To verify whether
functional differences exist between our defined molecular subtypes, we
compared the C3 and C1 subtypes using “limma” to identify
differentially expressed genes (DEGs). In the TCGA cohort, 557 DEGs
were identified, including 387 genes that were upregulated in C3 and
170 genes that were downregulated in C3 relative to C1. In the
[116]GSE76427 dataset, 564 DEGs were defined, including 309 genes
upregulated in C3 and 255 genes downregulated in C3 relative to C1. We
conducted KEGG pathway analysis and Gene Ontology (GO) functional
enrichment analysis on identified DEGs via the R package WebGestaltR.
In the GO analysis of the TCGA database, 266 pathways were associated
with biological processes (BP), 28 pathways were associated with
molecular functions (MF), and 60 pathways were associated with cellular
components (CC), and we selected the top 10 pathways for each analysis
(Fig. [117]7A–C). KEGG pathway enrichment analysis identified 13
significantly represented signaling pathways. Partial annotation
results demonstrated that DNA replication, cell cycle, and p53
signaling pathways, as well as tumor and immune-related pathways, were
strongly correlated with the identified DEGs (Fig. [118]7D). In
addition, metabolism-related pathways, including drug metabolism and
retinol metabolism, were significantly enriched.
Figure 7.
[119]Figure 7
[120]Open in a new tab
Functional enrichment of fatty acid-related long noncoding RNAs
(lncRNAs). (A–D) Gene Ontology (GO) and Kyoto Encyclopedia of Genes and
Genomes (KEGG) analysis of differentially expressed genes (DEGs) from
the TCGA dataset. (E) Gene set enrichment analysis (GSEA) of the C1 and
C3 subtypes (TCGA).
The GO functional annotation of DEGs in the [121]GSE76427 dataset
revealed 119 BP pathways, 31 MF pathways, and 13 CC pathways. KEGG
enrichment analysis revealed 21 significant pathways, including several
pathways related to cancer, such as cell cycle and microRNAs.
Metabolism-related pathways, including drug metabolism, metabolism of
xenobiotics by cytochrome P450, and FA elongation, were also
significantly enriched (Supplementary Fig. [122]4A–D).
To further investigate the biological pathways associated with the
different molecular subtypes, we used GSEA for pathway analysis,
comparing the C1 and C3 subtypes using all candidate gene sets included
in the KEGG database. Pathways related to metabolism, such as primary
blue acid biosynthesis and FA metabolism, were significantly enriched
in the C1 subtype relative to the C3 subtype in the TCGA dataset. In
addition, pathways such as homologous recombination and P53 signaling
were associated with the C3 subtype (Fig. [123]7E). The enrichment
results in the [124]GSE76427 dataset were consistent with the results
in the TCGA dataset (Supplementary Fig. [125]4E), and tumor-associated
pathways were remarkably enriched in the C2 subtype.
Comparison with existing molecular subtypes
The six immune infiltration subtypes identified in human tumors include
C1 (wound-healing), C2 (interferon-γ-dominant), C3 (inflammation), C4
(lymphocyte depletion), C5 (immunologically silenced), and C6
(transforming growth factor-β-dominant). Studies have shown that
existing C1, C2, and C6 are associated with poor prognosis. Most LIHC
patients in the TCGA-LIHC database are categorized as presenting with
existing C3 and C4 immune subtypes, whereas existing C5 immune subtype
was not detected in any sample in the HCC TCGA dataset. Survival curve
analysis showed significant differences in OS among these previously
defined subtypes, revealing poor prognosis associated with existing C1,
C2, and C4 (Fig. [126]8C). We further compared the sample distribution
between our molecular subtypes and existing subtypes, which revealed
that our C3 molecular subtype, which was associated with poor
prognosis, contained a higher proportion of immune subtypes associated
with poor prognosis, including existing C1, C2, and C4. Our molecular
subtypes C1 and C2, which were associated with better prognosis and
presented in larger proportions of existing C3 immune subtype
(Fig. [127]8A,B).
Figure 8.
[128]Figure 8
[129]Open in a new tab
Comparison between existing subtypes and our molecular subtypes and
drug sensitivity analysis. (A) Sanki diagram comparing our molecular
subtypes and existing molecular subtypes. (B) Comparison immune subtype
distributions across our three defined molecular subtypes. (C) Survival
curves associated with existing immune subtypes. (D) Kaplan–Meier (KM)
curves for high- and low-risk groups in the TCGA cohort. (E) Drug
sensitivity to chemotherapy in patients from the TCGA dataset based on
the CTRP2.0 database. (F) Drug sensitivity to chemotherapy in patients
from the TCGA dataset based on the PRISM database.
Potential drug therapy analysis
Based on the expression profiles of the seven identified lncRNAs, we
conducted a multivariate Cox analysis using the TCGA dataset to obtain
correlation coefficients. Samples were divided into high- and low-risk
groups relative to the median risk value. We also validated the
expression of these seven lncRNAs in the [130]GSE76427 dataset using
multivariate Cox regression. Prognosis was significantly different
between the high- and low-risk groups for both HCC datasets, indicating
that these seven lncRNAs might be useful for predicting prognosis in
HCC patients. We analyzed the drug sensitivity associated with these
lncRNAs by comparing the high-risk and low-risk groups. Compared with
the low-risk group, we found that the high-risk group was more
sensitive to five drugs in the CTRP2.0 database, including BI 2536,
CAY10618, GSK461364, paclitaxel, and SB 743921 (Fig. [131]8D–F).
Analogously, the high-risk group showed high sensitivity to
epothilone-b, ispinesib, LY2606368, and YM 155 in the PRISM database
(Supplementary Fig. [132]5A–C). These results broaden new perspectives
for drug targets for personalized treatment of HCC.
Discussion
Emerging evidence indicates that lncRNAs function as promoters or
suppressors that affect the occurrence and progression of
cancer^[133]10,[134]22. However, lncRNAs do not act alone, and the
contributions of other signaling factors are indispensable. FA is a
significant component of the lipid metabolic process and can be
synthesized or converted into complex lipid species, participating in
the activation of various signaling cascades. The meditation effects of
lncRNAs in FA signaling pathways have been slowly revealed in diverse
tumor types^[135]23–[136]25. However, there is not much corresponding
study regarding association between FA and lncRNAs in HCC.
We obtained samples from the TCGA, HCCDB18, and [137]GSE14520 databases
and identified significant hallmark pathways in HCC by performing
ssGSEA and Cox regression analyses, revealing the FA pathway as an
important hallmark pathway in HCC. The current investigation confirmed
that long-chain acyl CoA synthetase 4 (ACSL4) participates in FA
pathways to advance the progression of HCC, resulting in abnormal lipid
metabolism^[138]26. We identified 743 lncRNAs associated with FA
signaling in both the TCGA and GSE764277 datasets and selected the top
seven lncRNAs strongly correlated with prognosis, including
TRAF3IP2-AS1, SNHG10, [139]AL157392.2, LINC02641, [140]AL357079.1,
[141]AC046134.2. and A1BG-AS, some of which have previously been
reported to be involved in the pathogenesis of various cancers.
Compared with normal cells, TRAF3IP2-AS1 was found to be expressed at
low levels in response to the upregulation of the NONO-TFE3 fusion
protein in NONO-TFE3 translocation renal cell carcinoma (tRCC). The
downregulation of TRAF3IP2-AS1 resulted in reduced PTEN expression,
facilitating the progression of NONO-TFE3 tRCC by inducing the m6A
modification of PARP1 mRNA. SNHG10 is expressed at higher levels in
prostate cancer tissue than in normal tissue, contributing to a shorter
survival time among patients. Thus, screening for these lncRNAs may
provide additional information regarding the underlying molecular
mechanism involved in HCC development and progression.
We used ConsensusClusterPlus to examine the samples obtained from the
TCGA and GSE764277 cohorts, which defined three molecular subtypes: C1,
C2, and C3. These three subtypes were associated with differential
prognosis. To verify and understand the mechanisms contributing to
prognostic differentiation among these three subtypes, we focused on
differences in mutational features, comparisons with existing molecular
subtypes, and identifying functional enrichment pathways. The
occurrence of TP53 mutations was observed with a higher probability in
our C3 subtype, which is associated with the worst prognosis among the
three subtypes defined in this study. TP53 plays important roles in
cell cycle arrest, apoptosis, metabolism, DNA repair, and resistance to
chemotherapy and has been shown to mutate frequently in malignant
tumors^[142]27, and TP53 mutations are detected in nearly 30% of all BC
cases^[143]28. Our defined C3 subtype was also associated with the
previously defined C1, C2, and C4 immune infiltration subtypes, which
were associated with poor prognosis in patients. The high probability
of TP53 mutations and the presence of existing immune subtypes
associated with poor prognosis support and explain the poor prognosis
observed for our C3 subtype.
The GO and KEGG analyses identified DNA replication and cell cycle
pathways as being strongly associated with DEGs identified in the
comparison between our C3 and C1 subtypes. DNA replication is inactive
in most differentiated cells, and DNA replication proteins are
typically expressed at low levels^[144]29. However, cancer cells are
known to present with highly activated DNA replication processes. In BC
stem-like cells, mini-chromosome maintenance protein 10 (MCM10)
dramatically promotes DNA replication, compensating for DNA replication
stress associated with c-Myc induction^[145]30. The cell cycle
describes the vital biological process that controls the duplication of
genetic materials, cell division, and growth^[146]31. Cell cycle
dysfunction is commonly associated with disease occurrence and
contributes to the rapid proliferation and apoptosis properties
observed in multiple cancer types. The reduced activation of the
transcription factor friend leukemia integration 1 (FLI1) could
negatively affect the expression of cyclin D1 (CCND1) and E2F
transcription factor 2 (E2F2), resulting in the arrest of the cell
cycle at the G1/S phase and facilitating the progression of
non-neoplastic lung cells^[147]32. Therefore, we assume that these DEGs
might participate in tumor-related pathways, including DNA replication
and the cell cycle, contributing to shorter survival times in the C3
subtype.
Immunotherapy has become increasingly popular for the treatment of
numerous diseases; however, conventional immunotherapy is challenging
for liver diseases due to the specific immune response tolerance of the
liver^[148]33. Therefore, we used the ESTIMATE algorithm and
MCP-Counter software to estimate the immune scores of our three defined
subtypes. The immune scores of the C3 subtype were lower than those for
the C1 and C2 subtypes. Immune checkpoint analysis revealed that 38.30%
(TCGA dataset) and 33.33% of immune checkpoints (GSE764277 dataset)
were differentially expressed across the three molecular subtypes.
However, different immune checkpoints were identified as differentially
expressed in the C3 subtype between these two datasets.
Chemokines refer to small cytokines and signaling proteins secreted by
diverse cells that are involved in enhancing the antitumor response of
the tumor microenvironment and have been shown to be beneficial for
cancer patients^[149]34. The upregulation of the chemokines CXCL10 and
CXCR3 have been positively correlated with satisfactory outcomes in
patients with HCC^[150]35. Interferon regulatory factor 1 (IRF-1)
induces HCC apoptosis through CXCL10/CXCR3 signaling^[151]36. To
understand immunotherapy efficiency in our defined C1, C2, and C3
subsets, we also analyzed the expression of chemokines and chemokine
receptor genes and found remarkable differences in expression profiles
across these three subtypes, which might provide clearer guidance for
the application of immune targeted therapy in HCC patients. The seven
identified lncRNAs were used to construct novel predictive prognostic
models for HCC patients.
Our study has some limitations. The validation and elucidation of the
complex mechanisms underlying differences in prognosis between these
three subtypes require additional experiments. The specific
associations between the expression profiles of the identified lncRNAs
and survival time among HCC patients also require further exploration.
In conclusion, we stratified two HCC cohorts according to key lncRNAs
associated with FA signaling pathways and defined C1, C2, and C3
subtypes with significant differences in prognosis. This classification
method has been further verified in the clinical characteristics,
mutation feature, and immune microenviroment, and made a more
comprehensive classification of patients with hepatocellular carcinoma,
which was conducive to the personalized diagnosis and treatment of
patients. The poor prognostic outcomes associated with our C3 subtype
might be robustly associated with alterations in the DNA replication
and cell cycle pathways, indicating the need for further studies
examining the involvement of FA-associated lncRNAs in these processes.
In addition, the seven identified lncRNAs (TRAF3IP2-AS1, SNHG10,
[152]AL157392.2, LINC02641, [153]AL357079.1, [154]AC046134.2, and
A1BG-AS) might guide the further exploration of prognostic biomarkers
in HCC.
Materials and methods
Data collection and data preprocessing
RNA sequence data and clinical information for patients with liver
cancer (LIHC) were obtained from TGCA, and the data were processed as
follows. First, samples without survival information were removed, and
tumor samples were retained. Next, Ensembl gene IDs were matched with
GeneSymbol. We downloaded the ICGC-LIRI-JP datasets from the HCCDB
website ([155]http://lifeome.net/Database/hccdb/home.html). The
[156]GSE76427 and [157]GSE14520 data sets were downloaded from the GEO
database ([158]https://www.ncbi.nlm.nih.gov/geo/). Samples with living
time and survival status were retained and SEQMAP was used to
re-annotate genes in two datasets. We obtained the gene transfer format
files for the V32 annotation from the GENCODE website
([159]https://www.gencodegenes.org/), which we used to divide the TCGA
expression profiles and [160]GSE76427 data into mRNA and lncRNA. lncRNA
expression data shared by the TCGA data and the [161]GSE76427 data were
retained. We also accessed gene sets from the H.all.v7.4.symbols.gmt
file from the gene set enrichment analysis (GSEA) website^[162]37.
Identification of significant hallmark pathways in HCC
We utilized the single-sample GSEA (ssGSEA) method to calculate
hallmark pathway scores for samples included in the TCGA, HCCDB18, and
[163]GSE14520 datasets. The hallmark pathways related to HCC prognosis
were identified by univariate Cox analysis based on the scores obtained
for each data set individually, and the intersection of important
hallmark pathways identified in all three data sets was used to define
shared hallmark pathways.
Screening lncRNAs associated with FA signaling
To validate effective lncRNAs involved in FA signaling, we adopted an
integrated pipeline based on the processes described by several
studies^[164]38,[165]39. Briefly, we used correlations between mRNA and
lncRNA expression levels to estimate target mRNA, which were ranked in
descending order. The “fgsea” R package was utilized to analyze the
ordered gene list and identify whether lncRNAs related to FA pathways
were enriched at the top or bottom of gene lists. After estimating the
total enrichment score (TES) for FA signaling from the entire lncRNA
population, several lncRNAs with significant TES values were considered
to be FA-associated lncRNAs, in accordance with the permutation test
framework. In expression matrixes for lncRNA and mRNA, an lncRNA i and
an mRNA j detected in n patients were expressed as LNC(i) = (lnc[1],
lnc[2], …, lnc[n]) and M(j) = (m[1], m[2], …, m[n]), respectively. We
used the ESTIMATE R package to quantify tumor purity in n patients,
defined as P = (p[1], p[2], …, p[n]). The first-order partial
correlation coefficient (PCC) between an lncRNA i and an mRNA j was
calculated by eliminating the effects of tumor purity:
[MATH: PCC(ij)=Rlncm-Rlncp×Rmp1-<
/mo>R2lncp×1-R2mp
:MATH]
where Rlncm, Rlncp, and Rmp are the respective Pearson’s correlation
coefficients between an lncRNA i and an mRNA j, between an lncRNA i and
tumor purity p, and between an mRNA j and tumor purity p. Subsequently,
the P-value for PCC (ij), defined as P(ij), was measured as follows:
[MATH: Pij=2×pnorm(-PCCij×n-31<
mo>-PCC2ij) :MATH]
where Pnorm is the normal distribution function, and n is the number of
samples. For an lncRNA i, the rank index (RI) for an mRNA j was
calculated as:
[MATH: RIij=-lnPij×signPCCij<
/mrow> :MATH]
The sign function is a mathematical function for extracting the signs
for PCC (ij). All mRNAs were sorted in descending RI order to perform
GSEA. The genes associated with FA signaling are presented as an
ordered gene list. For an lncRNA i, the enrichment score and P-value
(adjusted by the false-discovery rate [FDR]) were assessed using the
“fgsea” R package and then combined into a TES:
[MATH: TESi=1-2Pi<
mspace width="0.166667em">×signESi :MATH]
Therefore, TES values ranged from − 1 to 1, and lncRNAs with an
absolute TES value > 0.95 and an FDR < 0.05 were validated as
FA-associated lncRNAs.
Validation of lncRNA subtypes related to FA
Expressed lncRNAs associated with FA pathways were input into
ConsensusClusterPlus^[166]40 to construct a consistency matrix for
classifying TCGA and [167]GSE76427 samples into subtypes^[168]41. The
Kaplan–Meier (KM) algorithm and maximum distance were used to measure
distance and perform 500 Bootstraps. Each bootstrap process included
80% of the data as a training set. We selected the number of clusters
as 2 to 10 and calculated the consistency matrix and the consistency
cumulative distribution function to determine the best classification
method.
GSEA and functional annotation
To explore biological process pathways associated with various
molecular subtypes, we used the Kyoto Encyclopedia of Genes and Genomes
(KEGG)^[169]21 database to perform GSEA. In addition, we adopted the
WebGestaltR (v0.4.4) software package to present functional annotations
of different genes.
Tumor immune microenvironment analysis
We evaluated the immune infiltration scores of samples using ESTIMATE,
MCP-Counter, and ssGSEA and compared differential distributions across
different subtypes. Tumor Immune Dysfunction and Exclusion (TIDE
[170]http://tide.dfci.harvard.edu/), a computational framework, was
used to estimate the ability of tumor immune escape from the gene
expression profiles of cancers samples.
Construction of risk models based on lncRNA expression profiles
To estimate the prognostic abilities of critical lncRNAs, multivariate
regression analysis was conducted on samples from the TCGA dataset,
which resulted in a correlation coefficient for each lncRNA. The
following formula was used to calculate the risk score for each
patient: risk score = (βi × EXPi), where i represents the expression
levels of lncRNAs associated with FA, and β is the coefficient of the
gene for the corresponding lncRNA based on univariate Cox regression.
Patients were divided into high- and low-risk groups according to the
median risk score. The KM method was used to generate a survival curve
for prognosis, and a quota test was used to determine significant
differences.
Analysis of drug sensitivity
Chemotherapeutic sensitivity was predicted for the high- and low-risk
groups using the CTRP2.0 and PRISM databases, which contain sensitivity
data for 481 and 1448 compounds, respectively. Area under the receiver
operating characteristic curve (AUC) values were determined to measure
drug sensitivity in these two datasets, with a lower AUC value
indicating enhanced sensitivity to treatment^[171]42. Cell line
expression profiles obtained from the Cancer Cell Line Encyclopedia
database were used as the training set for predicting drug sensitivity,
and TCGA-LIHC was used as the test set.
Cell culture and quantitative reverse transcription PCR (qRT-PCR)
We obtained L02 and Huh7 cell lines from Chinese Academy of Sciences
(Shanghai, China). These cells were cultured in DMEM (Gibco)
supplemented with 10% FBS (Gibco) 10% fetal bovine serum at 37 °C under
5% CO[2] in a humidified incubator. Total RNA was isolated using TRIzol
reagent (Invitrogen). A HiScript II QRT SuperMix Kit (Vazyme, China)
was used to synthesize the cDNA. A ChamQ SYBR qPCR Master Mix (Vazyme,
China) was used for real-time PCR with a Light Cycler 96 detection
system (Roche). The information of primers sequences can be found in
supplementary Table [172]2.
Supplementary Information
[173]Supplementary Legends.^ (19.8KB, docx)
[174]Supplementary Figure 1.^ (3.1MB, tif)
[175]Supplementary Figure 2.^ (2.1MB, tif)
[176]Supplementary Figure 3.^ (4MB, tif)
[177]Supplementary Figure 4.^ (3.6MB, tif)
[178]Supplementary Figure 5.^ (943.9KB, tif)
[179]Supplementary Tables.^ (15.8KB, docx)
Author contributions
All of the authors worked collaboratively on the work presented here.
Y.H. and W.G. designed the review. Y.X. and Q.Z. wrote this manuscript.
X.Y. searched the articles and made figures. All authors read and
approved the final manuscript.
Funding
This work was supported by the National Natural Science Foundation of
China (81902832), Leading Talents of Zhongyuan Science and Technology
Innovation (214200510027), Henan Provincial Medical Science and
Technology Research Plan (SBGJ202102117 and SBGJ2018002), Henan Medical
Science and Technology Joint Building Program (LHGJ20210324), Science
and Technology Innovation Talents in Henan Universities (19HASTIT003),
Outstanding Foreign Scientist Studio in Henan Province (GZS2020004),
and the Gandan Xiangzhao Research Fund (GDXZ2022002).
Data availability
The datasets used and analyzed during the current study available from
the corresponding author on request.
Competing interests
The authors declare no competing interests.
Footnotes
Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in
published maps and institutional affiliations.
Contributor Information
Yuting He, Email: fccheyt1@zzu.edu.cn.
Wenzhi Guo, Email: fccguowz@zzu.edu.cn.
Supplementary Information
The online version contains supplementary material available at
10.1038/s41598-022-23681-0.
References