Abstract Liver hepatocellular carcinoma (LIHC) strongly contributes to global cancer mortality, highlighting the need for a deeper understanding of its molecular mechanisms to enhance patient prognosis and treatment approaches. We aimed to investigate the differential expression of immunogenic cell death-related genes (ICDRGs) and cellular senescence-related genes (CSRGs) in LIHC and their effects on patient prognosis. We combined the [28]GSE25097, [29]GSE46408, and [30]GSE121248 datasets by eliminating batch effects and standardizing the data. After processing, 16 genes were identified as ICDR&CSR differentially expressed genes (ICDR&CSRDEGs), including UBE2T, HJURP, PTTG1, CENPA, and FOXM1. Gene set enrichment analysis indicated a strong enrichment of these genes in pre-Notch expression and processing. Gene set variation analysis revealed 20 pathways with significant differences between the LIHC and control groups. Mutation analysis identified TP53 as the most commonly mutated gene in LIHC samples. A prognostic risk model integrating 12 ICDR&CSRDEGs was developed, showing high precision at 1 year but diminished accuracy at 2 and 3 years. Our constructed prognostic risk model provides valuable insights for predicting patient outcomes and may guide future therapeutic interventions targeting these specific genes. Further research is needed to explore the mechanistic roles of these genes in LIHC progression and treatment response. Keywords: Liver hepatocellular carcinoma, Immunogenic cell death, Cellular senescence, Risk score Subject terms: Liver cancer, Cancer genomics Introduction Liver hepatocellular carcinoma (LIHC) is a prevalent cancer with high incidence and mortality worldwide. LIHC constitutes more than 80% of liver cancers, ranking among the top three in 46 countries and top five in 90 countries^[31]1. The limited accuracy of biomarkers for early diagnosis and prognosis often leads to the detection of most LIHC cases at advanced stages, restricting effective treatment options and resulting in high mortality rates. The shift in epidemiological trends among at-risk patients, notably the increase in non-viral causes, has significantly affected primary prevention, monitoring, and treatment strategies^[32]2. Consequently, identifying new biomarkers and therapeutic targets is crucial to improve the prognosis in patients with LIHC. Immunogenic cell death (ICD) and cellular senescence are critical biological processes that have garnered much attention in cancer research. ICD is a regulated cell death process that triggers an immune response to dead cell antigens, which is crucial for antitumor immunity^[33]3. Cellular senescence, marked by permanent cell cycle arrest, initially suppresses tumorigenesis but may later promote tumor progression through a senescence-associated secretory phenotype^[34]4. Recent research underscores the pivotal roles of ICD and cellular senescence in tumor progression and treatment responses across multiple cancers, such as liver, colorectal, breast, lung, ovarian, and prostate cancers and melanoma^[35]5–[36]7. ICD is remarkable in the context of cancer treatment because it improves the effectiveness of chemotherapy in colorectal cancer by facilitating the recruitment and activation of dendritic cells and cytotoxic T lymphocytes^[37]8. Cellular senescence is linked to chemotherapy resistance in breast cancer cells, indicating that targeting senescence pathways may enhance treatment efficacy^[38]9. These findings underscore the potential of ICD and cellular senescence as therapeutic targets in cancer treatment. However, the role of immunogenic cell death-related genes (ICDRGs) and cellular senescence-related genes (CSRGs) in LIHC remains largely unexplored. Given the aggressive nature of LIHC and limited effectiveness of current therapies, investigating the differential expression of ICDRGs and CSRGs in LIHC may provide valuable insights into novel prognostic biomarkers and therapeutic targets. In this study, we aimed to investigate the differential expression of ICDRGs and CSRGs in LIHC and their effects on patient prognosis. To this end, we integrated data from three Gene Expression Omnibus datasets ([39]GSE25097, [40]GSE46408, and [41]GSE121248) with TCGA-LIHC dataset from The Cancer Genome Atlas (TCGA). We utilized DESeq2 to identify differentially expressed genes (DEGs) and intersected these with ICDRGs and CSRGs to derive genes associated with ICD and cellular senescence (ICDR&CSRGs). We performed functional enrichment analyses using gene ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG), gene set enrichment analysis (GSEA), and gene set variation analysis (GSVA). A prognostic risk model was developed using Cox regression analysis. Single-sample GSEA (ssGSEA) was employed to evaluate immune-cell infiltration, and the prognostic model underwent validation through multiple statistical techniques. Our research revealed significant expression of ICDR&CSRDEGs in LIHC, with potential implications for patient prognosis. Functional enrichment analyses identified the involvement of these genes in key biological processes and pathways. The prognostic risk model demonstrated robust predictive accuracy, and the immune-cell infiltration analysis provided insights into the tumor immune microenvironment. The results underscore the importance of ICD and cellular senescence in LIHC and suggest that targeting these pathways could improve therapeutic outcomes for patients with LIHC. Results Technology roadmap A flow chart for the comprehensive analysis of ICDR&CSRDEGs is shown in Fig. [42]1. Fig. 1. Fig. 1 [43]Open in a new tab Flow chart showing the comprehensive analysis of ICDR&CSRDEGs. TCGA, The Cancer Genome Atlas; LIHC, Liver Hepatocellular Carcinoma; DEGs, Differentially Expressed Genes; ICDR&CSRGs, Immunogenic Cell Death-Related and Cellular Senescence-Related Genes; ICDR&CSRDEGs, Immunogenic Cell Death-Related and Cellular Senescence-Related Differentially Expressed Genes; SM, Somatic Mutation; CNV, Copy Number Variation; GSEA, Gene Set Enrichment Analysis; GSVA, Gene Set Variation Analysis; GO, Gene Ontology; KEGG, Kyoto Encyclopedia of Genes and Genomes. Merging of GEO datasets for LIHC We used the R package sva to remove batch effects from the [44]GSE25097, [45]GSE46408, and [46]GSE121248 datasets, which were then integrated for further analysis. The TCGA-LIHC dataset was analyzed separately without batch effect removal. Boxplots and principal component analysis (PCA) plots were used to assess the datasets before and after the removal of batch effects (Fig. [47]2A–D). The results of the grouping of boxplots and PCA plots showed that batch effects in the LIHC dataset samples were largely eliminated after this process. Fig. 2. [48]Fig. 2 [49]Open in a new tab Batch effect removal of [50]GSE25097, [51]GSE46408, and [52]GSE121248. (A) Box plot of dataset distribution before batch effect removal. (B) Box plot of the combined GEO dataset distribution after batch processing. (C) PCA plot of the datasets before debatching. (D) PCA map of the combined GEO datasets after batch processing. The liver hepatocellular carcinoma (LIHC) dataset [53]GSE25097 is indicated in yellow, [54]GSE46408 in green, and [55]GSE121248 in red. PCA, Principal Component Analysis; GEO, Gene Expression Omnibus. ICDR&CSRDEGs TCGA-LIHC dataset was divided into LIHC and control groups. Differential gene expression between the groups was analyzed using the DESeq2 R package. The analysis identified 1,286 DEGs with |logFC| > 3 and adjusted p-value (p.adj) < 0.05, comprising 1,205 upregulated (logFC > 3) and 81 downregulated (logFC < -3) genes, as shown in the volcano plot in Fig. [56]3A. Fig. 3. [57]Fig. 3 [58]Open in a new tab Differential gene expression analysis for TCGA-LIHC datasets. (A) Volcano plot of differential gene expression analysis between the LIHC and control groups in TCGA-LIHC dataset. (B) DEGs and Venn diagram of ICDR&CSRGs in TCGA-LIHC dataset. (C) Heatmap of ICDR&CSRDEGs in TCGA-LIHC dataset. (D) Chromosomal localization map of ICDR&CSRDEGs. TCGA, The Cancer Genome Atlas; LIHC, Liver Hepatocellular Carcinoma; DEGs, Differentially Expressed Genes; ICDR&CSRGs, Immunogenic Cell Death-Related and Cellular Senescence-Related Genes; ICDR&CSRDEGs, Immunogenic Cell Death-Related and Cellular Senescence-Related Differentially Expressed Genes. Control group (gray) and LIHC group (yellow). ICDR&CSRDEGs were identified by finding the intersection of DEGs with |logFC| > 3 and p.adj < 0.05 and ICDR&CSRGs. A Venn diagram (Fig. [59]3B) illustrates the identification of the 16 ICDR&CSRDEGs, namely, UBE2T, HJURP, PTTG1, CENPA, FOXM1, E2F1, TERT, CDKN2A, CCNA2, SIX1, FOS, SFN, HMGA2, SOX2, RBP2, and DLX2. A heatmap generated with the R package pheatmap (Fig. [60]3C) was used to visualize the expression differences of ICDR and CSRDEGs across different sample groups in TCGA-LIHC dataset. The chromosomal locations of the 16 ICDR&CSRDEGs were analyzed and visualized with the R package RCircos, producing a chromosomal localization map (Fig. [61]3D). The map revealed that numerous ICDR&CSRDEGs, such as CENPA, DLX2, and HJURP, are situated on chromosome 2. GO and KEGG pathway enrichment analyses of LIHC GO and KEGG pathway enrichment analyses were performed to explore the relationships between the 16 ICDR&CSRDEGs and LIHC, focusing on biological processes, cellular components, molecular functions, and pathways. The results are detailed in Table [62]1. Table 1. Results of GO and KEGG enrichment analyses for ICDR&CSRDEGs. ONTOLOGY ID GeneRatio BgRatio p-value p.adj q-value BP GO:0044772 6/16 440/18,800 1.04255 e-06 0.000654461 0.000352088 BP GO:0044843 5/16 255/18,800 1.70655 e-06 0.000654461 0.000352088 BP GO:0045786 5/16 387/18,800 1.30495 e-05 0.002804538 0.001508789 BP GO:1902806 4/16 188/18,800 1.60455 e-05 0.002804538 0.001508789 BP GO:1901987 5/16 415/18,800 1.82825 e-05 0.002804538 0.001508789 CC GO:0000792 3/16 69/19,594 2.26488 e-05 0.000405066 0.000119921 CC GO:0032993 4/16 220/19,594 2.53166 e-05 0.000405066 0.000119921 CC GO:0005667 4/16 483/19,594 0.000524321 0.005592763 0.001655752 CC GO:0098687 3/16 366/19,594 0.003021249 0.024169991 0.007155589 CC GO:0000776 2/16 146/19,594 0.006179102 0.029798215 0.00882184 MF GO:0001216 5/16 466/18,410 3.52655 e-05 0.002539114 0.001113646 MF GO:0001221 3/16 108/18,410 0.000104002 0.003744059 0.001642131 MF GO:0030291 2/16 33/18,410 0.000368074 0.007175706 0.00314724 MF GO:0001223 2/16 40/18,410 0.000541821 0.007175706 0.00314724 MF GO:0001228 4/16 462/18,410 0.000560525 0.007175706 0.00314724 KEGG hsa05166 6/13 222/8164 5.52991 e-07 3.58245 e-05 2.53054 e-05 KEGG hsa04110 5/13 126/8164 9.42749 e-07 3.58245 e-05 2.53054 e-05 KEGG hsa04218 4/13 156/8164 8.01905 e-05 0.002031494 0.001434989 KEGG hsa01522 3/13 98/8164 0.00043963 0.008352979 0.005900304 KEGG hsa05219 2/13 41/8164 0.001853391 0.022834407 0.016129567 [63]Open in a new tab GO, Gene Ontology; BP, Biological Process; CC, Cellular Component; MF, Molecular Function; KEGG, Kyoto Encyclopedia of Genes and Genomes; ICDR&CSRDEGs, Immunogenic Cell Death-Related and Cellular Senescence-Related Differentially Expressed Genes. The analysis revealed that the 16 ICDR&CSRDEGs were mainly enriched in cell cycle regulation and transition processes, such as the G1/S phase, negative regulation of the cell cycle, and mitotic phase transitions. These genes were linked to cellular components such as the kinetochore, heterochromatin, chromosomal region, protein–DNA complex, and transcription regulator complex. The enriched molecular functions comprised protein serine/threonine kinase inhibitor activity, transcription coactivator binding, DNA-binding transcription activator activity, and RNA polymerase II-specific DNA-binding transcription activator activity. These genes participated in biological pathways including bladder cancer, endocrine resistance, cellular senescence, cell cycle, and human T-cell leukemia virus 1 infection. Bubble plots were used to visualize the results of the GO and KEGG pathway enrichment analyses (Fig. [64]4A). Network diagrams illustrating biological processes, cellular components, molecular functions, and pathways were generated using GO and KEGG enrichment analyses (Fig. [65]4B–E). Fig. 4. [66]Fig. 4 [67]Open in a new tab GO and KEGG enrichment analyses for ICDR&CSRDEGs. (A) Bubble plot of GO and KEGG pathway enrichment analyses of ICDR&CSRDEGs showing the BPs, CCs, MFs, and biological pathways (pathway); GO and KEGG terms on the abscissa. (B–E) The GO and KEGG pathway enrichment analysis results of ICDR&CSRDEGs. Network diagram showing BP (B), CC (C), MF (D), and KEGG pathways (E). Yellow nodes represent items, gray nodes represent molecules, and lines represent relationships between items and molecules. ICDR&CSRDEGs, Immunogenic Cell Death-Related and Cellular Senescence-Related Differentially Expressed Genes; GO, Gene Ontology; KEGG, Kyoto Encyclopedia of Genes and Genomes; BP, Biological Process; CC, Cellular Component; MF, Molecular Function. The screening criteria for GO and KEGG pathway enrichment analyses are p.adj < 0.05 and false discovery rate value (q-value) < 0.25; the Benjamini–Hochberg p-value correction method is applied. GSEA of LIHC GSEA was performed to evaluate the influence of gene expression levels on LIHC progression using TCGA-LIHC dataset. We analyzed TCGA-LIHC dataset to explore the association between gene expression and their functions in biological processes, cellular components, and molecular functions (Fig. [68]5A), with detailed results in Table [69]2. The analysis demonstrated a notable enrichment of all genes in pertinent biological functions and signaling pathways, such as pre-Notch expression and processing. Figure [70]5B illustrates the Tcf-dependent signaling activated by Wnt (Fig. [71]5C), oxidative stress-induced senescence (Fig. [72]5D), and Prc2 methylation of histones and DNA (Fig. [73]5E). Fig. 5. [74]Fig. 5 [75]Open in a new tab GSEA for TCGA-LIHC datasets. (A) GSEA of four biological functions or pathway bubble plots are presented for TCGA-LIHC dataset. (B–E) GSEA shows that LIHC is significantly enriched in pre-Notch expression and processing (B). Tcf-dependent signaling in response to Wnt (C), oxidative stress-induced senescence (D), and Prc2 methylation of histones and DNA (E). LIHC, Liver Hepatocellular Carcinoma; GSEA, Gene Set Enrichment Analysis. The screening criteria for GSEA are p.adj < 0.05 and false discovery rate value (q-value) < 0.25; the p-value correction method is Benjamini–Hochberg. Table 2. Results of GSEA for TCGA-LIHC datasets. ID SetSize Enrichment score NES p-value p.adj q-value REACTOME_PRE_NOTCH_EXPRESSION_AND_PROCESSING 103 0.56739 1.62748 6.48 e-05 1.16 e-03 8.93 e-04 REACTOME_TCF_DEPENDENT_SIGNALING_IN_RESPONSE_TO_WNT 226 0.46370 1.37310 2.02 e-03 1.64 e-02 1.27 e-02 REACTOME_OXIDATIVE_STRESS_INDUCED_SENESCENCE 117 0.52020 1.50315 7.76 e-04 7.89 e-03 6.10 e-03 REACTOME_PRC2_METHYLATES_HISTONES_AND_DNA 67 0.65783 1.82978 3.30 e-06 9.91 e-05 7.66 e-05 PID_PLK1_PATHWAY 46 0.76118 2.04124 6.77 e-09 4.39 e-07 3.39 e-07 REACTOME_RESOLUTION_OF_SISTER_CHROMATID_COHESION 125 0.66172 1.92092 1.00 e-10 1.23 e-08 9.51 e-09 PID_AURORA_B_PATHWAY 39 0.72948 1.91869 1.32 e-06 4.68 e-05 3.62 e-05 WP_GASTRIC_CANCER_NETWORK_1 27 0.76820 1.91080 7.92 e-06 2.05 e-04 1.59 e-04 REACTOME_MITOTIC_SPINDLE_CHECKPOINT 110 0.66266 1.90437 2.98 e-10 3.33 e-08 2.57 e-08 REACTOME_POLO_LIKE_KINASE_MEDIATED_EVENTS 16 0.83093 1.89761 2.37 e-05 4.89 e-04 3.78 e-04 REACTOME_UNWINDING_OF_DNA 12 0.87025 1.87720 2.22 e-05 4.63 e-04 3.58 e-04 REACTOME_CONDENSATION_OF_PROPHASE_CHROMOSOMES 68 0.67223 1.86828 2.53 e-07 1.11 e-05 8.61 e-06 REACTOME_ACTIVATION_OF_THE_PRE_REPLICATIVE_COMPLEX 33 0.72407 1.86612 1.85 e-05 3.98 e-04 3.08 e-04 REACTOME_ACTIVATION_OF_ATR_IN_RESPONSE_TO_REPLICATION_STRESS 37 0.71022 1.86265 2.54 e-05 5.12 e-04 3.96 e-04 REACTOME_ACTIVATED_PKN1_STIMULATES_TRANSCRIPTION_OF_AR_ANDROGEN_RECEPTO R_REGULATED_GENES_KLK2_AND_KLK3 61 0.67490 1.85803 9.55 e-07 3.51 e-05 2.71 e-05 REACTOME_DNA_METHYLATION 59 0.67426 1.85486 2.14 e-06 6.93 e-05 5.35 e-05 REACTOME_ASSEMBLY_OF_THE_ORC_COMPLEX_AT_THE_ORIGIN_OF_REPLICATION 63 0.66665 1.83933 1.33 e-06 4.68 e-05 3.62 e-05 REACTOME_GOLGI_CISTERNAE_PERICENTRIOLAR_STACK_REORGANIZATION 14 0.82451 1.82402 1.07 e-04 1.72 e-03 1.33 e-03 BIOCARTA_MCM_PATHWAY 18 0.78358 1.82082 1.30 e-04 1.96 e-03 1.51 e-03 REACTOME_MEIOTIC_RECOMBINATION 80 0.64659 1.82058 6.36 e-07 2.48 e-05 1.92 e-05 [76]Open in a new tab GSEA, Gene Set Enrichment Analysis; TCGA, The Cancer Genome Atlas; LIHC, Liver Hepatocellular Carcinoma; NES, Normalized Enrichment Score. GSVA of LIHC GSVA was conducted to examine the differential expression of the h.all.v2023.1.Hs.symbols.gmt gene set between the LIHC and control groups within TCGA-LIHC dataset (Table [77]3). Pathways with p.adj < 0.05 and the top 10 highest and lowest logFC values were selected and illustrated using group comparison boxplots (Fig. [78]6A). GSVA revealed 20 pathways with significant differences (p < 0.001) between the LIHC and control groups, including xenobiotic metabolism, coagulation, bile acid metabolism, TNFα signaling via NF-κB, inflammatory response, fatty acid metabolism, KRAS signaling upregulation, IL-6 JAK-STAT3 signaling, hypoxia, early estrogen response, PI3K-AKT-mTOR signaling, mTORC1 signaling, protein secretion, mitotic spindle, DNA repair, unfolded protein response, MYC targets (versions 1 and 2), G2/M checkpoint, and E2F targets. A heatmap was used to analyze and visualize the differential expression of these pathways (Fig. [79]6B). Table 3. Results of the GSVA for TCGA-LIHC datasets. ID logFC AveExpr p-value p.adj HALLMARK_XENOBIOTIC_METABOLISM 6.25 e-01 1.22 e-01 1.54 e-33 3.84 e-32 HALLMARK_COAGULATION 5.89 e-01 1.66 e-01 4.35 e-35 2.18 e-33 HALLMARK_BILE_ACID_METABOLISM 5.13 e-01 8.05 e-03 9.54 e-18 3.67 e-17 HALLMARK_TNFA_SIGNALING_VIA_NFKB 4.73 e-01 1.55 e-01 2.26 e-18 9.41 e-18 HALLMARK_INFLAMMATORY_RESPONSE 4.40 e-01 1.05 e-01 4.14 e-15 1.22 e-14 HALLMARK_FATTY_ACID_METABOLISM 4.04 e-01 3.83 e-02 4.28 e-14 1.13 e-13 HALLMARK_KRAS_SIGNALING_UP 3.57 e-01 1.30 e-01 7.97 e-14 1.99 e-13 HALLMARK_IL6_JAK_STAT3_SIGNALING 3.54 e-01 1.91 e-01 4.86 e-12 1.11 e-11 HALLMARK_HYPOXIA 3.39 e-01 1.22 e-01 1.01 e-14 2.81 e-14 HALLMARK_ESTROGEN_RESPONSE_EARLY 3.05 e-01 1.37 e-01 8.38 e-13 1.99 e-12 HALLMARK_PI3K_AKT_MTOR_SIGNALING 3.63 e-01 1.62 e-01 3.30 e-15 1.03 e-14 HALLMARK_MTORC1_SIGNALING 4.07 e-01 9.89 e-02 2.77 e-17 9.88 e-17 HALLMARK_PROTEIN_SECRETION 4.96 e-01 3.31 e-02 1.93 e-18 8.76 e-18 HALLMARK_MITOTIC_SPINDLE 5.17 e-01 8.94 e-02 7.76 e-20 3.88 e-19 HALLMARK_DNA_REPAIR 5.34 e-01 1.50 e-01 3.23 e-26 3.23 e-25 HALLMARK_UNFOLDED_PROTEIN_RESPONSE 5.75 e-01 2.93 e-02 2.14 e-27 2.68 e-26 HALLMARK_MYC_TARGETS_V1 5.89 e-01 1.51 e-01 3.96 e-24 2.83 e-23 HALLMARK_MYC_TARGETS_V2 6.40 e-01 4.51 e-02 9.72 e-24 6.08 e-23 HALLMARK_G2M_CHECKPOINT 6.72 e-01 3.04 e-02 3.10 e-25 2.59 e-24 HALLMARK_E2F_TARGETS 7.28 e-01 6.12 e-02 4.23 e-28 7.05 e-27 [80]Open in a new tab GSVA, Gene Set Variation Analysis; TCGA, The Cancer Genome Atlas; LIHC, Liver Hepatocellular Carcinoma. Fig. 6. [81]Fig. 6 [82]Open in a new tab GSVA for TCGA-LIHC datasets. (A, B). Boxplot (A) and complex numerical heatmap (B) of group comparison of GSVA results in the LIHC and control groups. LIHC, Liver Hepatocellular Carcinoma; GSVA, Gene Set Variation Analysis. Gray is the Control group, and yellow is the LIHC group. ***p < 0.001. The screening criteria for GSVA are p.adj < 0.05 and positive or negative top 10 logFC. Somatic mutation (SM) and copy number variation (CNV) analysis We utilized the R package maftools to statistically analyze and visualize mutation data for all genes in TCGA-LIHC dataset (Fig. [83]7A). Nine primary types of SMs were identified, with missense mutations being the most prevalent. Single nucleotide polymorphisms (SNPs) were the dominant mutation types across all genes in the LIHC group, with C to T mutations being the most frequent single nucleotide variants. We examined the SM status of all genes in the LIHC samples, ranked them by mutation frequency, and visualized the top 20 most frequently mutated genes (Fig. [84]7B). TP53 exhibited the highest mutation rate (28%). Fig. 7. [85]Fig. 7 [86]Open in a new tab SM and CNV analysis. (A) SM presentation of all genes in LIHC samples from TCGA-LIHC dataset. (B) SM presentation (top 20 genes) of all genes in LIHC samples from TCGA-LIHC dataset. (C–D) ICDR&CSRDEGs with CNVs are shown in the LIHC samples from TCGA-LIHC dataset. TCGA, The Cancer Genome Atlas; LIHC, Liver Hepatocellular Carcinoma; ICDR&CSRDEGs, Immunogenic Cell Death-Related and Cellular Senescence-Related Differentially Expressed Genes; SM, Somatic Mutation; SNV, Single Nucleotide Variant; SNP, Single Nucleotide Polymorphism; CNV, Copy Number Variation. We obtained and combined CNV data to examine the CNVs of 16 ICDR&CSRDEGs in LIHC samples from TCGA-LIHC dataset. The analysis, which was conducted using GISTIC2.0, revealed CNVs in 16 ICDR&CSRDEGs across 344 LIHC samples. The CNV status of the 16 genes—UBE2T, HJURP, PTTG1, CENPA, FOXM1, E2F1, TERT, CDKN2A, CCNA2, SIX1, FOS, SFN, HMGA2, SOX2, RBP2, and DLX2—was subsequently visualized (Fig. [87]7C–D). Creation of a prognostic risk model for LIHC A prognostic risk model for LIHC was created using univariate Cox regression analysis on 16 ICDR&CSRDEGs. Variables with a p-value under 0.10 from this analysis were displayed in a forest plot (Fig. [88]8A). The univariate Cox regression model identified 12 ICDR&CSRDEGs as significant, namely, UBE2T, HJURP, PTTG1, CENPA, FOXM1, E2F1, CDKN2A, CCNA2, SFN, HMGA2, SOX2, and RBP2. A multivariate Cox regression analysis was performed on these genes to evaluate their prognostic significance in the LIHC risk model (Table [89]4), with the results displayed in a forest plot (Fig. [90]8B). A risk score was derived from the risk coefficients of the multivariate Cox regression analysis and calculated using the following formula: graphic file with name d33e1611.gif Fig. 8. [91]Fig. 8 [92]Open in a new tab Univariable and multivariable Cox analyses. (A) Forest plot of 16 immunogenic cell death-related and cellular senescence-related differentially expressed genes (ICDR&CSRDEGs) in the univariate Cox regression model. (B) Forest plot of the multivariate regression model. Table 4. Results of the univariate and multivariate Cox regression analyses for TCGA-LIHC datasets. Characteristics Total (N) Univariate analysis Multivariate analysis Hazard ratio (95% CI) p-value Hazard ratio (95% CI) p-value Sex 349 0.760 (0.523–1.10) 0.152 Female 111 Reference Male 238 0.887 (0.602–1.306) 0.544 Age 349 1.25 (0.862–1.80) 0.241 ≤ 60 years 171 Reference > 60 years 178 1.301 (0.890–1.902) 0.174 Stage 349 1.42 (0.868–2.31) < 0.001 Stage I 173 Reference Stage II 86 1.164 (0.706–1.920) 0.551 Stage III 85 1.906 (1.217–2.985) 0.005 Stage IV 5 4.384 (1.321–14.551) 0.016 Risk score 349 2.72 (2.04–3.63) < 0.001 2.427 (1.789–3.292) < 0.001 [93]Open in a new tab TCGA, The Cancer Genome Atlas; LIHC, Liver Hepatocellular Carcinoma; CI, Confidence Interval. Significant values are in [bold]. Analysis of prognostic risk models for hepatocellular carcinoma Figure [94]9A illustrates a time-dependent receiver operating characteristic (ROC) curve that was produced using the risk score for LIHC in TCGA-LIHC dataset. The LIHC prognostic risk model demonstrated moderate accuracy at 1 year (AUC between 0.7 and 0.9) and reduced accuracy at 2 and 3 years (AUC between 0.5 and 0.7). Fig. 9. [95]Fig. 9 [96]Open in a new tab Prognostic analysis. (A) Time-dependent ROC curves at 1, 2, and 3 years. (B–C) Forest plot of the risk score and clinical information in the univariate Cox regression model (B) and multivariate Cox regression model (C). (D) Nomogram of the risk score and clinical information in the univariate and multivariate Cox regression model. (E) Kaplan–Meier curves are plotted according to the risk score. (F–H) 1- (F), 2- (G), and 3-year (H) calibration curves of the prognostic risk model for LIHC. (I–K) DCA curves at 1 (I), 2 (J), and 3 years (K) for the prognostic risk model for LIHC. LIHC, Liver Hepatocellular Carcinoma; ROC, Receiver Operating Characteristic Curve; AUC, Area Under the Curve; OS, Overall Survival; DCA, Decision Curve Analysis. P-value < 0.001, highly statistically significant. AUC between 0.5 and 0.7 indicates low accuracy; AUC between 0.7 and 0.9 indicates certain accuracy. Based on the median risk score from the prognostic model, LIHC samples were categorized into high- and low-risk groups. A univariate Cox regression analysis was conducted on the risk score and clinical data from LIHC samples, incorporating variables with a p-value < 0.10 into a subsequent multivariate Cox regression analysis. Forest plots (Fig. [97]9B–C) presented the results of univariate and multivariate analyses. The findings indicated that the risk score, in conjunction with clinical variables such as age, sex, and pathological stage, were significant (p < 0.05). A nomogram was developed through univariate and multivariate Cox regression analyses to evaluate the prognostic risk model for LIHC, highlighting the association between the risk score and clinical variables in the samples (Fig. [98]9D). The findings indicated that the utility of the risk score in the LIHC prognostic risk model surpassed that of the other variables, whereas the utility of sex was low. A Kaplan–Meier curve analysis evaluated overall survival in LIHC samples using median risk score grouping (Fig. [99]9E). The findings revealed a significant difference in overall survival between high- and low-risk groups (p < 0.001). Calibration analyses for 1-, 2-, and 3-year outcomes were conducted and visualized using calibration curves (Fig. [100]9F–H). In these curves, the horizontal axis represents the model’s predicted survival probability, whereas the vertical axis shows the observed survival probability. The predictive accuracy is enhanced when the model-predicted lines for different time points closely align with the ideal gray line. The prognostic risk model for LIHC showed strong clinical predictive accuracy at 1, 2, and 3 years. Decision curve analysis (DCA) was conducted over 1, 2, and 3 years to assess the clinical utility of the LIHC prognostic risk model, as illustrated in the DCA curves (Fig. [101]9I–K). The model’s effectiveness is indicated by the x-value range where its line consistently exceeds the “All positive” and “All negative” lines. An increased range of x values corresponds with improved model performance. The findings revealed that the LIHC prognostic risk model offered superior clinical predictive accuracy. Immune-cell infiltration analysis We utilized the ssGSEA algorithm to evaluate the abundance and correlation of immune-cell infiltration across 28 immune-cell types using samples from TCGA-LIHC dataset. Using immune-cell infiltration analysis data from TCGA-LIHC dataset, we generated plots to compare immune-cell infiltration between the LIHC and control groups. Figure [102]10A reveals significant differences (p < 0.05) in the abundance of 24 immune cells between the LIHC and control groups. The infiltration abundance of various immune cells, including B cells, T cells (CD4^+, CD8^+, gamma-delta, regulatory, and follicular helper), natural killer cells (CD56bright, CD56dim), memory CD8^+ T cells (central, effector), eosinophils, immature B and dendritic cells, macrophages, mast cells, myeloid-derived suppressor cells, monocytes, neutrophils, plasmacytoid dendritic cells, and type 1 and 17 T-helper cells, showed significant differences between the LIHC and control groups (p < 0.001). The groups showed a significant difference in type 2 T helper-cell infiltration abundance (p < 0.05). Fig. 10. [103]Fig. 10 [104]Open in a new tab TCGA-LIHC immune-cell infiltration analysis using the ssGSEA algorithm. (A) Plot of grouping comparison of immune cells in the LIHC and control groups. (B) Correlation heatmap of immune-cell infiltration abundance with significant difference in the group comparison map in TCGA-LIHC dataset. (C) Heatmap of correlation between model genes and infiltration abundance of 24 immune cells in TCGA-LIHC dataset. ssGSEA, Single-Sample Gene Set Enrichment Analysis; LIHC, Liver Hepatocellular Carcinoma; TCGA, The Cancer Genome Atlas. *p < 0.05; ***p < 0.001. The absolute value of correlation coefficient (r value) below 0.3 indicates weak or no correlation, between 0.3 and 0.5 indicated weak correlation, between 0.5 and 0.8 indicated moderate correlation, and more than 0.8 indicated strong correlation. Yellow indicates the LIHC group and gray the control group. A correlation heatmap was generated to depict the relationships among the infiltration abundance of 24 distinct immune cells in TCGA-LIHC dataset samples (Fig. [105]10B). This analysis revealed a positive correlation across all immune cell types, with particularly strong positive correlations observed between activated and immature B cells and between MDSCs and Tregs. Figure [106]10C illustrates a correlation heatmap showing the relationship between 12 genes linked to the prognostic risk model and the abundance of 24 immune cell types in TCGA-LIHC dataset. CDKN2A exhibited the strongest positive correlation with activated CD4 T-cell infiltration, whereas FOXM1 had the most pronounced negative correlation with eosinophil infiltration. Analysis of differential gene expression in the prognostic risk model Boxplots depicted the expression differences of 12 prognostic risk model-related genes between high- and low-risk groups in TCGA-LIHC dataset and LIHC samples. Our analysis identified notable variations in the expression levels of all 12 model genes (p < 0.05, Fig. [107]11A). Eleven genes (UBE2T, HJURP, PTTG1, CENPA, FOXM1, E2F1, CDKN2A, CCNA2, SFN, HMGA2, and RBP2) showed strong differences in expression levels (p < 0.001). SOX2 showed a significant difference in expression level as well (p < 0.05). Fig. 11. [108]Fig. 11 [109]Open in a new tab Expression difference analysis for model genes. (A) Boxplot of prognostic risk model-related genes (model genes) in LIHC high- and low-risk groups in TCGA-LIHC dataset. (B) Boxplots of prognostic risk model-related genes in the LIHC and control groups in TCGA-LIHC dataset. (C) Boxplots of prognostic risk model-associated genes in LIHC samples from combined datasets in the high- and low-risk groups. (D) Boxplots of prognostic risk model-associated genes in the LIHC and control groups in combined datasets. TCGA, The Cancer Genome Atlas; LIHC, Liver Hepatocellular Carcinoma. High-risk group is indicated in red and low-risk group in green. Orange-yellow indicates the LIHC group and gray the control group. *p < 0.05; **p < 0.01; ***p < 0.001. We analyzed the differential expression of model genes between LIHC and control samples using TCGA-LIHC dataset. Boxplots illustrated significant expression differences for all 12 model genes between LIHC and control groups (p < 0.05) (Fig. [110]11B). Twelve genes demonstrated significant differences, with UBE2T, HJURP, PTTG1, CENPA, FOXM1, E2F1, CDKN2A, CCNA2, SFN, and RBP2 showing p-values below 0.001 and SOX2 and HMGA2 showing p-values below 0.01. Differences in the expression of model genes related to the prognostic risk models were examined in LIHC samples from the combined datasets. Gene expression data and multivariate Cox analysis coefficients from the LIHC prognostic model were used to calculate risk scores, classifying samples into high- and low-risk groups based on the median. Figure [111]11C presents a boxplot illustrating the differential expression analysis between the groups, highlighting significant differences (p < 0.05) in 11 model genes: UBE2T, HJURP, PTTG1, CENPA, FOXM1, E2F1, CCNA2, and RBP2 exhibited the strongest differences in expression levels (p < 0.001), whereas HMGA2 and SFN showed strong significant differences (p < 0.01) and CDKN2A showed significant differences (p < 0.05). Boxplots illustrated the differential expression of the 12 model genes between LIHC and control samples in the combined datasets (Fig. [112]11D). Significant differences (p < 0.001) were observed in the expression of all 12 model genes—UBE2T, HJURP, PTTG1, CENPA, FOXM1, E2F1, CDKN2A, CCNA2, SFN, HMGA2, SOX2, and RBP—between the LIHC and control groups. Discussion In this study, we investigated the differential expression of genes associated with ICD and cellular senescence in LIHC to clarify their impact on disease progression and prognosis. LIHC is notably aggressive with a poor prognosis, significantly impacting patient quality of life and survival rates^[113]1,[114]10. The high mortality rate associated with LIHC underscores the urgent need for novel therapeutic strategies and improved diagnostic tools. With the development of precision medicine, the value of molecular targets in the treatment, diagnosis, and prediction of cancer has received increasing attention. Although many biomarker-based LIHC prediction models have been reported, most of them are single-omic models, and their predictive ability is not satisfactory. The role of ICD and cellular senescence in tumor progression is not fully understood, although some studies suggest that they exert antitumor effects^[115]11. ICD induces immune responses targeting tumor cells through cell death, whereas cellular senescence involves permanent cell cycle arrest, affecting both tumor suppression and progression^[116]3,[117]4. Through the integration of data from various GEO datasets and TCGA-LIHC dataset, we identified 16 key ICDR&CSRDEGs that were differentially expressed in LIHC: UBE2T, HJURP, PTTG1, CENPA, FOXM1, E2F1, TERT, CDKN2A, CCNA2, SIX1, FOS, SFN, HMGA2, SOX2, RBP2, and DLX2. Our study highlights the pivotal role of ICDR&CSRDEGs in the tumor microenvironment, suggesting their potential as prognostic biomarkers and therapeutic targets. The development and validation of a prognostic risk model using these genes highlight their clinical significance and provide new perspectives on personalized treatment strategies for patients with LIHC. Enrichment analysis of the 16 ICDR&CSRDEGs indicated their significant roles in various biological processes and pathways. The GO analysis highlighted the regulation of the G1/S phase transition in the cell cycle, a critical factor in cell proliferation and tumor growth, aligning with the findings of previous studies that identified cell cycle dysregulation as a hallmark of cancer, including LIHC^[118]12–[119]15. This regulation is vital to control cell proliferation and tumor growth and a potential target for cancer therapy^[120]13,[121]16. The KEGG pathway analysis identified significant enrichment in pathways in cancers such as bladder cancer, suggesting potential shared mechanisms among different cancer types. The GSEA highlighted the significant enrichment of genes in the pre-Notch expression and processing pathway, crucial for cell differentiation, proliferation, and apoptosis. The Notch signaling pathway influences cancer development, including LIHC, by affecting the tumor microenvironment and immune response^[122]17. Additionally, three other pathways are closely associated with tumor occurrence and progression. The Wnt-activated Tcf-dependent signaling pathway is essential for cancer stem cell renewal, proliferation, and differentiation, playing a significant role in tumorigenesis and therapeutic responses^[123]18. GSVA revealed 20 pathways with significant differences between the LIHC and control groups, highlighting the IL6-JAK-STAT3 and PI3K-AKT-MTOR signaling pathways, which are closely associated with LIHC development. We also investigated mutations in the 16 ICDR&CSRDEGs and analyzed their correlation with tumor mutation burden. TP53 showed a mutation rate of 28% in LIHC samples, which aligns with its recognized role as a frequently mutated tumor suppressor in various cancers. TP53 mutations contribute to poor prognosis and therapy resistance, underscoring their role in LIHC pathogenesis^[124]19. Through further data analysis, we obtained 12 model genes from the 16 ICDR&CSRGDEGs: UBE2T, HJURP, PTTG1, CENPA, FOXM1, E2F1, CDKN2A, CCNA2, SFN, HMGA2, SOX2, and RBP2. Next, we constructed the prognostic model based on 12 model genes and the risk coefficient of multivariate Cox regression analysis. The prognostic risk model for LIHC demonstrated a good clinical prediction effect at 1-, 2-, and 3-year intervals. A prognostic risk model incorporating 12 markers, including UBE2T, HJURP, PTTG1, and CENPA, highlights the importance of these genes in predicting patient outcomes. The time-dependent ROC curves highlight the risk model’s potential effectiveness in clinical applications, particularly for short-term prognosis. Model genes were significantly upregulated in LIHC, playing key roles in tumor progression, prognosis, and potential therapeutic interventions, which is consistent with previous study findings. UBE2T, which is involved in ubiquitination and DNA damage repair, contributes to genomic stability, and its overexpression has been linked to poor prognosis in various cancers, including LIHC, suggesting a role in tumor progression and therapy resistance. HJURP, which is essential for maintaining centromeric chromatin, enhances tumor cell proliferation and is correlated with aggressive tumor characteristics and unfavorable clinical outcomes in LIHC. PTTG1, which encodes a securin protein involved in mitosis regulation, is often overexpressed in LIHC and linked to tumor aggressiveness and apoptosis resistance. CENPA, which is crucial for centromeric function and genome stability, is significantly upregulated in LIHC, leading to chromosomal instability and tumorigenesis. FOXM1, encoding a transcription factor that regulates cell cycle and DNA repair, is markedly upregulated in LIHC and is linked to poor clinical outcomes and increased tumor aggressiveness. E2F1, encoding a transcription factor, contributes to tumor growth and metastasis in multiple human cancers, including hepatocellular carcinoma^[125]20. Luo et al.^[126]21reported that CDKN2A was overexpressed in liver cancer, which correlates with poor prognosis and decreased immune-cell infiltration in hepatocellular carcinoma. CCNA2, SFN, HMGA2, SOX2, and RBP2 have been identified as significant biomarkers for cancer diagnosis and prognosis, particularly in liver cancer^[127]22–[128]26. High expression levels of these genes are correlated with poor staging, higher tumor grades, and unfavorable survival outcomes. Their inclusion in prognostic models highlights their potential as predictive biomarkers and therapeutic targets, emphasizing their roles in tumor immunity and progression. In this study, CENPA and SFN were identified as independent prognosis-related genes, suggesting that they may play important roles in the occurrence and progression of LIHC. Previous studies have shown that CENPA, as a centriole-associated protein, may affect LIHC development by regulating chromosome stability and promoting mitotic abnormalities^[129]27,[130]28. CENPA overexpression may cause chromosome missegregation, promote genomic instability, and ultimately promote tumor progression. In addition, CENPA may also affect cell division and tumor growth by interacting with other key cell cycle proteins. SFN, as a cell cycle regulator, has an important regulatory role in a variety of cancers^[131]29. Studies have shown that SFN can improve the tolerance of tumor cells to external stress (such as chemotherapy and oxidative stress) by regulating DNA damage repair mechanisms. In addition, SFN may promote the proliferation, migration, and invasion of LIHC cells by activating the PI3K/AKT or MAPK signaling pathways. These effects make SFN a potential therapeutic target in LIHC, and its abnormal expression may be closely related to the poor patient prognosis. Based on the key roles of CENPA and SFN in LIHC progression, future studies can further explore the molecular regulatory network of these two genes in LIHC pathogenesis, study their functions in the tumor microenvironment, and evaluate their application value in personalized treatment. In addition, verifying their protein-level expressional changes and exploring their relationships with immune infiltration, drug resistance, and other tumor-related factors will facilitate our understanding of their roles in LIHC development. Emerging evidence indicates a strong association between immune cells, the microenvironment, and tumor development. The immune-cell infiltration analysis in our study provides crucial insights into the tumor microenvironment of LIHC. We utilized ssGSEA to evaluate immune-cell abundance across samples, identifying notable differences in immune-cell infiltration between high- and low-risk groups. Notable differences were identified in the infiltration of 24 immune cell types between the LIHC and control groups (p < 0.05). Positive correlations existed among all immune cells, with strong associations between activated and immature B cells and between MDSCs and Tregs, which was associated with impaired antitumor immunity. Among the 12 model genes, CDKN2A was positively correlated with activated CD4^+ T-cell infiltration, whereas FOXM1 was negatively correlated with eosinophil infiltration. CD8^+ T cells, macrophages, and Tregs exhibited distinct infiltration patterns that are crucial for tumor immunity and progression. CD8^+ T cells are critical for antitumor immunity, because they eliminate cancer cells directly through the release of cytotoxic granules and cytokines^[132]30. Tregs are known for their immunosuppressive functions that can inhibit effective antitumor immune responses^[133]31–[134]33. The elevated presence of Tregs in the high-risk group suggests a more immunosuppressive tumor microenvironment, which may promote tumor growth and resistance to immunotherapy. These findings align with previous study findings highlighting the prognostic significance of immune-cell infiltration in various cancers, including LIHC^[135]34–[136]36. The integration of our immune-cell infiltration analysis with the identified ICDR&CSRDEGs offers a thorough insight into the immune landscape in LIHC. This knowledge is crucial to develop targeted immunotherapies and improve patient stratification based on immune profiles. Future research should aim to confirm these results in larger groups and investigate the therapeutic potential of altering immune-cell infiltration to boost antitumor immunity in patients with LIHC. This study provides a new strategy for integrating ICDRGs and CSRGs to assess LIHC prognosis. Compared with existing studies, it introduces several innovations. First, we aggregated the TCGA-LIHC dataset and multiple GEO array datasets ([137]GSE25097, [138]GSE46408, and [139]GSE121248) and performed strict data standardization and batch effect correction to obtain more reliable gene expression features. We also constructed a comprehensive prognostic risk model based on ICDRGs and CSRGs and verified its predictive efficacy in the short- (1 year) and long-term (2–3 years). This approach differs from those of past single-model studies, which were limited to ICDRGs or CSRGs. This study also analyzed the immune cell infiltration characteristics of high- and low-risk groups to provide deeper insights into the LIHC immune microenvironment. We also combined SNV and CNV analyses to reveal the gene mutation characteristics that may affect LIHC prognosis. This study systematically analyzed the functional characteristics of key genes based on the integration of data from multi-source public databases, combined with bioinformatics methods such as DESeq2 differential expression analysis, GO/KEGG functional enrichment, and GSEA/GSVA pathway enrichment analysis. It demonstrated good predictive performance and result stability in an independent validation set. However, the following limitations should be noted: Firstly, the current analysis is mainly limited to the transcriptome level, and the protein expression levels of key genes and their clinical relevance have not been verified through experimental methods such as Western blotting and immunofluorescence double-staining, which may affect the completeness of the molecular mechanism explanation. Secondly, although a preliminary validated prognostic model has been established, its clinical applicability and stability still need to be further verified in larger-scale multi-center external cohorts. These limitations mainly stem from the objective constraints of research resources and time costs at the present stage. Subsequent studies will focus on conducting functional verification experiments at the protein level (such as Western blotting and immunofluorescence double-staining) and multi-center prospective cohort studies to comprehensively enhance the biological plausibility and implementing clinical translational value of the research conclusions. In conclusion, we effectively combined GEO datasets and TCGA data to identify DEGs associated with ICD and cellular senescence in LIHC. We developed a prognostic risk model with high predictive accuracy, validated through multiple statistical methods. Functional enrichment analyses identified biological processes and pathways linked to these genes. Analysis of immune-cell infiltration demonstrated a correlation between immune cell abundance and genes within the prognostic model. These findings enhance our comprehension of the molecular mechanisms in LIHC and indicate possible prognostic biomarkers and therapeutic targets. Future research should prioritize clinical validation and experimental studies to assess the applicability and biological significance of these findings. Methods Data download We obtained the LIHC datasets [140]GSE25097^[141]37, [142]GSE46408^[143]38, and [144]GSE121248^[145]39 from the GEO database^[146]40 using the GEOquery R package (Version 2.70.0, [147]https://bioconductor.org/packages/release/bioc/html/GEOquery.html) ^[148]41. Detailed information is provided in Table [149]5. All samples in these datasets originated from the liver tissues of Homo sapiens. Specifically, the [150]GSE25097 dataset utilized the [151]GPL10687 chip platform and comprised 268 LIHC and 243 control samples. The dataset [152]GSE46408 utilized the [153]GPL4133 chip platform, including six LIHC and six control samples. The chip platform for the [154]GSE121248 dataset was [155]GPL570, including 70 LIHC and 37 control samples. This study included all LIHC and control samples. The R package sva (Version 3.50.0, [156]https://bioconductor.org/packages/release/bioc/html/sva.html)^[157 ]42 was employed for batch correction on datasets [158]GSE25097, [159]GSE46408, and [160]GSE121248, producing the combined GEO datasets, including 344 LIHC and 286 control samples. The R package limma (Version 3.58.1, [161]https://bioconductor.org/packages/limma/)^[162]43 was then used for normalization, probe annotation, and further standardization of the combined GEO Datasets. Table 5. GEO microarray chip information. [163]GSE25097 [164]GSE46408 [165]GSE121248 Platform [166]GPL10687 [167]GPL4133 [168]GPL570 Type Array Array Array Species Homo sapiens Homo sapiens Homo sapiens Tissue Liver Liver Liver Samples in the tumor group 268 6 70 Samples in the control group 243 6 37 Reference PMID: 21955977 PMID: 23922981 PMID: 17975138 [169]Open in a new tab GEO, Gene Expression Omnibus. TCGA-LIHC dataset was acquired from TCGA via TCGAbiolinks (Version 2.30.0, [170]https://github.com/BioinformaticsFMRP/TCGAbiolinks)^[171]44 R package for further analysis. We analyzed 399 LIHC samples with comprehensive clinical data, comprising 349 tumor tissues (cancer group, LIHC) and 50 partially matched adjacent tissues (normal group, control). Data were normalized to the FPKM format with clinical information obtained from the UCSC Xena database^[172]45, as detailed in Table [173]6. Table 6. Baseline table with LIHC patient characteristics. Characteristics Overall Pathologic stage, n (%) Stage I 171 (49.3%) Stage II 86 (24.8%) Stages III and IV 90 (25.9%) Sex, n (%) Female 121 (32.6%) Male 250 (67.4%) Age, n (%) ≤ 60 years 177 (47.8%) > 60 years 193 (52.2%) [174]Open in a new tab LIHC, Liver Hepatocellular Carcinoma. ICDRGs were sourced from the GeneCards database^[175]46 and existing literature. The GeneCards database provides extensive data on human genes. A search for “Immunogenic cell death” focusing on “Protein Coding” genes with a relevance score greater than 0 identified 63 ICDRGs. Similarly, 20 ICDRGs were obtained from the published literature^[176]47, using “Immunogenic cell death” as the search keywords in PubMed. After merging and removing duplicates, 79 ICDRGs were identified, as detailed in Table S1. CSRGs were collected from the GeneCards database and published literature. “Cellular senescence” was used as a search keyword in GeneCards. Only “Protein Coding” and CSRGs with relevance scores > 5 were retained, and 63 CSRGs were obtained. Additionally, a PubMed search with the same keyword yielded 279 CSRGs from the literature^[177]48. After removing duplicates, 310 CSRGs were compiled. The detailed information is shown in Table S2. A total of 324 ICDR&CSRGs were obtained by merging and deduplication of ICDRGs and CSRGs (Table S3). ICDR&CSRDEGs Using the DESeq2 R package (Version 1.42.0, [178]https://github.com/thelovelab/DESeq2)^[179]49, differential expression analysis was conducted on samples from TCGA-LIHC dataset, categorized into LIHC and control groups. DEGs were identified with the criteria of |logFC| > 3 and p.adj < 0.05.Genes were classified as upregulated if logFC > 3 and p.adj < 0.05 and as downregulated if logFC < -3 and p.adj < 0.05. To obtain ICDR&CSRDEGs associated with LIHC-expressed genes, TCGA-LIHC dataset was analyzed. Variance analysis was conducted on DEGs with absolute logFC > 3 and p.adj < 0.05. A Venn diagram was used to determine the overlap between these DEGs and ICDR&CSRGs, identifying the ICDR&CSRDEGs. Variance analysis results were visualized in a volcano plot using ggplot2 (Version 3.4.4, [180]https://github.com/tidyverse/ggplot2) and heatmap using pheatmap; RCircos (Version 1.2.2, [181]https://github.com/hzhanghenry/Rcircos) was used for chromosomal locations in R^[182]50. GO and KEGG pathway enrichment analyses of LIHC GO analysis^[183]51 is widely utilized for large-scale functional enrichment studies, encompassing biological processes, cellular components, and molecular functions. The KEGG^[184]52 database offers comprehensive data on genomes, biological pathways, diseases, and drugs. The R package clusterProfiler (Version 4.10.0, [185]https://bioconductor.org/packages/clusterProfiler/)^[186]53 conducted GO and KEGG pathway enrichment analyses on ICDR&CSRDEGs, deeming results statistically significant with a p.adj < 0.05 and q-value < 0.25, applying the Benjamini–Hochberg method for p-value correction. GSEA of LIHC GSEA^[187]54 shows gene distribution trends in predefined sets within a phenotype-correlated ranked gene expression table to assess their phenotypic contribution. Genes from TCGA-LIHC dataset were categorized based on their median log fold change values. GSEA was performed on the dataset using the clusterProfiler R package, configured with a random seed of 2022, and gene set size constraints of 10 to 500. Gene sets were sourced from the Molecular Signatures Database (MSigDB) using the c2.cp.all.v2022.1.Hs.symbols.gmt collection, which includes 3,050 canonical pathways. GSEA significance was evaluated using the Benjamini–Hochberg correction for p-values, with thresholds of p.adj < 0.05 and a false discovery rate (q-value) < 0.25. GSVA of LIHC GSVA^[188]55 is a nonparametric, unsupervised approach for evaluating gene set enrichment in microarray and transcriptomic datasets. It transforms a gene expression matrix from various samples into a gene set expression matrix, facilitating pathway enrichment analysis across samples. The GSVA R package (Version 1.50.0, [189]https://github.com/rcastelo/GSVA)^[190]55 was utilized to apply the MSigDB gene sets (h.all.v2023.1.Hs.symbols.gmt) to TCGA-LIHC dataset. We sought to assess the functional enrichment variations between the LIHC and control groups. The criteria for selecting significantly enriched gene sets in the GSVA included p.adj < 0.05 and the top 10 positive or negative logFC values. Analysis of SMs and CNVs The somatic mutation analysis for LIHC samples from TCGA-LIHC dataset used “Masked Somatic Mutation” data from TCGA website, preprocessed with VarScan software. The SM landscape was visualized using the R package maftools (Version 2.18.0, [191]https://github.com/PoisonAlien/maftools)^[192]56. TCGA-LIHC dataset’s “Masked Copy Number Segment” data from TCGA website was utilized to analyze CNVs in LIHC samples. The CNV segments were analyzed using GISTIC2.0^[193]57 with default parameter settings after downloading and processing. Development of a prognostic risk model for LIHC We utilized the R package survival (Version 3.5-7, [194]https://github.com/therneau/survival)^[195]58 to conduct univariate Cox regression analysis on clinical data, aiming to create a prognostic risk model for LIHC using TCGA-LIHC dataset. We evaluated the prognostic impact of ICDR&CSRDEGs to determine their independence as prognostic factors. Genes associated with the prognostic risk model were identified through multivariate Cox regression analysis of variables with a p-value < 0.10 from univariate Cox regression. A forest plot depicted the expression levels of the model genes using univariate and multivariate Cox regression analyses. The risk score was derived from multivariate Cox regression analysis coefficients using the following formula: graphic file with name d33e2865.gif Analysis of the prognostic risk model for LIHC LIHC samples were categorized into high- and low-risk groups using the median risk score from the prognostic model. Cox regression analyses, both univariate and multivariate, were conducted using the risk score alongside clinical variables including age, sex, and pathologic stage. Variables with a p-value less than 0.10 from the univariate analysis were subsequently evaluated using multivariate Cox regression. A forest plot was employed to illustrate the impact of the risk score and clinical variables from both analyses. A nomogram^[196]59, created with the R package rms (Version 6.7-1, [197]https://github.com/harrelfe/rms) and derived from multivariate Cox regression analysis, visually represents the relationships between independent variables using distinct line segments, highlighting the association between the risk score and clinical data. A Kaplan–Meier curve analysis^[198]60 was conducted to assess patient survival times, examining the relationship between survival outcomes and influencing factors, with the curve derived from the risk score evaluating patient survival time. The time-dependent ROC^[199]61 curve is a graphical tool for assessing model performance and selecting optimal models or thresholds. The timeROC (Version 0.4, [200]https://cran.r-project.org/web/packages/timeROC/index.html)^[201]6 2 R package was utilized to create the curve and compute the AUC for evaluating the prognostic accuracy of the risk score. AUC values, ranging from 0.5 to 1, signify improved diagnostic performance with high scores. An AUC greater than 0.5 indicates event promotion, with values between 0.5 and 0.7 reflecting low accuracy, 0.7 and 0.9 indicating moderate accuracy, and values above 0.9 representing high accuracy. Calibration curves were used to compare the observed outcomes with the model predictions and assess the predictive accuracy and discrimination ability of the model. A calibration analysis was conducted to illustrate the comparison. The model’s clinical utility was assessed through DCA^[202]63 using the ggDCA (Version 1.1, [203]https://cran.r-project.org/package=ggDCA)^[204]64 R package. The DCA diagram offers insights into the model’s accuracy and discrimination in predicting LIHC outcomes. Immune-cell infiltration analysis The ssGSEA^[205]65 evaluates the relative abundance of immune-cell infiltrates, identifying immune cells such as activated CD8^+ T cells, activated dendritic cells, gamma-delta T cells, natural killer cells, and Tregs. Enrichment scores from ssGSEA indicated the immune-cell infiltration levels in the samples. Samples with a p-value under 0.05 were utilized to construct the immune-cell infiltration matrix. The R package ggplot2 was used to generate plots comparing groups in the ssGSEA, highlighting differential expression between LIHC and control samples. Using the R package pheatmap (Version 1.0.12, [206]https://cran.r-project.org/web/packages/pheatmap/index.html), heatmaps were created to illustrate the correlation between immune cells and the association of prognostic risk model-related genes with ssGSEA immune cells in both LIHC and control samples. Correlation coefficients (r-values) were categorized as follows: below 0.3 indicates weak or no correlation, 0.3 to 0.5 signifies weak correlation, 0.5 to 0.8 represents moderate correlation, and above 0.8 denotes strong correlation. Analysis of differential gene expression in prognostic risk models We evaluated the expression differences of prognostic risk model-related genes between high- and low-risk groups in TCGA-LIHC dataset and LIHC samples by calculating risk scores for the combined datasets. The scores were calculated using risk coefficients from multivariate Cox regression analysis combined with the expression levels of the model genes. Samples were divided into high- and low-risk groups using the median risk score derived from the prognostic model formula. graphic file with name d33e2931.gif Comparison maps were generated for high- and low-risk groups using TCGA-LIHC and combined datasets, emphasizing gene expression levels associated with the prognostic risk model. We created group comparison maps to analyze differences in model gene expression between TCGA-LIHC dataset and the combined datasets for both LIHC and control samples. Statistical analysis We employed R software (version 4.3.1) for data processing and analysis. Continuous variables are expressed as mean ± standard deviation. Group comparisons were conducted using the Wilcoxon rank-sum test, while Spearman’s correlation analysis evaluated molecular correlations, with statistical significance defined as p < 0.05. Acknowledgements