Abstract

Background and aim

Pancreatic cancer (PC) is one of the most common tumors and has a poor prognosis. The current American Joint Committee on Cancer (AJCC) staging system, which is based on the anatomical features of tumors, is insufficient to predict PC outcomes. This study aimed to identify important prognosis-related genes and build an effective predictive model.

Methods

Multiple public datasets were used to identify differentially expressed genes (DEGs) and survival-related genes (SRGs). Bioinformatics analysis of the DEGs was used to identify the main biological processes and pathways involved in PC. A risk score based on the SRGs was computed through a univariate Cox regression analysis. The performance of the risk score in predicting PC prognosis was evaluated with survival analysis, Harrell’s concordance index (C-index), area under the curve (AUC), and calibration plots. A predictive nomogram was built by integrating the risk score with clinicopathological information.

Results

A total of 945 DEGs were identified in five Gene Expression Omnibus datasets, and four SRGs (LYRM1, KNTC1, IGF2BP2, and CDC6) were significantly associated with PC progression and prognosis in four datasets. The risk score showed relatively good performance in predicting prognosis in multiple datasets. The predictive nomogram had greater C-index and AUC values than the AJCC stage and the risk score.

Conclusion

This study identified four new biomarkers that are significantly associated with the carcinogenesis, progression, and prognosis of PC, which may be helpful in studying the underlying mechanism of PC carcinogenesis. The predictive nomogram showed robust performance in predicting PC prognosis. Therefore, the current model may provide an effective and reliable guide for prognosis assessment and treatment decision-making in the clinic.
Keywords: risk score, nomogram, TCGA, GEO

Introduction

Pancreatic cancer (PC) is one of the most common tumors, a leading cause of cancer-related death worldwide, and has a very poor prognosis.^1 Currently, the American Joint Committee on Cancer (AJCC) staging system remains the most widely used predictive model for PC. The system was designed to provide a guide for prognosis assessment and therapeutic decisions.^2 However, the AJCC staging system assesses only the three basic indicators of anatomic spread (the extent of the tumor, the extent of spread to the lymph nodes, and the presence of metastasis) and cannot comprehensively characterize tumor behavior.^3 In fact, PC patients with the same AJCC stage may have different clinical prognoses after receiving the same treatments. Therefore, the current predictive system is not sufficient to predict the outcomes of patients with PC, and refinement is necessary.

Over the past few decades, great efforts have been made to identify the molecular markers of cancer. The importance of gene signatures in the initiation, progression, and prognosis of tumors has been shown in many studies.^4–11 Thousands of genes can be studied simultaneously with next-generation sequencing and novel microarray technologies, which facilitates the investigation of the interaction between gene signatures and tumors.^12,13 Therefore, an increasing number of researchers are interested in using gene signatures for the risk stratification of patients.^14 To the best of our knowledge, only two studies to date have used gene expression signatures to build predictive models for PC.^15,16 Both studies assessed the power of their prognostic models in a single dataset, and neither model was constructed based on both clinicopathological factors and gene signatures.
In the current study, we sought to identify important prognosis-related genes through a multi-dataset analysis and to build composite predictive models for PC that are more applicable in guiding prognostic assessment and treatment decision-making.

Materials and methods

Gene Expression Omnibus (GEO) datasets

We searched for and downloaded mRNA expression profiling data series concerning PC from the GEO (https://www.ncbi.nlm.nih.gov/geo/) using the following keywords: “pancreatic cancer” and “pancreatic ductal adenocarcinoma.” The “Organism” parameter was limited to “Homo sapiens,” and the “study type” parameter was set to “Expression profiling by array.” Ineligible studies were excluded using the following criteria: 1) studies with fewer than 15 PC samples or non-tumor pancreatic samples; 2) studies using only PC cell lines or xenografts; 3) studies analyzing only blood samples or tumor samples; and 4) studies analyzing only pancreatic endocrine tumors. Finally, five PC datasets (GSE15471, GSE16515, GSE28735, GSE62452, and GSE71729) were selected for further analysis. Probes were matched with gene names in accordance with the annotation file provided by the manufacturer. If multiple probes matched a single gene, their arithmetic mean was taken as the expression level of that gene. The expression data were log2 transformed.

The Cancer Genome Atlas (TCGA) dataset

Transcriptome data (fragments per kilobase of exon per million fragments) and the corresponding PC clinical information were obtained from TCGA (https://cancergenome.nih.gov/). After removing patients who died within 3 months and patients without gene expression information, 172 patients with corresponding survival information were retained. Genes expressed in over 80% of samples were retained, and the zero values in the expression matrix were replaced with the minimum non-zero value of the corresponding gene.
Then the expression data were log2 transformed.

Identification of differentially expressed genes (DEGs) and bioinformatics analysis

The Significance Analysis of Microarrays (SAM) algorithm was used to identify genes that were differentially expressed between tumor and non-tumor samples via BRB-Array Tools (https://linus.nci.nih.gov/BRB-ArrayTools). A false discovery rate of <0.005 was set as the cutoff criterion.^17 DEGs (including downregulated and upregulated genes) shared by the five GEO datasets were selected through overlapping analysis, and functional annotation and pathway enrichment analyses were then performed using DAVID software (https://david.ncif-crf.gov/). A protein–protein interaction (PPI) network was established for the DEGs using the Search Tool for the Retrieval of Interacting Genes (STRING; https://string-db.org/) and visualized using Cytoscape 3.6.0.

Identification of potential prognostic genes

The expression values of the DEGs in the GSE28735, GSE62452, GSE71729, and TCGA datasets were analyzed with a univariate Cox proportional hazard regression model. Genes significantly associated with overall survival (OS) in all of these datasets were identified as survival-related genes (SRGs), with a P-value <0.05 set as the cutoff criterion. Correlation analyses and survival analyses were performed to assess the importance of the SRGs in PC progression and prognosis. A risk score was computed for each dataset as the sum of each SRG's expression value multiplied by the corresponding coefficient from a univariate Cox regression model (the TCGA dataset served as the training cohort for the risk score, and the GEO datasets served as external validation cohorts).
The performance of the risk score in predicting OS was evaluated through a survival analysis, Harrell’s concordance index (C-index),^18 the area under the curve (AUC) of the receiver operating characteristic (ROC) curve,^19 and a calibration plot comparing predicted vs observed Kaplan–Meier estimates of survival probability.^20

Development, comparison, and validation of predictive nomogram

In the TCGA dataset, a predictive nomogram was built on the basis of the risk score and clinicopathological information using a backward stepwise Cox proportional hazard model.^21 The calibration of the nomogram was assessed using a calibration plot comparing nomogram-predicted vs observed Kaplan–Meier estimates of survival probability, with 1,000 bootstrap resamples.^20 We compared the discriminative ability of the nomogram with that of the AJCC stage through the C-index and AUC.^18 In addition, based on the total points in the nomogram, patients in the TCGA dataset were stratified into three subgroups: a low-risk group (total points <33.3%), a medium-risk group (total points between 33.3% and 66.6%), and a high-risk group (total points >66.6%). Survival curves for these subgroups were estimated using the Kaplan–Meier method.

Statistical analysis

SAM analysis was performed using BRB-Array Tools. All other statistical analyses were completed using R (https://www.r-project.org/, v3.3.4). A P-value <0.05 (two-sided) was considered to indicate statistical significance. A chi-square or Fisher’s exact test was used to assess differences in categorical variables. Student’s t-test or the non-parametric Mann–Whitney U-test was used to detect differences in continuous variables between two groups. ANOVA or the Kruskal–Wallis test was used to detect differences in continuous variables among multiple groups. Differences in OS were assessed using the log-rank test. Hazard ratios (HRs) and 95% CIs were estimated using a Cox regression model.
Box plots were constructed using the R package “ggplot2.”^22 The ROC curve was plotted using the R package “pROC.”^23 A heat-map was plotted using the R package “gplots.”^24 The survival analysis and Cox proportional hazard regression analysis were carried out using the R package “survival.”^25 The C-index and nomogram were computed using the R package “rms.”^26

Ethics statement

All datasets (GSE15471, GSE16515, GSE28735, GSE62452, GSE71729, and TCGA) are freely available as public resources. Therefore, additional approval by an ethics committee was not needed for this study.

Results

Identification of DEGs

A total of 9,886, 3,961, 2,276, 3,732, and 1,605 genes differentially expressed between tumor and non-tumor tissues were identified after the SAM analysis of the GSE15471, GSE16515, GSE28735, GSE62452, and GSE71729 datasets, respectively (Figure S1A–E). A total of 945 DEGs were shared by the five GEO datasets according to the overlapping analysis (Figure 1A and B; Table S1), including 389 downregulated genes and 556 upregulated genes in tumor samples compared with non-tumor samples. Hierarchical clustering analysis showed distinct expression patterns of the 945 DEGs in the five GEO datasets (Figure S2A–E).

Figure 1. DEGs in five GEO datasets.
Notes: The figure shows 389 downregulated (A) and 556 upregulated (B) genes in PC samples. (C) GO biological process analysis for the DEGs. (D) KEGG pathway enrichment analysis for the DEGs. Set size refers to the number of genes differentially expressed between tumor and non-tumor samples in different GEO datasets.
Abbreviations: DEGs, differentially expressed genes; GO, Gene ontology; KEGG, Kyoto Encyclopedia of Genes and Genomes; PC, pancreatic cancer.
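The overlapping analysis that yields the shared DEGs can be pictured as a per-direction set intersection across datasets. The sketch below is only an illustration under assumptions: the function name `overlap_degs`, the "up"/"down" labels, and the toy gene calls (including `GENE_X` and `GENE_Y`) are hypothetical, and the requirement that a gene change in the same direction in every dataset is inferred from the separate down- and upregulated panels in Figure 1A and B rather than stated explicitly in the methods.

```python
from typing import Dict, Set


def overlap_degs(per_dataset: Dict[str, Dict[str, str]]) -> Dict[str, Set[str]]:
    """Return genes called differentially expressed in every dataset.

    per_dataset maps a dataset name to {gene: "up" or "down"}.
    A gene is kept only if it has the same direction in all datasets
    (an assumption of this sketch, not a confirmed detail of the study).
    """
    calls = list(per_dataset.values())
    if not calls:
        return {"up": set(), "down": set()}
    shared: Dict[str, Set[str]] = {}
    for direction in ("up", "down"):
        # One set of genes per dataset for this direction, then intersect.
        per_direction = [
            {gene for gene, d in dataset.items() if d == direction}
            for dataset in calls
        ]
        shared[direction] = set.intersection(*per_direction)
    return shared


# Hypothetical toy calls; the real inputs would be the SAM results per dataset.
toy_calls = {
    "GSE15471": {"CDC6": "up", "KNTC1": "up", "GENE_X": "down", "GENE_Y": "up"},
    "GSE16515": {"CDC6": "up", "KNTC1": "up", "GENE_X": "down", "GENE_Y": "down"},
    "GSE28735": {"CDC6": "up", "GENE_X": "down", "GENE_Y": "up"},
}
shared_degs = overlap_degs(toy_calls)
print(sorted(shared_degs["up"]), sorted(shared_degs["down"]))
```

In this toy input, `KNTC1` drops out because it is absent from one dataset and `GENE_Y` drops out because its direction is inconsistent, which is the behavior one would want if direction consistency is indeed required.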
Functional annotation analysis, pathway enrichment analysis, and PPI network for DEGs

In the Gene ontology (GO) biological process analysis, the 945 DEGs were principally enriched in zinc II ion transmembrane import, wound healing, regulation of lipid catabolic process, regulation of fibroblast migration, positive regulation of synapse assembly, positive regulation of cell growth, and other biological processes (Figure 1C). Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis showed that the DEGs were mainly associated with Salmonella infection, pyruvate metabolism, proteoglycans in cancer, the PI3K-Akt signaling pathway, pathways in cancer, pancreatic secretion, the p53 signaling pathway, and other biological pathways (Figure 1D). A PPI network was constructed to evaluate the interactive relationships among the DEGs (Figure S3).

Identification of SRGs and the correlation of SRGs with clinicopathological information

Among the 945 DEGs, a total of 64, 190, 136, and 596 genes associated with OS were identified in the GSE28735, GSE62452, GSE71729, and TCGA datasets, respectively. Overlapping analysis identified four SRGs (LYRM1, KNTC1, IGF2BP2, and CDC6) shared by the four datasets (Figure 2A).

Figure 2. Relationship between SRGs and clinicopathological information.
Notes: (A) The Venn diagram shows four SRGs in four datasets. (B) Relationship between SRGs and tissue types. (C, D) Relationship between SRGs and histological grade. (E) Relationship between SRGs and PT. (F) Relationship between SRGs and tumor subtype. Other, including neuroendocrine carcinoma, colloid carcinomas, acinar cell carcinoma, and adenocarcinoma not otherwise specified.
Abbreviations: M, metastatic samples; N, normal samples; PDAC, pancreatic ductal adenocarcinoma; PT, the extent of the tumor; SRGs, survival-related genes; T, tumor samples.
Correlation analysis was performed to determine the association between the expression levels of the SRGs and clinicopathological information, including tissue types (normal, tumor, and metastatic samples) (Figure 2B), histological grade (Figure 2C and D), the extent of the tumor (PT) (Figure 2E), tumor subtype (Figure 2F), AJCC stage (Figure S4A), tumor site (Figure S4B), and the extent of spread to the lymph nodes (Figure S4C). Among the SRGs, KNTC1, IGF2BP2, and CDC6 were significantly associated with tissue types, histological grade, PT, and tumor subtype (P<0.05); LYRM1 was significantly differentially expressed among normal, tumor, and metastatic tissues (P<0.05). In addition, the four SRGs were analyzed using X-tile to select the best cutoff values for OS, and on this basis, patients were divided into low- and high-expression groups. Kaplan–Meier survival analysis showed that all SRGs were significantly correlated with patient OS (P<0.05) in the four datasets (Figure 3A–D).

Figure 3. Survival analysis of SRGs in four datasets.
Notes: Survival curves of LYRM1 (A[1–4]), KNTC1 (B[1–4]), IGF2BP2 (C[1–4]), and CDC6 (D[1–4]) in the GSE28735, GSE62452, GSE71729, and TCGA datasets.
Abbreviations: SRGs, survival-related genes; TCGA, The Cancer Genome Atlas.

Collectively, these results indicate that the identified SRGs play important roles in the development and progression of PC.

Performance assessment of risk score in predicting outcome

As described previously, the risk score was computed as the sum of each gene expression value multiplied by the corresponding coefficient (coefficients were obtained from the TCGA dataset through a univariate Cox analysis):

Risk score = (−0.4705 × expression value of LYRM1) + (0.3707 × expression value of KNTC1) + (0.4106 × expression value of IGF2BP2) + (0.4623 × expression value of CDC6).
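As a worked illustration of the formula above, the following minimal sketch computes the risk score from the four published coefficients and then splits a cohort at the median score, as the study does for its low- and high-risk groups. The patient identifiers and log2 expression values are hypothetical, the helper names `risk_score` and `split_by_median` are illustrative, and assigning scores equal to the median to the low-risk group is an assumption of this sketch.

```python
from statistics import median
from typing import Dict

# Coefficients as reported in the text (univariate Cox analysis, TCGA dataset).
COEFFICIENTS = {
    "LYRM1": -0.4705,
    "KNTC1": 0.3707,
    "IGF2BP2": 0.4106,
    "CDC6": 0.4623,
}


def risk_score(expression: Dict[str, float]) -> float:
    """Sum of coefficient x log2 expression over the four SRGs."""
    return sum(coef * expression[gene] for gene, coef in COEFFICIENTS.items())


def split_by_median(scores: Dict[str, float]) -> Dict[str, str]:
    """Label each patient 'high' if the score exceeds the cohort median, else 'low'."""
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


# Hypothetical log2 expression values for four made-up patients.
patients = {
    "P1": {"LYRM1": 3.0, "KNTC1": 1.0, "IGF2BP2": 1.0, "CDC6": 1.0},
    "P2": {"LYRM1": 2.0, "KNTC1": 2.0, "IGF2BP2": 2.0, "CDC6": 2.0},
    "P3": {"LYRM1": 1.0, "KNTC1": 3.0, "IGF2BP2": 3.0, "CDC6": 3.0},
    "P4": {"LYRM1": 2.0, "KNTC1": 1.0, "IGF2BP2": 2.0, "CDC6": 1.0},
}
scores = {pid: risk_score(expr) for pid, expr in patients.items()}
groups = split_by_median(scores)
for pid in sorted(scores):
    print(pid, round(scores[pid], 4), groups[pid])
```

Because LYRM1 carries the only negative coefficient, higher LYRM1 expression lowers the score while higher expression of the other three genes raises it, which matches the direction of the coefficients reported in the text.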
Then, we stratified patients into low- and high-risk groups in accordance with the median risk scores in the GSE28735, GSE62452, GSE71729, and TCGA datasets. The Kaplan–Meier survival curves of the two groups were notably different in the four datasets (P<0.05) (Figure 4A[1]–D[1]).

Figure 4. Performance of risk score in predicting prognosis in four datasets.
Notes: Survival curves, AUC, and calibration plots for risk score in TCGA (A[1–3]), GSE71729 (B[1–3]), GSE62452 (C[1–3]), and GSE28735 (D[1–3]).
Abbreviations: AUC, area under the curve; TCGA, The Cancer Genome Atlas.

The power of the risk score in predicting OS was assessed through C-index and ROC analysis. The C-index of the risk score in the TCGA, GSE71729, GSE62452, and GSE28735 datasets was 0.640 (95% CI, 0.572–0.708), 0.601 (95% CI, 0.531–0.671), 0.648 (95% CI, 0.558–0.738), and 0.689 (95% CI, 0.573–0.805), respectively (Figure S5). The ROC analysis of the risk score is shown in Figure 4A[2]–D[2], and all AUC values at the 3-year point in the four datasets were greater than 0.70. In addition, relatively good agreement was observed between the expected and observed outcomes for 1-, 2-, and 3-year OS in the calibration curves of the risk score (Figure 4A[3]–D[3]). In summary, these results indicate that the risk score shows relatively good performance in predicting the OS of PC patients.

Assessment of prognostic factors in PC patients

After removing patients for whom important clinical information was not available (including age, sex, malignancy history, diabetes history, pancreatitis history, tumor size, tumor site, tumor subtype, histological grade, residual tumor, AJCC stage, radiation treatment, and targeted therapy), 95 patients in the TCGA dataset were retained. Univariate and multivariate adjusted Cox regression analyses were performed to identify prognostic factors for OS.
As shown in Table 1, the unadjusted univariate analysis indicated that risk score (P<0.001), age (P=0.013), tumor size (P=0.022), tumor subtype (P=0.001), histological grade (P=0.016, G3 and G4 vs G1), AJCC stage (P=0.002 [IIB vs I], P=0.006 [III and IV vs I]), radiation treatment (P=0.014), and targeted therapy (P=0.014) were significantly associated with OS, while the multivariate adjusted Cox regression analysis showed that risk score, age, tumor size, tumor subtype, radiation treatment, and targeted therapy were significant independent prognostic factors (P<0.05).

Table 1. Cox regression analysis of risk factors associated with overall survival in the TCGA dataset
_______________________________________________________________________________________________________________________
                      Unadjusted                       Adjusted 1^a                     Adjusted 2^b
Variables             HR (95% CI)             P-value  HR (95% CI)             P-value  HR (95% CI)             P-value
_______________________________________________________________________________________________________________________
Risk score            1.667 (1.261–2.205)     <0.001   1.539 (1.098–2.157)     0.012    1.524 (1.082–2.146)     0.016
Age                   1.035 (1.007–1.063)     0.013    1.041 (1.008–1.076)     0.015    1.043 (1.012–1.075)     0.006
Sex
  Female
  Male                1.045 (0.608–1.797)     0.874    0.694 (0.355–1.358)     0.286
Malignancy history
  No
  Yes                 0.993 (0.421–2.341)     0.988    2.046 (0.737–5.686)     0.170
Diabetes history
  No
  Yes                 1.019 (0.534–1.945)     0.954    0.776 (0.373–1.616)     0.498
Pancreatitis history
  No
  Yes                 1.006 (0.429–2.361)     0.988    1.469 (0.533–4.054)     0.458
Tumor size            1.225 (1.030–1.458)     0.022    1.394 (1.120–1.735)     0.003    1.284 (1.051–1.569)     0.014
Tumor site
  Body
  Head                4.064 (0.982–16.830)    0.053    1.439 (0.268–7.730)     0.672
  Tail                2.301 (0.420–12.620)    0.337    0.852 (0.128–5.680)     0.869
Tumor subtype
  Others^c
  PDAC                6.849 (2.103–22.300)    0.001    5.760 (1.436–23.109)    0.014    4.412 (1.226–15.886)    0.023
Grade
  G1
  G2                  2.207 (0.832–5.857)     0.111    0.651 (0.189–2.240)     0.496    0.773 (0.254–2.351)     0.650
  G3 and G4           3.427 (1.263–9.294)     0.016    1.140 (0.304–4.272)     0.846    1.040 (0.317–3.414)     0.949
Residual tumor
  R0
  R1                  1.724 (0.994–2.992)     0.053    1.889 (0.973–3.670)     0.060
AJCC stage
  I
  IIA                 1.964 (0.546–7.069)     0.302    0.996 (0.194–5.119)     0.996    1.180 (0.310–4.485)     0.808
  IIB                 5.185 (1.820–14.771)    0.002    1.125 (0.255–4.962)     0.877    1.756 (0.579–5.326)     0.320
  III and IV          11.391 (1.977–65.628)   0.006    5.353 (0.502–57.097)    0.165    8.755 (1.292–59.344)    0.026
Radiation treatment
  No
  Yes                 0.387 (0.182–0.822)     0.014    0.378 (0.160–0.896)     0.027    0.411 (0.175–0.964)     0.041
Targeted therapy
  No
  Yes                 0.506 (0.295–0.869)     0.014    0.317 (0.157–0.640)     0.001    0.403 (0.214–0.759)     0.005
_______________________________________________________________________________________________________________________
Notes: ^a Adjusted covariates include all the indicators above; ^b adjusted covariates include the prognostic factors from the unadjusted Cox analysis; ^c including neuroendocrine carcinoma, colloid carcinomas, acinar cell carcinoma, and adenocarcinoma not otherwise specified. Bold number indicates statistical significance.
Abbreviations: AJCC, American Joint Committee on Cancer; PDAC, pancreatic ductal adenocarcinoma; TCGA, The Cancer Genome Atlas.

Development, comparison, and validation of predictive nomogram

To build a more applicable and individualized predictive model, a predictive nomogram integrating clinical information and gene signatures was constructed based on the 95 patients with complete clinical information in TCGA. Through a stepwise Cox proportional hazard analysis, risk score, age, sex, tumor subtype, tumor size, residual tumor, radiation treatment, and targeted therapy were selected to establish the nomogram model (Figure 5A). The calibration plot for predicting 1-, 2-, and 3-year OS (Figure 5B) showed that the nomogram predictions agreed well with the ideal prediction model.

Figure 5. Performance of the nomogram in predicting prognosis in the TCGA dataset.
Notes: (A) Nomogram for predicting 1-, 2-, and 3-year OS in PC patients. (B) Calibration plot for 1-, 2-, and 3-year OS of the nomogram.
(C) Comparison of the predictive power of the nomogram model, AJCC stage, and risk score, as assessed using C-index. (D, E) Comparison of the predictive power of the nomogram model, AJCC stage, and risk score by AUC at 1 and 2 years. (F) Kaplan–Meier analysis of risk groups stratified using the total points of the proposed nomogram. Other, including neuroendocrine carcinoma, colloid carcinomas, acinar cell carcinoma, and adenocarcinoma not otherwise specified; vertical bars, 95% CI.
Abbreviations: AJCC, the American Joint Committee on Cancer; AUC, area under the curve; C-index, concordance index; OS, overall survival; PC, pancreatic cancer; PDAC, pancreatic ductal adenocarcinoma; TCGA, The Cancer Genome Atlas.

We compared the predictive power of the nomogram model, the AJCC stage, and the risk score. The C-index of the nomogram (Figure 5C) was 0.804 (95% CI, 0.740–0.868), which is significantly greater than that of the AJCC stage (0.609 [95% CI, 0.536–0.683], P<0.001) and the risk score (0.645 [95% CI, 0.558–0.732], P<0.001). The AUC of the nomogram at 1 year (Figure 5D) was 0.833 (95% CI, 0.731–0.935), which is superior to that of the AJCC stage (0.572 [95% CI, 0.464–0.680], P<0.001) and the risk score (0.707 [95% CI, 0.574–0.840], P=0.026). The AUC of the nomogram at 2 years (Figure 5E) was 0.888 (95% CI, 0.797–0.978), which is superior to that of the AJCC stage (0.757 [95% CI, 0.636–0.878], P=0.039) and the risk score (0.686 [95% CI, 0.543–0.829], P=0.005). In addition, based on the total points of the nomogram, we stratified patients into low-, medium-, and high-risk groups (cutoff points were selected at each tertile point). Kaplan–Meier analysis then revealed that scoring with the nomogram effectively discriminated the risk groups in PC (P<0.0001) (Figure 5F).
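To make the C-index values compared above more concrete, the sketch below shows one simplified way to compute Harrell's concordance index from survival times, event indicators, and a predicted risk. It is a plain-Python illustration, not the routine in the R package "rms" that the study used: the function name `harrell_c_index` and the toy data are hypothetical, pairs with tied survival times are simply skipped, and tied risk predictions count as half concordant.

```python
from typing import Sequence


def harrell_c_index(
    times: Sequence[float],
    events: Sequence[int],
    risk: Sequence[float],
) -> float:
    """Fraction of usable patient pairs whose predicted risks are correctly ordered.

    A pair (i, j) is usable when patient i had an observed event (events[i] == 1)
    at a strictly earlier time than patient j's event or censoring time.
    The pair is concordant when patient i also has the higher predicted risk.
    """
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                usable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5  # tied predictions get half credit
    if usable == 0:
        raise ValueError("no usable pairs; the C-index is undefined")
    return concordant / usable


# Hypothetical follow-up times (months), event flags (1 = death), and risk scores.
toy_times = [5.0, 10.0, 15.0, 20.0]
toy_events = [1, 1, 0, 1]
toy_risk = [0.9, 0.7, 0.8, 0.1]
c_index = harrell_c_index(toy_times, toy_events, toy_risk)
print(round(c_index, 3))
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which is the scale on which the nomogram (0.804), the AJCC stage (0.609), and the risk score (0.645) are compared in the text.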
Discussion

In the past few decades, large amounts of data have been generated via high-throughput methods, such as microarrays and next-generation sequencing technologies, which significantly facilitate investigations of the interaction between gene signatures and disease. Meanwhile, an increasing number of studies identify biomarkers through the analysis of multiple data sources, which often provides stronger evidence than a single data source. In the current study, to strengthen our results, we identified DEGs and SRGs in PC via a joint analysis of six different data sources.

Through GO biological process and KEGG analyses of the DEGs, the main biological processes and pathways involved in human PC were identified (Figure 1C and D). Many previous studies have reported that the PI3K-Akt and p53 signaling pathways play important roles in cell cycle arrest, cell invasion, proliferation, angiogenesis, and metastasis in PC, which is consistent with our results.^27–33 Therefore, the biological processes and pathways reported here are worth further study to increase our understanding of the mechanisms underlying carcinogenesis and progression in PC.

Survival analyses and correlation analyses indicated that the SRGs (LYRM1, KNTC1, IGF2BP2, and CDC6) were significantly associated with PC prognosis.
CDC6 is an essential gene required for DNA replication and has been reported to be overexpressed in various types of cancer.^34–36 High expression of CDC6 could trigger tumor-like transformation, apoptosis attenuation, genomic instability, cell proliferation, and epithelial-to-mesenchymal transition^37–39 and has been associated with poor prognosis in epithelial ovarian cancer.^37 CDC6 depletion could result in increased cell death and attenuate tumor migration and invasion.^35,40 IGF2BP2 is a post-transcriptional regulatory factor implicated in mRNA localization, stability, and translational control. In previous studies, IGF2BP2 has been confirmed to be upregulated in different cancer types^41–44 and is associated with tumor carcinogenesis, invasion, and prognosis.^43,45,46 Although the functions of Homo sapiens LYRM1 and KNTC1 have not yet been studied in cancer, these two genes have been reported to participate in the regulation of cell division, proliferation, and apoptosis,^47–49 which may affect tumor development and progression. However, the roles of LYRM1, KNTC1, IGF2BP2, and CDC6 in PC are still unclear, and further study of their underlying mechanisms in PC and potential therapeutic applications is warranted.

The current results demonstrated that the risk score based on the SRGs showed relatively good and consistent performance in predicting OS in PC patients in the TCGA dataset and the other three validation cohorts (the C-indexes of the risk score were greater than 0.60 and the AUC values at 3 years were greater than 0.70 in the four datasets). However, a predictive model based on gene signatures or clinicopathological information alone may be unable to comprehensively capture tumor behaviors and their underlying mechanisms. Therefore, a composite and more effective predictive model integrating clinical and gene information is needed.
To the best of our knowledge, a predictive nomogram for PC based on both clinical factors and gene signatures has not been previously reported. In the current study, we generated an effective prognostic nomogram by integrating clinical factors with the risk score in the TCGA dataset. Good agreement was observed in the calibration curve of our nomogram between the predicted and observed outcomes (Figure 5B). The nomogram demonstrated greater C-index and AUC values than those of the AJCC stage and risk score (Figure 5C–E). Therefore, our predictive nomogram may help clinicians predict the individual risk of patient death and provide guidance for patient assessment and therapeutic decision-making.

However, there are some limitations in the current study. First, we studied the roles of the SRGs through data mining only, and no experimental data on the molecular mechanisms of these genes in PC have been reported. Therefore, further experimental studies may enhance our understanding of the biological behavior of PC. Second, the nomogram was developed and validated in a single dataset, and therefore the performance of our model needs to be further validated in independent external datasets with complete gene and clinical information.

Conclusion

The current study identified four new biomarkers that are significantly associated with PC carcinogenesis, progression, and prognosis, which may be helpful in studying underlying carcinogenesis mechanisms and potential therapeutic applications in PC. The predictive nomogram showed robust performance in predicting PC prognosis. Therefore, our model may provide an effective and reliable guide to prognosis assessment and treatment decision-making in the clinic.

Acknowledgments