Abstract Background Hypoxia and lactate metabolism products are critical components of the tumor microenvironment in pancreatic cancer (PC), influencing tumor invasiveness, metastasis, and treatment resistance. This study aims to explore the role of hypoxia- and lactate metabolism-related genes (HLRGs) in predicting overall survival and guiding treatment for PC patients. Methods Gene expression and clinical data from PC patients were obtained from TCGA, ICGC, and GEO. Normal pancreatic tissue data were sourced from GTEx. Differential expression analysis was performed on the merged TCGA-PAAD and GTEx cohorts to identify differentially expressed genes (DEGs). We performed an intersection analysis between the DEGs and the HLRGs obtained from the MsigDB database to identify the DEGs associated with hypoxia and lactate metabolism in PC. A prognostic model was developed using random survival forests, Cox regression, and LASSO analysis in the TCGA-PAAD cohort. The model was externally validated in the ICGC-PACA and [34]GSE85916 cohorts. Risk stratification was performed, and the differences between subgroups in tumor mutational burden, immune microenvironment, and drug response were analyzed. RT-qPCR validated the key genes expression differences. Results A prognostic model based on HLRGs (SLC7A7, PYGL, HS3ST1, DDIT4, CYP27A1, ANKZF1, COL5A1) was established. High-risk patients exhibited worse prognosis, higher tumor mutational burden, and better response to anti-PD-L1 therapy, while low-risk patients exhibited higher immune infiltration and increased chemotherapy sensitivity. RT-qPCR confirmed that SLC7A7 and COL5A1 were upregulated, while ANKZF1 was downregulated in PC. Conclusions We developed an HLRGs-based prognostic model that predicts overall survival and guides treatment strategies, contributing to precision therapy in PC. Keywords: Pancreatic cancer, Hypoxia, Lactate metabolism, Prognosis, Tumor mutational burden, Tumor immune microenvironment, Drug sensitivity Introduction Pancreatic cancer (PC), the seventh leading cause of cancer-related mortality worldwide, is characterized by high heterogeneity and a dismal five-year survival rate of less than 8% [[35]1]. Over the past two decades, the incidence of PC has doubled and is projected to rise further due to multiple risk factors such as population aging, obesity, and smoking [[36]2]. Due to its insidious onset, lack of typical symptoms and effective screening methods, 80% of PC patients are diagnosed at advanced stage where they have already lost the opportunity for surgery, which is currently the only means of curative treatment for PC [[37]3]. Despite advancements in radiotherapy, chemotherapy, immunotherapy, and targeted therapy, the therapeutic benefits for PC patients remain limited compared to other cancers, likely due to the complex tumor microenvironment (TME) [[38]4]. Therefore, identifying a prognostic signature and developing individualized treatment strategies for PC are of significant clinical importance. Hypoxia is considered a common characteristic of various solid tumors, including PC, which may be associated with uncontrolled tumor tissue proliferation leading to substantial oxygen consumption and anemia induced by the tumor [[39]5]. It upregulates hypoxia-inducible factors (HIF), shifting cellular metabolism from oxidative phosphorylation to anaerobic glycolysis. HIF promotes tumor angiogenesis and epithelial-mesenchymal transition (EMT), playing a pivotal role in the malignant phenotype of pancreatic tumors [[40]6, [41]7]. Microenvironmental hypoxia is a determinant of the invasiveness, metastatic potential, and therapeutic resistance of solid tumors, while alleviating hypoxia may enhance patient sensitivity to immunotherapy by reshaping the TME [[42]8, [43]9]. Furthermore, under hypoxic conditions, pyruvate is predominantly converted to lactate via anaerobic glycolysis, leading to lactate accumulation [[44]5]. Lactate, a glycolysis byproduct, is associated with cancer recurrence and poor prognosis [[45]10]. Excessive lactate acidifies TME, promoting tumor invasion and metastasis [[46]11]. Additionally, lactate suppresses the anti-tumor activity of T cells and NK cells while inducing M0 macrophage polarization toward the pro-tumorigenic M2 phenotype, which is linked to PC aggressiveness [[47]12]. Notably, lactate stabilizes HIF-α through its conversion to pyruvate [[48]10], and its accumulation lowers extracellular pH, further exacerbating hypoxia in the TME [[49]5]. Current studies have analyzed hypoxia or lactate metabolism as separate microenvironmental factors in PC [[50]13, [51]14]. We propose that increased lactate production under hypoxic conditions further exacerbates hypoxia, creating a bidirectional relationship. These two factors together form an integral part of the PC tumor microenvironment. The objective of this study is to establish a prognostic model for PC based on hypoxia- and lactate metabolism-related genes (HLRGs) to predict overall survival (OS) of PC patients and to provide novel insights into personalized chemotherapy and immunotherapy for PC patients. Materials and methods Data acquisition and processing In total, we collected transcriptomic data and clinical profiles from 437 PC patients across three independent public datasets. Specifically, we obtained 175 samples from the TCGA-PAAD cohort via the TCGA database ([52]https://portal.gdc.cancer.gov), excluding patients with OS of less than 30 days to minimize the influence of surgical complications or incidental events. Additionally, we retrieved 80 PC samples from the [53]GSE85916 cohort through the GEO database ([54]https://www.ncbi.nlm.nih.gov/geo/) and 182 PC samples from the ICGC-PACA cohort via the ICGC database ([55]https://dcc.icgc.org/). Furthermore, we included 167 normal pancreatic tissue samples from the GTEx cohort, accessed through the UCSC database ([56]https://xenabrowser.net/datapages/). The data processing steps are as follows: (1) Selecting protein-coding genes (mRNA) from the expression matrix; (2) Transforming gene expression data into log2(COUNT + 1) format for differential expression analysis. Using the keywords “hypoxia” and “lactic” in the MsigDB database ([57]http://www.gsea-msigdb.org/gsea/downloads.jsp), we identified 200 hypoxia-related genes (HRGs) and 325 lactate metabolism-related genes (LMRGs). After merging these gene sets and removing duplicates, we compiled a final list of 518 HLRGs. Differential expression analysis and enrichment analysis We integrated the TCGA-PAAD data with GTEx data and employed the limma package in R to remove batch effects between the two datasets. We then performed principal component analysis (PCA) to assess sample clustering between the two groups. Differential expression analysis was conducted using the criteria |log2FC| > 1 and adj.P.Val < 0.05. The resulting differentially expressed genes (DEGs) were intersected with genes from the ICGC database to identify robust DEGs for further analysis. Finally, the DEGs were intersected with the HLRGs, and the common genes were considered as differentially expressed genes related to hypoxia and lactate metabolism (HLR-DEGs). Venn diagram, volcano plot, and heatmap were generated to visualize these genes. Based on the STRING database ([58]https://cn.string-db.org), we conducted a protein-protein interaction (PPI) network analysis for HLR-DEGs, with a correlation threshold set to medium confidence (0.40). The PPI network was visualized using Cytoscape software. Furthermore, functional enrichment analysis, including Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analyses, was performed using the clusterProfiler package. The enrichment results were visualized using the ggplot2 package. Construction of hypoxia and lactate metabolism prognostic signature in PAAD The TCGA-PAAD cohort (n = 175) was utilized as the training set, while the ICGC-PACA cohort (n = 182) and [59]GSE85916 cohort (n = 80) served as validation sets. Initially, we employed a machine learning approach with Random Survival Forests, incorporating VIMP and Minimum Depth values, to preliminarily identify 50 potential prognostic genes that positively contributed to the prognostic model. Subsequently, univariate COX regression analysis was conducted on these potential genes to select those that are associated with survival when considered as individual variables. Following this, Lasso regression analysis was applied to mitigate overfitting. Finally, multivariate Cox regression analysis was conducted on the Lasso regression results to obtain the HLR-DEGs associated with prognosis. The prognostic model was constructed using the survival package, with the risk score calculated according to the following formula: graphic file with name d33e408.gif In the formula, “n” denotes the number of genes, “coef” indicates the correlation coefficient corresponding to the genes in the model, and “expr” refers to the expression of the genes. Patients in the training set were classified into high-risk and low-risk groups based on the median risk score calculated from this model. Evaluation and validation of the prognostic model Survival analysis was performed on TCGA cohort samples based on risk stratification. The survminer package was used to generate survival curves, survival status scatter plots, and Kaplan-Meier risk curves to compare differences in survival between high-risk and low-risk groups. The timeROC package was employed to create ROC curves for 1, 2, and 3 years, and the predictive performance of the model was assessed by calculating the area under the curve (AUC). Additionally, samples from the ICGC and GEO cohorts were utilized as independent external validation sets. Using the same prognostic model formula, samples were classified into high- and low-risk groups, and similar visualizations were generated for validation analysis. We also evaluated whether clinical characteristics could serve as independent prognostic factors influencing the survival time of PC patients using univariate and multivariate COX regression analyses, as well as ROC curves. To further strengthen the predictive function of the model, we constructed a Nomogram using the rms package, incorporating clinical characteristics and risk scores to predict patient mortality at 1, 2, and 3 years. Calibration curves were then plotted to evaluate the accuracy of the Nomogram predictions. Tumor immune microenvironment analysis We utilized the CIBERSORT package to visualize the infiltration of 22 immune cell subtypes in the TCGA cohort samples and further analyzed the differences in immune cell infiltration between high- and low-risk groups. Subsequently, the ESTIMATE algorithm was employed to assess the differences in the content of immune cells and stromal cells in the TME between the two groups. The expression levels of immune checkpoint genes are closely related to the response to immunotherapy. Therefore, we evaluated the response to immunotherapy in the two groups by analyzing the differences in the expression of immune checkpoint genes and generated an immune checkpoint gene correlation circle plot [[60]15]. Finally, the relationship between key genes in the prognostic model and immune cell infiltration was visualized using Lollipop plots. Comprehensive mutation analysis in tumors After downloading the mutation data of the TCGA-PAAD cohort samples from the TCGA database and removing the samples with missing survival information in the cohort, the mutation data were analyzed and visualized using the maftools package. Initially, we analyzed the base mutation frequency and the correlation between mutated genes in the cohort samples. Subsequently, pathway enrichment analysis was conducted on the mutated genes to identify signaling pathways with higher mutation frequencies in PC patients. Additionally, waterfall plot were used to visualize the 20 genes with the highest mutation frequencies. Furthermore, to explore the relationship between prognostic model risk stratification and tumor mutation burden (TMB), separate waterfall plots were created for high- and low-risk groups. After calculating the TMB scores of the patients, we also divided them into high TMB and low TMB groups based on the median value and plotted KM curves combining the TMB scores and risk stratification. Finally, we explored the relationship between TMB scores and immune cell infiltration through scatter plots. Drug sensitivity prediction The Genomics of Drug Sensitivity in Cancer (GDSC) database provides IC50 values for various first-line chemotherapy drugs used in PC patients. The IC50 value, defined as the drug concentration needed to inhibit a biological process by 50%, is a key metric for assessing patient sensitivity to chemotherapy drugs. Using the oncoPredict package, we integrated the gene expression data from the TCGA cohort with the GDSC database to obtain the estimated IC50 values for different chemotherapeutic agents in the TCGA cohort samples. Subsequently, differences in drug sensitivity between high- and low-risk groups for various chemotherapy drugs were visualized using box plots. The Drug Gene Interaction Database (DGIdb) is a comprehensive resource that provides information on interactions between drugs and genes. In this study, we identified seven key genes associated with the prognosis of PC patients under the combined effects of hypoxia and lactate metabolism. Using DGIdb, we analyzed the drugs that interact with these seven genes, excluding inhibitors of protective genes and activators of risk genes. The potential drugs were visualized using a Sankey diagram. Cell culture and RT-qPCR Our group cultivated human normal pancreatic ductal epithelial cell lines (HPDE) and PC cell lines (PANC-1, MIA PaCa-2). All cell lines were cultured in DMEM (Hyclone) supplemented with 10% fetal bovine serum (FBS) and 1% penicillin-streptomycin (10,000 U/mL penicillin and 10,000 µg/mL streptomycin). The cell cultures were maintained in a 5% CO2 incubator set at 37℃. First, total RNA was extracted from human normal pancreatic ductal epithelial cells and PC cells using TRIzol reagent (Invitrogen, USA). Subsequently, the total RNA was reverse transcribed into cDNA using the PrimeScript RT reagent Kit (TaKaRa, Japan). A 2-µl volume of cDNA was then mixed with primers and SYBR Green PCR Master Mix (TaKaRa, Japan) for RT-qPCR reactions. The relative mRNA expression levels were calculated using the 2^-ΔΔCt method. The primers for the aforementioned genes were synthesized by Genewiz (Tianjin, China), and the primer sequences are as follows: SLC7A7 (Forward: 5’-CTCACTGCTTAACGGCGTGT-3’, Reverse: 5’-CCAGTTCCGCATAACAAAGG-3’), COL5A1(Forward: 5’-GCCCGGATGTCGCTTACAG-3’,Reverse: 5’-AAATGCAGACGCAGGGTACAG-3’), ANKZF1Forward: 5’-CGGTTTAACCTAAAGCAACGTCT-3’,Reverse: 5’-CCGATCCAGTGTCTGCAAGT-3’). Results Identification of HLR-DEGs The flowchart of this study is shown in Fig. [61]1. The PCA revealed distinct clustering patterns between normal pancreatic tissues and PC samples in the TCGA and GTEx cohorts. (Fig. [62]2A). Through differential expression analysis, we identified 6,632 DEGs. After intersecting with genes from the ICGC database, we narrowed down to 2935 DEGs. Further intersection with 518 HLRGs yielded 124 HLR-DEGs, as visualized in the Venn diagram (Fig. [63]2B). The expression profiles of these HLR-DEGs in PC patients were illustrated using a volcano plot and a heatmap (Fig. [64]2C–D). Fig. 1. [65]Fig. 1 [66]Open in a new tab Flowchart of this study Fig. 2. [67]Fig. 2 [68]Open in a new tab Identification of differentially expressed genes related to hypoxia and lactate metabolism (HLR-DEGs) for pancreatic cancer. A Principal component analysis reveals significant gene expression differences between TCGA and GTEx cohort samples. B By obtaining genes common to hypoxia- and lactate metabolism-related genes and differentially expressed genes, the Venn diagram shows 124 HLR-DEGs. C Volcano plot of HLR-DEGs. Red represents genes that are relatively highly expressed in pancreatic cancer cell lines, while green represents the opposite. The 7 labeled genes are key genes for the prognostic model screened by machine learning and regression analysis in this study. D The heatmap shows the expression differences of HLR-DEGs between tumor tissues and normal tissues. The Control group contained 167 samples from the GTEx cohort and 4 peritumoral tissue samples from the TCGA cohort, and the Treatment group contained 171 tumor samples from the TCGA cohort Enrichment analysis and PPI network analysis PPI network analysis identified GAPDH and HIF1A as the hub proteins in the network (Fig. [69]3A). GO enrichment analysis revealed that the genes were primarily enriched in ATP metabolic process, oxidoreductase complex, and NAD(P)H dehydrogenase (quinone) activity, suggesting a strong association between HLR-DEGs and tumor energy metabolism and redox balance (Fig. [70]3B). Furthermore, KEGG enrichment analysis demonstrated that the genes were predominantly enriched in the HIF-1 signaling pathway (Fig. [71]3C). Fig. 3. [72]Fig. 3 [73]Open in a new tab Protein-protein interaction network analysis and GO, KEGG enrichment analysis. A Protein-protein interaction network of hub gene in string database: protein interaction relationship. B GO enrichment analysis shows the top5 relevant pathways in Biological Process (BP), Cellular Component (CC), and Molecular Function (MF), respectively. C 20 significantly enriched KEGG pathways Construction of a prognostic model related to HLRGs Using the random survival forest method, we identified 50 potential prognostic-related genes (Fig. [74]4A). Subsequently, univariate COX regression analysis was conducted on these potential genes, resulting in the identification of 34 genes that were significantly associated with survival when considered as individual variables (Fig. [75]4B). Lasso regression analysis was then conducted to prevent model overfitting, resulting in the selection of 14 genes(Fig. [76]4C–D). Finally, multivariate COX regression analysis led to the selection of 7 HLRGs for constructing the prognostic model (Fig. [77]4E). Among these, ANKZF1, DDIT4, COL5A1, and HS3ST1 were classified as HRGs, while PYGL, CYP27A1, and SLC7A7 were identified as LMRGs. The risk score for the prognostic model was calculated using the following formula: Risk Score = (0.6909531 × SLC7A7) + (0.5335105 × PYGL) + (0.3597579 × DDIT4) + (0.2648033 × HS3ST1) + (– 0.6619 × CYP27A1) + (– 0.6435226 × ANKZF1) + (– 0.1975346 × COL5A1). In the prognostic model, SLC7A7, PYGL, DDIT4, and HS3ST1 were identified as risk factors, while CYP27A1, ANKZF1, and COL5A1 were identified as protective factors. Fig. 4. [78]Fig. 4 [79]Open in a new tab Identifying hub genes for constructing prognostic models. A Random Survival Forest algorithm selected 50 potential prognostic-related genes. B Univariate Cox regression analysis identified 34 genes associated with prognosis. C–D Lasso regression analysis was employed to prevent overfitting of the model. E Multivariate Cox regression analysis ultimately identified 7 hub genes that could serve as independent prognostic factors Prognostic model validation and exploration of clinical utility The median risk score of the training set samples (0.9828304) was employed as the cutoff value to stratify patients into two groups: the low-risk group (N = 88) and the high-risk group (N = 87) (Fig. [80]5A). Scatter plots of survival status were generated for the training set, demonstrating significant survival differences between the high-risk and low-risk groups (Fig. [81]5D). Kaplan-Meier analysis showed that the low-risk group had significantly higher OS than the high-risk group (P < 0.0001) (Fig. [82]5G). Time-dependent ROC curves yielded AUC of 0.78, 0.82, and 0.82 at 1, 2, and 3 years, respectively, confirming the model’s effectiveness in predicting OS (Fig. [83]5J). Fig. 5. [84]Fig. 5 [85]Open in a new tab Construction and validation of prognostic models A Patients in TCGA-PAAD were stratified into high-risk and low-risk groups based on the median value of the risk score. D Scatterplot of the survival status of patients in the high-risk and low-risk groups. G KM survival curve to assess survival differences between patients in the high-risk and low-risk groups in the TCGA Cohort. J Accuracy of model predictions assessed by ROC curves in the TCGA cohort. B, E, H, K In the validation cohort of the ICGC-PADA, risk scores were calculated and patients were grouped using the prognostic modeling formulas, and survival status scatterplots, KM curves, and ROC curves were plotted. C, F, I, L Similarly, in the validation cohort of the [86]GSE85916, scatter plot of survival status, KM survival curve, and ROC curves were plotted after grouping using the prognostic model The samples in the two validation cohorts, ICGC-PACA and [87]GSE85916, were also stratified into high-risk and low-risk groups according to the median risk score (0.9828304) (Fig. [88]5B–C). In the validation cohorts, there were also survival differences between the high-risk and low-risk groups (Fig. [89]5E–F). The KM curves showed that patients in the low-risk group likewise had a higher OS than those in the high-risk group (Fig. [90]5H-I). The AUC for the ICGC cohort at 1, 2, and 3 years were 0.61, 0.60, and 0.66, and the AUC for the [91]GSE85916 cohort were 0.74, 0.68, and 0.73, respectively (Fig. [92]5K–L). In the TCGA cohort, univariate COX regression analysis showed that risk score and lymph node metastasis could be used as prognostic factors for patients with pancreatic tumors (Fig. [93]6A). Multivariate COX regression analysis further confirmed that risk score could be used as an independent prognostic factor in PC (Fig.[94]6B). The ROC curve showed an AUC of 0.887 for the risk score of the prognostic model, which had the highest sensitivity and specificity among all clinical features, further demonstrating the accuracy and validity of the prognostic model in predicting the survival of pancreatic cancer patients (Fig. [95]6C). Additionally, we developed a Nomogram incorporating clinical characteristics and risk score to predict the survival time of PC patients (Fig. [96]6E). This Nomogram provides individualized predictive information for patients, exemplified by the sixth patient in the training set (TCGA-HZ-8315-01). The calibration curve of the Nomogram showed concordance between the predicted mortality and actual mortality of patients (Fig. [97]6D). Fig. 6. [98]Fig. 6 [99]Open in a new tab Prognostic correlation analysis combining clinical features and construction of nomgram. A Univariate Cox regression analysis results of clinical characteristics and risk score. B Multivariate Cox regression analysis results of clinical characteristics and risk score. C ROC curve of clinical characteristics and risk score. D Calibration curves assess nomgram prediction of patient mortality accuracy. E The Nomgram constructed by combining the riskscore and clinical characteristics predicts patient mortality at 1, 2, and 3 years. To illustrate the application of the nomogram, the sixth patient in the cohort (TCGA-HZ-8315-01) was randomly selected using the random number generation function in R, whose predictive information is exhibited Comprehensive analysis of immune microenvironment in prognostic model We assessed the infiltration of 22 immune cells in the TCGA cohort samples using the CIBERSORT algorithm (Fig. [100]7A, C). The differences in immune cell infiltration between subgroups showed that patients in the low-risk group had significantly higher infiltration of CD8 + T cells, while those in the high-risk group exhibited higher infiltration of M0 macrophages. Additionally, the violin plot demonstrated that patients in the low-risk group had higher ESTIMATE scores (P = 0.045) (Fig. [101]7B). Furthermore, immune checkpoint analysis revealed higher expression of BTLA and TNFRSF4 in patients in the low-risk group (P < 0.05), and higher expression of CD274 (PD-L1) in patients in the high-risk group (P < 0.001) (Fig. [102]7D). The Circos plot visualized the relationships among immune checkpoint genes (Fig. [103]7E). Finally, Lollipop plots revealed that the infiltration levels of immune cells were closely associated with seven key genes in the prognostic model (Fig. [104]7F–L). Fig. 7. [105]Fig. 7 [106]Open in a new tab Tumor immune microenvironment analysis of TCGA cohort. A Analysis of the level of immune infiltration of 22 immune cell subtypes in the cohort based on the CIBERSORT algorithm. B ESTIMATE score to assess differences in immune infiltration between patients in the high-risk and low-risk groups. C Differences in the infiltration levels of 22 immune cell types between high-risk and low-risk patients. * P < 0.05, ** P < 0.01, *** P < 0.001,**** P < 0.0001. D Expression differences of immune checkpoint genes between high-risk and low-risk patients. *P < 0.05, **P < 0.01, ***P < 0.001. E The Circos plot illustrates the relationships among immune checkpoint genes. The thickness and color of the ribbons correspond to the degree of correlation between the expression levels of these genes. F-L Lollipop plots show the correlation between the seven hub genes of the prognostic model and 22 immune cell types TMB analysis and its correlation with prognostic model and immune microenvironment TMB analysis of the TCGA-PAAD cohort revealed that transition (Ti) mutations occurred more frequently than transversion (Tv) mutations, with cytosine-to-thymine substitutions being the most prevalent (Fig. [107]8A). Correlation heatmap demonstrated strong associations among TP53, CDKN2A, and KRAS mutations (Fig. [108]8B). Enrichment analysis indicated that these mutated genes were primarily enriched in the RTK-RAS and TP53 signaling pathways. We further visualized the mutation status of the genes in the pathways (Fig. [109]8C–D). We performed visualization analysis on 169 samples with mutation information from the TCGA-PAAD cohort, the waterfall plot displayed the top 20 genes with the highest mutation frequencies, with KRAS, TP53, SMAD4, and CDKN2A exhibiting the highest mutation rates (Fig. [110]8E). We further analyzed gene mutation differences among various patient subgroups in the prognostic model. In the TCGA-PAAD cohort, 157 samples containing both survival and mutation data were selected, comprising 79 samples in the low-risk group and 78 in the high-risk group. The subgroup waterfall plot showed that the mutation frequency in the high-risk group (92.31%) was significantly higher than that in the low-risk group (68.35%) (Fig. [111]8F–G). Fig. 8. [112]Fig. 8 [113]Open in a new tab Tumor mutation burden (TMB) analysis of TCGA cohort. A Mutations in base sequences of samples. B Correlation heatmap between the 20 genes with the highest mutation frequencies. C Pathways enriched for mutant genes. D Genes with mutations in the RTK-RAS pathway and TP53 pathway in the TCGA cohort samples. The x-axis represents the cohort samples, and the y-axis represents the gene names. On the y-axis, blue indicates proto-oncogenes, while green indicates tumor suppressor genes. E Waterfall plots of the 20 genes with the highest mutation frequencies. F, G Waterfall plots of the 10 genes with the highest mutation frequencies in the low-risk and high-risk groups Subsequently, we explored the relationship between TMB and the prognostic model established in this study. The TMB scores of the 157 samples were calculated and transformed into the total_perMB form. Patients were categorized into high- and low-TMB groups according to the median TMB score (-0.22184875). The box plot revealed that patients in the high-risk group exhibited higher TMB values (P = 0.0084), and the scatter plot demonstrated a positive correlation between risk score and TMB value (R = 0.39, P = 0.039) (Fig. [114]9A–B). KM curves showed no significant difference in survival between the high-TMB and low-TMB groups (P = 0.068). However, when incorporating the risk stratification from our prognostic model, patients in the low-TMB and low-risk group exhibited significantly better prognosis (P = 0.00014) (Fig. [115]9C–D). Fig. 9. [116]Fig. 9 [117]Open in a new tab Analysis of TMB and its correlation with prognostic model and immune infiltration. (A) The box plot shows the difference in tumor mutation burden between the high-risk group and the low-risk group. B The scatter plot shows the correlation between risk score and TMB. (C) KM survival curve of patients grouped by the median value of TMB. D KM survival curves of patients grouped by TMB combined with prognostic model risk stratification. E The scatter plot shows the correlation between TMB and CD8+ T cells. * P < 0.05, ** P < 0.01, *** P < 0.001. F The scatter plot shows the correlation between TMB and M0 macrophage. * P < 0.05 Moreover, using the aforementioned immune infiltration analysis results, we assessed the correlation between TMB scores and M0 macrophages as well as CD8 + T cells individually. Scatter plots revealed that TMB scores correlated positively with M0 macrophage infiltration and negatively with CD8 + T cells infiltration in PC patients (Fig. [118]9E–F). Therefore, we propose that increased TMB correlates with elevated M0 macrophage infiltration and reduced CD8 + T cells infiltration within the tumor immune microenvironment of PC patients. The higher TMB observed in the high-risk group in this study may result from immune microenvironment alterations. Chemotherapeutic drug sensitivity and potential drug prediction Drug sensitivity analysis revealed significantly lower IC50 values in low-risk patients for multiple chemotherapeutic agents, including Paclitaxel, Gemcitabine, Oxaliplatin, Irinotecan, Cisplatin, Camptothecin, Cyclophosphamide, Olaparib, and Niraparib, suggesting enhanced therapeutic responsiveness in this subgroup (Fig. [119]10A–I). The Sankey plot presents the analysis results from the DGIdb database, identifying a total of 18 drugs that could potentially serve as therapeutic options for PC patients (Fig. [120]10J). Fig. 10. [121]Fig. 10 [122]Open in a new tab Prediction of drug response response in patients with pancreatic cancer. A–I Differences in sensitivity to chemotherapeutic agents between patients in the high-risk group and those in the low-risk group based on GDSC database analysis. J Prediction of potentially sensitive drugs for pancreatic cancer patients based on the DGIdb database Expression validation of key genes To validate our findings, we selected two clinically relevant genes from the model: SLC7A7, the risk gene with the highest correlation coefficient, and ANKZF1, a protective gene demonstrating reduced expression in PC. RT-qPCR results revealed that SLC7A7 mRNA expression was significantly higher in PC cell lines compared to normal pancreatic ductal epithelial cell lines, whereas ANKZF1 mRNA expression was lower in PC cell lines (Fig. [123]11A–B). Additionally, we validated the relative expression of COL5A1. The results showed that COL5A1 was significantly overexpressed in PC cell lines, consistent with the prognostic model analysis (Fig. [124]11C). Fig. 11. [125]Fig. 11 [126]Open in a new tab Further validation of key genes in the prognostic model A–C The mRNA expression levels of SLC7A7,COL5A1 and ANKZF1 were relatively quantified by RT-qPCR (HPDE was used as the reference group and set to 1), and normalized with β-acton as the internal reference gene. * P < 0.05, ** P < 0.01, *** P < 0.001 Discussion PC is characterized by its insidious onset, poor surgical outcomes, and frequent resistance to immunotherapy and chemotherapy, all linked to its complex TME [[127]4]. Due to the high heterogeneity of PC, patients at the same pathological stage may exhibit varying prognoses. Most patients hope to know their expected survival time after diagnosis [[128]16]. Hypoxia and lactate metabolism, key components of the TME, are mutually reinforcing and critically influence the biological behavior and therapeutic response of PC [[129]17]. Therefore, we analyzed hypoxia and lactate metabolism in combination, and built a prognostic model that can predict the survival time and guide the treatment of patients by machine learning and regression analysis. We ultimately established and validated a prognostic model comprising seven genes. Among them, SLC7A7, PYGL, HS3ST1, and DDIT4 were identified as risk genes, whereas ANKZF1, CYP27A1, and COL5A1 were protective genes. PYGL is the storage form of glycogen phosphorylase in the liver [[130]18]. In PC, hypoxia induces the high expression of PYGL through HIF-1α, promoting glycogen breakdown [[131]19]. The decomposed glycogen serves as a substrate for glycolysis to generate large amounts of lactic acid, and the acidified tumor microenvironment induces EMT thereby promoting tumor invasion and metastasis [[132]11]. PYGL is highly expressed in PC and strongly associated with poor prognosis, consistent with our findings. SLC7A7, encoding the light chain of y + LAT1, primarily controls the transport of cationic amino acids in vivo [[133]20]. It has been demonstrated that its homolog SLC7A5 (LAT1) is highly expressed in vascular endothelial cells of human pancreatic ductal adenocarcinoma (PDAC) and promotes the proliferation of tumor vascular endothelial cells [[134]21]. Similarly, deletion of SLC7A11 induces selective ferroptosis in PDAC and inhibits tumor growth [[135]22]. Given these observations, we hypothesize that SLC7A7 also plays a critical role as a risk factor in pancreatic tumor progression. HS3ST1, the rate-limiting enzyme for heparan sulfate production, is associated with tumor proliferation, metastasis, and angiogenesis [[136]23]. In colorectal cancer, knockdown of HS3ST1 significantly inhibits tumor growth [[137]24]. However, the relationship between HS3ST1 and PC remains unclear. DDIT4, also known as REDD1, is differentially expressed in various malignancies such as breast cancer, lung cancer, and gastric cancer, and is strongly associated with poor prognosis [[138]25–[139]27]. Hypoxia inhibits mammalian target of rapamycin (mTOR), a central regulator of protein synthesis, by upregulating DDIT4 expression [[140]28]. In the current research on PC, the association between high DDIT4 expression and poor prognosis has been widely confirmed, which is consistent with our findings [[141]29]. Ribosomal protein synthesis can be stalled due to defective mRNA or nascent peptide sequences, thereby leading to paused or terminated translation [[142]30]. ANKZF1 and its homolog Vsm1, a peptidyl tRNA hydrolase, release nascent chains from stalled ribosomes, enabling cellular function to continue [[143]31]. In previous studies, ANKZF1 has been validated as a protective factor against pancreatic tumors [[144]32]. COL5A1, encoding type V collagen, is highly expressed in PC, and its expression correlates positively with the lymph node ratio (the ratio of the number of metastatic lymph nodes to the number of detected lymph nodes).Patients with a high lymph node ratio generally have a poorer prognosis [[145]33]. In this study, COL5A1 was identified as a protective gene highly expressed in tumors, potentially due to the combined effects of multiple genes in the hypoxia and lactate metabolism pathways. The relationship between COL5A1 and PC requires further investigation and discussion. CYP27A1, a member of the cytochrome P450 oxidase family, has been demonstrated to exert antitumor effects in various cancers [[146]34]. CYP27A1 controls the occurrence of breast cancer by inhibiting the expression of oncogenes [[147]35], and it suppresses the proliferation of bladder cancer by regulating cholesterol homeostasis [[148]36]. However, its relationship with PC remains unclear, and its mechanism of action needs to be studied in depth. K-RAS, TP53, and CDKN2A are major driver genes in PC. These genes are frequently mutated and significantly correlated with each other, making them a focal point in current therapeutic research [[149]37]. K-ras gene, encoding a GTPase, participates in cell proliferation and angiogenesis. Approximately 90% of PC patients have mutations in the K-ras gene [[150]38]. K-ras mutant in PDAC tends to exhibit the most aggressive phenotype and the strongest treatment resistance [[151]39]. TP53, a tumor suppressor gene, encodes a protein that plays a key role in cellular stress response. TP53 mutations lead to loss of antitumor activity and acquisition of oncogenic properties [[152]40]. CDKN2A, a tumor suppressor gene that regulates the cell cycle, is inactivated in over 90% of PDAC patients [[153]41]. It has been noted that mutations in K-RAS, TP53, and CDKN2A influence the sensitivity of tumor patients to chemotherapeutic agents [[154]42]. In this study, patients in the low-risk group had significantly lower mutation burdens and significantly reduced IC50 values for multiple chemotherapeutic agents compared with the high-risk group. TMB reflects the degree of genomic alterations in tumor cells [[155]43]. We found that risk score was positively correlated with TMB, and analyzing TMB in combination with risk score could better predict the prognosis of PC patients. Furthermore, the relationship between TMB and the tumor immune microenvironment warrants deeper investigation. In PC patients, TMB scores correlated positively with M0 macrophage infiltration and negatively with CD8 + T cell infiltration. CD8 + T cells play a crucial role in the body’s anti-tumor immune response, while M0 macrophages are linked to tumor progression in vivo [[156]44, [157]45]. Consequently, we hypothesized that patients in the high-risk group in the prognostic model had higher TMB scores, worse prognosis, and poorer response to chemotherapeutic agents, which may potentially attributable to immune microenvironment alterations. Immune checkpoint analysis suggested that BTLA and TNFRSF4 inhibitors could be used as immunotherapy for low-risk patients, while high-risk patients may benefit more from CD274 (PD-L1) inhibitors. The infiltration of immune cells in tumors reflects the actual situation of the TME. The strong chemotherapeutic drug resistance and high metastatic potential of PDAC patients are largely attributed to the complex TME [[158]46]. When conventional therapeutic approaches, such as surgery and chemoradiotherapy, fail to achieve satisfactory outcomes, immunotherapy strategies, including immune checkpoint blockade and combination immunotherapy, can significantly improve patient prognosis [[159]47]. Advanced PDAC patients require CD8 + T cells and immune checkpoint inhibitors (ICIs) to work together to extend survival time [[160]48]. Our analysis revealed a significant correlation between HLRGs and the immune microenvironment of patients, suggesting that the prognostic model can be used to assess the immune infiltration and guide immunotherapy. CIBERSORT analysis showed that high-risk patients had significantly lower CD8 + T Cell infiltration. As an important component of the body’s anti-tumor immune response, CD8 + T Cell depletion is closely associated with disease progression in PC patients [[161]49, [162]50]. Meanwhile, high-risk patients were enriched in M0 macrophages, which have been shown to promote pancreatic cancer growth in vivo [[163]51]. The ESTIMATE score indicated that low-risk patients had a stronger anti-tumorimmune response. Higher TMB is commonly associated with increased neoantigen production, enhancing tumor immunogenicity and attracting more CD8 + T cell infiltration. However, our study found reduced CD8 + T cell infiltration in PC patients with high TMB. We propose that a synergistic interaction between TMB and cancer-associated fibroblasts (CAFs) may explain this phenomenon [[164]52, [165]53]. Specifically, CAFs in the pancreatic cancer microenvironment may suppress immune responses by secreting TGF-β, leading to the reduced CD8 + T cell infiltration observed in these patients [[166]54]. By integrating hypoxia and lactate metabolism, two critical factors of the tumor microenvironment, our study suggests for the first time that upregulation of SLC7A7 and downregulation of ANKZF1 are closely related to poor prognosis of patients with PC, and preliminarily verifies their aberrant expression in PC. However, our study still has some limitations. Although we supplemented the TCGA cohort with GTEx samples and conducted multiple external validations using independent GEO and ICGC databases, the sample size remains relatively modest. This may compromise the robustness of our analysis and potentially lead to biased results. For instance, certain studies have proposed that COL5A1 is a risk gene for PC [[167]33]. We preliminarily confirmed the upregulation of COL5A1 in PC using RT-qPCR. The role of genes in tumors can be further validated by subsequent cell function experiments. Additionally, this study is retrospective, and we will conduct prospective clinical studies to further validate the prognostic model. In future studies, we will focus on elucidating the mechanisms of SLC7A7 and ANKZF1 in pathways related to hypoxia and lactate metabolism, addressing the research gaps for these genes in PC. Conclusion In summary, we developed and validated a prognostic model for PC incorporating hypoxia and lactate metabolism. The risk scores based on this model can serve as an independent prognostic factor and predict OS in PC patients. Additionally, we analyzed the tumor immune microenvironment, TMB, and drug sensitivity in PC patients, providing new insights for personalized treatment. Abbreviations AUC Area under the curve DEGs Differentially expressed genes EMT Epithelial-mesenchymal transition HIF Hypoxia-inducible factor HLRGs Hypoxia and lactate metabolism-related genes HLR-DEGs Differentially expressed genes related to hypoxia and lactate metabolism HRGs Hypoxia-related genes LMRGs Lactate metabolism-related genes mTOR Mammalian target of rapamycin OS Overall survival PC Pancreatic cancer PDAC Pancreatic ductal adenocarcinoma PPI Protein-protein interaction TMB Tumor mutation burden TME Tumor microenvironment Author contributions Z.C.H. designed the study. H.A.Q. and Y.H. collected and organized the data. H.A.Q. analyzed the data, and S.C.C. performed the experimental validation. Z.C.H. wrote the first draft of the manuscript, S.J.J. and J.Z.J. organized and revised the manuscript. Z.C.H. and H.A.Q. contributed equally to the study and are co-first authors. All authors read and approved the final manuscript. Funding This work was supported by the Tianjin Municipal Education Commission Research Program (Natural Science) (Project No. 2024KJ201). Data availability The datasets that support the findings of this study are available in the TCGA (http://portal.gdc.cancer.gov), GEO (https://www.ncbi.nlm.nih.gov/geo/), ICGC (https://dcc.icgc.org/), GTEx (https://gtexportal.org/home/), MsigDB (http://www.gsea-msigdb.org/gsea/downloads.jsp), STRING database (https://cn.string-db.org), DGIdb (https://www.dgidb.org) databases, further inquiries can be obtained from the corresponding author. Declarations Ethics approval and consent to participate This study utilizes open-access databases and does not engage in original research involving human participants or animals. Competing interests The authors declare no competing interests. Footnotes Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. Chen-Hui Zhang and An-Qi Huang have contributed equally to this work and share first author. References