Abstract Background Although the significance of intratumoral heterogeneity (ITH) has been widely acknowledged in various cancers, its role in pancreatic cancer (PC) remains underexplored and warrants further investigation. Methods Pancreatic cancer transcriptomic data were acquired from the TCGA and GEO databases. The DEPTH2 algorithm, in combination with differential expression analysis, was used to identify genes associated with intratumoral heterogeneity (ITH). We applied univariate Cox regression analysis and multiple machine learning techniques to establish a reliable prognostic model. Patients were then stratified according to their ITH scores, and differences between subgroups were examined through pathway enrichment analysis, immune cell infiltration profiling, and drug response prediction. Furthermore, we conducted subcellular localization and differential analysis of ITH using single-cell data, followed by cell-cell communication analysis to explore interaction relationships and identify key pathways. Results Patients exhibiting lower intratumoral heterogeneity (ITH) levels demonstrated poorer clinical outcomes. The constructed 11-gene signature successfully differentiated individuals into high- and low-risk categories with significant survival differences. Immune profiling revealed notable differences in immune cell composition between the two groups, with patients in the high ITH cohort exhibiting enhanced immune activation. Drug sensitivity analysis indicated a differential response to therapies, with high-risk patients more resistant to certain drugs. Single-cell RNA sequencing identified a greater ITH score in epithelial cells, highlighting key interactions, particularly involving Galectin signaling pathways. Conclusion Our results highlight intratumoral heterogeneity’s prognostic and therapeutic relevance in pancreatic cancer, suggesting its potential utility in guiding individualized treatment approaches. Supplementary Information The online version contains supplementary material available at 10.1007/s12672-025-03080-3. Keywords: Intratumoral heterogeneity, Pancreatic cancer, Prognostic model, Immune infiltration, Drug sensitivity Introduction With a five-year survival rate below 10%, pancreatic cancer is considered one of the most aggressive malignancies on a global scale [[32]1]. Its high mortality and therapeutic resistance pose significant challenges in oncology research. Epidemiological trends show a consistent upward trajectory in both the incidence and mortality of pancreatic cancer, with forecasts predicting it may rank as the second-leading cause of cancer-related deaths by 2030. The lack of early-specific symptoms and reliable screening methods leads to over 80% of patients being diagnosed at advanced or metastatic stages, preventing the possibility of curative surgery. Although progress has been made in surgical techniques, radiotherapy, chemotherapy, and targeted therapies, the overall prognosis for patients remains poor, emphasizing the urgent need to identify new biomarkers and treatment strategies [[33]2]. Intratumoral heterogeneity (ITH), a hallmark of tumor evolution, has emerged as a frontier in cancer research [[34]3]. ITH manifests as multidimensional diversity among tumor cells, encompassing genomic, epigenomic, transcriptomic, and proteomic variations that drive malignant progression, therapy resistance, and metastatic dissemination [[35]3]. The ITH score, a quantitative metric, systematically evaluates clonal architecture complexity, reflecting tumor evolutionary trajectories and treatment susceptibility. To date, ITH scoring has demonstrated prognostic value in breast [[36]4], colorectal cancer [[37]5], and other solid tumors [[38]6, [39]7], showing promise for guiding personalized therapies. Despite PC’s pronounced ITH features, systematic studies on ITH scoring in this malignancy remain limited. PC’s unique characteristics—including its dense desmoplastic stroma, highly heterogeneous tumor microenvironment, and complex clonal dynamics—make it an ideal model for investigating ITH. Growing evidence indicates that intercellular heterogeneity influences primary tumor growth kinetics, chemoresistance, and distant metastasis. Thus, comprehensively characterizing ITH scores across molecular subtypes and their clinical implications may offer new avenues to overcome therapeutic barriers. Although intratumoral heterogeneity (ITH) plays a critical role in pancreatic cancer biology, related research remains limited, and its clinical translation potential has yet to be fully explored. Rapid advancements in bioinformatics have opened new opportunities for research on various diseases [[40]8, [41]9]. Integrative analysis of multi-omics data (e.g., whole-exome sequencing, scRNA-seq, and spatial transcriptomics) combined with machine learning algorithms enables multidimensional dissection of PC heterogeneity. Furthermore, leveraging clinically annotated datasets from TCGA, ICGC, and other large-scale repositories allows the construction of prognostic models linking ITH scores to patient outcomes while elucidating underlying molecular mechanisms [[42]10]. As artificial intelligence and multicenter data are integrated, bioinformatics will become more essential in decoding the biological essence and clinical relevance of intratumoral heterogeneity in pancreatic cancer. This study systematically evaluated ITH levels in pancreatic cancer using established mRNA-based ITH assessment algorithms (DEPTH2) with TCGA and GEO datasets, demonstrating that elevated ITH significantly correlates with poor clinical outcomes. The machine learning-constructed ITH risk model showed robust prognostic predictive performance while providing novel insights into tumor immunosuppressive microenvironments and drug sensitivity evaluation. Further validation through single-cell RNA sequencing confirmed the biological significance of ITH-associated signature genes, providing theoretical support for applying ITH scoring in precision therapy for pancreatic cancer. Our study offers new research directions for developing individualized treatment strategies for pancreatic cancer. Materials and methods Patient dataset Data for 178 pancreatic cancer patients, including gene expression profiles, clinical information, and somatic mutations, were sourced from the Cancer Genome Atlas (TCGA) database ([43]https://portal.gdc.cancer.gov/). The Gene Expression Omnibus (GEO) database ([44]https://www.ncbi.nlm.nih.gov/geo/) provided two independent datasets: one single-cell RNA sequencing (scRNA-seq) dataset and one bulk transcriptome dataset. The scRNA-seq dataset [45]GSE212966 comprises expression profiles from six samples, including three PDAC tissues and three matched adjacent noncancerous tissues. [46]GSE62452, a bulk transcriptome dataset, contains gene expression data from 69 pancreatic tumor tissues and 69 adjacent non-tumor tissues. Differential expression analysis Differential expression analysis was performed using the ‘limma’ package in R, excluding genes with a false discovery rate (FDR) greater than 0.05. For the remaining differentially expressed genes (DEGs), genes with a log2 fold change (log2FC) ≥ 1 were considered upregulated, and those with a log2FC ≤ -1 were considered downregulated. scRNA-Seq data preprocessing The pipeline for processing single-cell RNA sequencing (scRNA-seq) data in pancreatic adenocarcinoma (PAAD) is described below. The raw data were imported using the Read10X function, and a Seurat object was constructed with the Seurat R package for further analysis. Quality control thresholds were defined according to the distribution of detected features and counts per cell. Low-quality cells were filtered out, characterized by high mitochondrial and ribosomal gene expression proportions. The data was normalized with the NormalizeData function, followed by selecting the top 2,000 highly variable genes using the FindVariableFeatures function. Principal component analysis (PCA) was performed with the RunPCA function, with the top 20 principal components chosen for further examination. Cell clustering was achieved with the FindNeighbors and FindClusters functions, followed by dimensionality reduction and visualization via Uniform Manifold Approximation and Projection (UMAP) using the RunUMAP function. Cell type annotation was performed using the SingleR R package. Differential expression analysis across cell clusters was done using the Wilcoxon rank-sum test. Pseudotime trajectory analysis was conducted using the Monocle R package, and cell–cell communication analysis and network visualization were executed with the CellChat R package. Construction of prognostic models This study aimed to construct and evaluate various prognostic models based on a diverse set of 101 machine learning algorithms. Expression data from the training and validation cohorts were normalized before analysis. Feature variables were then extracted through 101 combinations of classical and ensemble machine learning algorithms, including Cox regression, random forest (RF), LASSO, gradient boosting machine (GBM), support vector machine (SVM), and Bayesian additive regression trees (BART), among other methods. Prognostic models were trained on the TCGA cohort, and their performance was evaluated using the concordance index (C-index). Model performance was further validated in an external GEO validation cohort. The best-performing prognostic model was selected, and its signature genes were retained for downstream biological functional analysis. The R programming environment was used for all data processing and model construction, with custom scripts handling core functions. LASSO regression analysis was employed to identify the most effective prognostic model. Multivariate Cox regression analysis was then carried out, identifying 11 key genes and their associated coefficients. Risk scores for each patient were calculated using the formula: Risk score = (expression of RNA₁ × Coef₁) + (expression of RNA₂ × Coef₂) +… + (expression of RNAₙ × Coefₙ). We stratified the training cohort into high- and low-risk subgroups using the median risk score as the cutoff. The model’s prognostic performance was rigorously evaluated through Kaplan-Meier survival analysis (log-rank test) and time-dependent ROC analysis (AUC at 1/3/5 years). To further validate the model’s predictive capability, its prognostic accuracy, sensitivity, and specificity were evaluated in the validation set. Nomogram construction Univariate and multivariate Cox regression analyses were conducted to evaluate the ITH signature’s independent prognostic value for OS in PC. The multivariate model incorporated clinical parameters (age, sex, grade, stage, T stage) alongside the ITH risk score. Using the rms and regplot packages in R, we developed a comprehensive prognostic nomogram incorporating all independent predictors to estimate 1-, 3-, and 5-year overall survival probabilities. The prognostic nomogram was calibrated via 1,000 bootstrap resamples to generate calibration plots. The model’s discriminatory power was assessed by calculating the C-index and 95% confidence interval. Time-dependent ROC analysis was conducted to validate its performance further, and AUCs were determined for 1-, 3-, and 5-year overall survival. Immune microenvironment analysis Immune infiltration data were derived from previously estimated immune cell compositions in tumor tissues using established deconvolution algorithms, including CIBERSORT, EPIC, and xCell (immune estimation file: infiltration_estimation_for_tcga.csv). Only samples in the risk score data and immune infiltration profiles were retained for downstream analysis to ensure consistency in sample matching. Spearman correlation coefficients were calculated across various immune cell types to investigate the relationship between intratumoral heterogeneity (ITH) risk scores and immune cell infiltration. The correlation coefficient and corresponding p-value were calculated for each immune cell type. For immune cell types showing significant correlations with the risk score, scatter plots were generated to visualize the relationship, accompanied by linear regression fitting and annotated correlation values. To provide a comprehensive overview, a bubble plot displayed all significantly correlated immune cell types. The y-axis denoted immune cell categories, whereas the x-axis reflected their corresponding correlation values. Color coding was used to differentiate the source algorithms (e.g., CIBERSORT, EPIC, xCell), facilitating a clear visual representation of the correlation magnitude and direction between each immune cell type and the ITH risk score. ImmuneScore, StromalScore, and ESTIMATEScore were calculated using the ESTIMATE algorithm and compared across high- and low-risk cohorts to characterize tumor microenvironmental differences. Analysis of immune functional status Using ssGSEA (GSVA package), we identified significant [positive/negative] correlations between the risk score and immune functional states, revealing potential immune evasion mechanisms. Immune functional states were assessed by calculating ssGSEA enrichment scores. The resulting ssGSEA scores were subjected to min-max normalization, and non-tumor samples were excluded from subsequent analyses. Differences in immune function scores between the groups were assessed using the Wilcoxon rank-sum test, and boxplots were generated to depict the distribution of scores. Differential analysis of immune checkpoint genes A list of immune checkpoint genes was obtained based on previously published literature [[47]11]. Expression levels of these genes were retrieved from the transcriptomic matrix and integrated with risk group data. The Wilcoxon rank-sum test was applied to identify immune checkpoint genes that showed significant expression differences between the high-risk and low-risk groups. Boxplots were used to visualize these significant alterations in expression. Assessment of immunotherapy response The potential response to immune checkpoint blockade therapy was assessed using TIDE (Tumor Immune Dysfunction and Exclusion) scores, obtained from the TIDE database [[48]12]. We integrated these scores with the risk group classifications to assess the variation in immune evasion potential between the high-risk and low-risk groups. Violin plots were generated to visualize the distribution of TIDE scores across groups, and statistical significance was assessed using appropriate non-parametric tests. Functional enrichment analysis The “clusterProfiler” R package was employed for Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses [[49]12], and the findings were visualized using the “circlize” R package. To investigate pathway enrichment differences between the high-risk and low-risk groups, GSEA (Gene Set Enrichment Analysis) was performed with the “c2.cp.kegg.Hs.symbols.gmt” file from MSigDB [[50]13]. Tumor mutational burden analysis Somatic mutation profiles for PAAD samples were obtained from TCGA in Mutation Annotation Format (MAF). The TMB(Tumor mutational burden) scores for PAAD patients were computed using the “maftools” R package [[51]14]. We evaluated the association between tumor mutational burden (TMB) and risk scores, followed by Kaplan-Meier survival analysis (log-rank test) to assess TMB’s prognostic value in PAAD. Chemotherapy sensitivity analysis We utilized the Genomics of Drug Sensitivity in Cancer (GDSC) database ([52]https://www.cancerrxgene.org/) to estimate the half-maximal inhibitory concentration (IC50) values for frequently used chemotherapeutic agents in PAAD, employing the R packages pRRophetic [[53]15] and oncoPredict [[54]16]. These IC50 values were applied to predict chemotherapeutic sensitivity in patients across various risk groups. The Wilcoxon rank-sum test assessed potential differences in drug sensitivity between the groups classified based on gene expression. Statistical analysis All statistical analyses were executed in R software (version 4.2.2), where p-values and false discovery rate (FDR) q-values below 0.05 were deemed statistically significant. Result Overview of the intratumor heterogeneity score in PC To quantify ITH in pancreatic cancer samples from the TCGA database, we first utilized the DEPTH2 software. Using the surv_cutpoint() function, we established the optimal cutoff for ITH scores and subsequently divided the samples into high-ITH and low-ITH groups with the surv_categorize() function. After performing a distribution analysis, survival curves were generated based on the chosen threshold. As depicted in Fig. [55]1A, the low-ITH group demonstrated a significantly worse prognosis. Fig. 1. [56]Fig. 1 [57]Open in a new tab Correlations between ITH scores and clinicopathological features in PAAD patients. A. Distribution of ITH scores and selection of the optimal cut-off point. Survival analysis for patients with high or low expression of ITH scores. Differences in Age (B), Gender(C), Grade(D), T(E), M (F), Stage(G), and N(H) between high-ITH and low-ITH groups No statistically significant differences in age, gender, or T/N/M stages were observed when comparing the clinical characteristics of the high- and low-ITH groups (Fig. [58]1B, C, E, F, H). Patients with higher tumor grades and advanced clinical stages exhibited significantly lower ITH scores (Fig. [59]1D, G), which aligns with the poorer overall survival observed in the low-ITH group. These results suggest the need for further investigation into the molecular mechanisms and differentially expressed genes related to ITH. We performed differential gene expression analysis comparing high-ITH and low-ITH samples to investigate further the factors contributing to the poor prognosis observed in the low-ITH group. One thousand eight hundred forty-eight differentially expressed genes (DEGs) were identified, of which 345 were upregulated and 1,503 were downregulated in the low-ITH group. A heatmap depicting the top 50 DEGs is shown in Fig. [60]2A. A total of 302 genes were identified through univariate Cox regression analysis as significantly associated with overall survival. A forest plot (Fig. [61]2B) illustrates the top 20 genes. The correlation analysis among these prognostic genes revealed mostly positive associations (Fig. [62]2C). Twelve genes were classified as risk factors, whereas eight were associated with a favorable prognosis. Fig. 2. [63]Fig. 2 [64]Open in a new tab Differential gene expression and functional analysis; (A) The top 50 differentially expressed genes are displayed in a heatmap. (B) Forest plot of the Univariate Cox regression analysis; (C) Displays the correlation analysis between ITH-related differentially expressed genes; (D) GO term enrichment showing biological processes, molecular functions, and cellular components; (E) KEGG pathway mapping Biological functions analyses We conducted enrichment analyses based on KEGG pathways and GO functional annotation to better understand the biological functions of differentially expressed genes (DEGs) in pancreatic cancer progression. The GO analysis categorized the DEGs into three areas: Biological Process (BP), Molecular Function (MF), and Cellular Component (CC). Biological process (BP) analysis revealed significant enrichment of differentially expressed genes (DEGs) in pathways critical to pancreatic cancer progression, including cell migration, extracellular matrix (ECM) remodeling, angiogenesis, and inflammatory regulation. These findings implicate DEGs mediating tumor invasiveness, metastatic dissemination, and immune microenvironment modulation. In the MF category, the most significantly enriched functions included structural molecule activity, protein binding, and integrin binding, reflecting their roles in maintaining cellular structural integrity and mediating signal transduction. Regarding CC, DEGs were predominantly localized to the extracellular region, extracellular matrix, and cytoplasm, indicating their potential involvement in the tumor microenvironment and signal transduction processes(Fig. [65]2D). KEGG enrichment analysis showed that DEGs were predominantly mapped to pathways critical for pancreatic cancer biology. Highly enriched terms encompassed structural dynamics (muscle cytoskeleton, ECM-receptor binding) and pro-tumorigenic signaling (PI3K-Akt activation, cytokine networks, and proteoglycan-driven malignancy), reinforcing their roles in stromal remodeling and oncogenic progression (Fig. [66]2E). The convergent evidence from KEGG and GO analyses provides a systems-level understanding of pancreatic cancer biology, uncovering canonical and novel disease-relevant pathways. These mechanistic insights create actionable opportunities for targeted drug discovery and personalized therapeutic strategies. Construction of an integrated machine Learning-Based prognostic signature Employing a multi-algorithmic machine learning framework, we derived the ITH-S prognostic signature from 302 candidate genes initially selected via univariate Cox analysis. This framework incorporated 10 different machine learning algorithms and 101 algorithmic combinations. Patients with PC from the TCGA dataset were randomly assigned to a training set, and an independent GEO dataset was employed for external validation. We applied 10-fold cross-validation to develop 101 predictive models and assessed their performance by calculating the concordance index (C-index) across training and validation sets (Fig. [67]3A). Using 10-fold cross-validation, the optimal λ value in the LASSO regression was selected based on the minimum partial likelihood deviance (Fig. [68]3B, C). A subsequent stepwise multivariate Cox regression analysis of genes with non-zero coefficients identified 11 genes with significant independent prognostic relevance. These 11 identified genes with the most robust prognostic value were used to develop the final risk model. The individual risk score was calculated as the sum of the products of each gene’s expression level and its corresponding Cox regression coefficient (Fig. [69]3D, E). Fig. 3. [70]Fig. 3 [71]Open in a new tab Construction and evaluation of prognostic model. (A) The concordance index(C-index) heatmap of machine learning algorithms; (B) Lasso coefficient profiles of ITH-related prognostic genes; (C) Identification of the lasso optimal parameter (lambda); (D) Bar chart displaying Univariate Cox coefficient profiles; (E) Univariate Cox regression results visualized by forest plot; F.The riskScore distribution and OS status distribution; G. Survival prediction performance of ITH-risk model assessed by time-dependent ROC analysis at three clinical endpoints (1/3/5-year OS); H. Survival Analysis of ITH-risk High and Low Grouping Patients with PAAD in TCGA; I. Survival analysis of high and low ITH-risk subgroups in PAAD patients in GEO Patients were classified into high- and low-risk groups using the median risk score as a cutoff. Figure [72]3F demonstrates a clear trend of increasing mortality with higher risk scores, underscoring the prognostic significance of the risk model. The predictive capability of the risk model was validated using Kaplan–Meier analysis and time-dependent ROC curves. In the training cohort, the model yielded AUCs of 0.797, 0.734, and 0.786 for 1-, 3-, and 5-year OS predictions, respectively (Fig. [73]3G). Significant survival differences were observed between high- and low-risk groups in both the training and external validation sets, with high-risk patients experiencing markedly worse outcomes (Fig. [74]3H, I). Construction and evaluation of the prognostic model Univariate and multivariate Cox regression analyses confirmed the risk score as an independent prognostic factor for PC (Fig. [75]4A, B). To assess the model’s discriminative ability in survival prediction, time-dependent concordance index (C-index) curves were plotted. Analysis showed that, throughout the timeline, the risk model maintained a consistently higher concordance index compared to traditional clinical parameters (Fig. [76]4C). Fig. 4. [77]Fig. 4 [78]Open in a new tab Construction of a nomogram (A) Forest plot of the univariate Cox analysis; (B) Forest plot of the multivariate Cox analysis; (C) concordance index showed the prognostic predictive ability of ITH-risk model and clinical characteristics; D, E. ROC curve of risk score at 1, 3, and 5 years; F The nomogram combined with the variables; G. ROC curve analysis evaluates the prognostic value of risk scoring models and clinical features; H. Calibration curve of the nomogram at 1, 3, and 5 years Five key variables, including age, sex, tumor grade, stage, and the risk model, were selected through machine learning and logistic regression analyses to construct a comprehensive clinical prognostic model for pancreatic adenocarcinoma (PAAD). A nomogram was then generated to visualize this integrated predictive model (Fig. [79]4F). Time-dependent receiver operating characteristic (ROC) curve analysis was performed to evaluate the model’s predictive accuracy at 1-, 3-, and 5-year survival intervals(Fig. [80]4D, G). The study demonstrated that the risk model’s area under the curve (AUC) was consistently higher than other clinical variables, indicating its robust predictive value for patient prognosis (Fig. [81]4E). The robustness and predictive precision of the nomogram were further confirmed by time-dependent calibration curves, which demonstrated close agreement between predicted and observed survival probabilities (Fig. [82]4H). Immune landscape analysis Emerging evidence has highlighted the critical role of immune cells within the tumor microenvironment in tumor initiation, progression, and patient outcomes. We used a variety of immune deconvolution algorithms, such as CIBERSORT, EPIC, and TIMER, to examine the immune infiltration patterns between risk groups, providing a detailed comparison of immune cell distribution across different risk categories. Immune landscape analysis revealed substantial differences in cell infiltration patterns between risk groups. The high-risk group exhibited a pro-inflammatory yet immunosuppressive microenvironment, characterized by increased neutrophils and M1 macrophages. Still, reduced NK and memory B cells suggest an immune contexture potentially unfavorable for antitumor immunity (Fig. [83]5A). Fig. 5. [84]Fig. 5 [85]Open in a new tab Immune microenvironments Analysis in different groups (A) Various algorithms estimates tumor-infltrating immune cells; (B) Diferences in immune-related functions between two risk groups; (C) Diferences in immune checkpoints between two risk groups; (D) Analysis of the possibility of tumor immune escape by TIDE; (E) Comparison of tumor gene mutation burden between high and low risk groups; (F) Correlation analysis of tumor gene mutation burden in high and low risk groups; (G) Stratified survival analysis based on TMB threshold; (H) Differential survival outcomes by TMB status within each ITH stratum; (I) GSEA analysis for high and low risk groups The study identified significant immune functional differences between risk groups, particularly highlighting aberrant activity in APC co-inhibition, parainflammation, and type I interferon response pathways among high-risk individuals(Fig. [86]5B). We also analyzed immune checkpoint molecule expression and discovered 12 immune checkpoint genes with significant differential expression. Immunocheckpoint analysis revealed distinct expression patterns between risk groups, with seven inhibitory checkpoints (including TNFRSF4, IDO-2, and CD160) showing significantly higher expression in the low-risk group. In comparison, five stimulatory checkpoints (such as CD276, CD274, and VTCN1) were overexpressed in the high-risk group. These findings imply differential immune escape mechanisms and potential responsiveness to immunotherapy in each group(Fig. [87]5C). The TIDE algorithm revealed distinct immune evasion profiles, with high-risk patients showing significantly lower composite scores (reflecting dysfunction and exclusion mechanisms) than low-risk counterparts (Fig. [88]5D). This pattern suggests the high-risk subgroup may exhibit more favorable tumor-immune microenvironments for checkpoint blockade therapy. While tumor mutation burden (TMB) did not differ significantly between risk groups (p = 0.13, Wilcoxon test; Fig. [89]5E-F) nor correlate with risk scores (r = 0.15, p = 0.079), survival analysis revealed a synergistic prognostic impact: patients with concurrent high-risk scores and high TMB showed markedly reduced overall survival compared to those with low-risk/low-TMB profiles (Fig. [90]5G-H), suggesting combined effects of molecular risk signatures and genomic instability. GSEA revealed distinct pathway activation patterns between risk groups (FDR < 0.05). The high-risk group showed significant enrichment in proliferation-related pathways (Adherens Junction; Cell Cycle) and xenobiotic metabolism (Cytochrome P450). Conversely, the low-risk group exhibited enrichment in immune regulation (Primary Immunodeficiency) and neuroendocrine signaling (Neuroactive Ligand-Receptor Interaction) pathways (Fig. [91]5I). These findings highlight the significant differences between high-risk and low-risk groups in immune cell infiltration, immune functions, and underlying biological behaviors. These results emphasize the potential of the risk score as not only a reliable prognostic marker but also a valuable predictor of immunotherapeutic responsiveness in patients with pancreatic adenocarcinoma (PAAD). Drug sensitivity analysis To investigate the risk model’s clinical applicability, we analyzed differential drug sensitivity between high- and low-risk groups using the pRRophetic algorithm. Based on their transcriptomic profiles, this computational approach estimated the half-maximal inhibitory concentration (IC₅₀) of standard chemotherapeutic and targeted therapies for pancreatic adenocarcinoma (PAAD) patients. Distinct drug response patterns were observed across risk groups, highlighting potential variations in treatment efficacy. High-risk patients showed enhanced sensitivity (reduced IC₅₀) to tyrosine kinase inhibitors (e.g., Erlotinib, Sapitinib). In contrast, low-risk patients responded more to PARP and BCR-ABL inhibitors (Olaparib and Nilotinib, respectively) (Fig. [92]6A). Fig. 6. [93]Fig. 6 [94]Open in a new tab Chemotherapeutic drugs sensitivity analysis. (A) Boxplot visualization of estimated IC50 values for common anti-tumor agents across pancreatic cancer patient subgroups stratified by ITH score; (B) A comparative analysis of drug sensitivity was conducted between the two risk groups These findings suggest that risk stratification may inform personalized chemotherapy regimens. Analysis of ITH-Related changes based on Single-Cell transcriptomics Building upon the insights from bulk RNA sequencing, we sought to explore the distribution of ITH across distinct cell populations at the single-cell resolution. To enable downstream analysis, we processed the single-cell RNA-seq dataset [95]GSE212966, which included three pancreatic cancer and three peri-cancerous tissue samples. After quality control (Figure [96]S1-D), dimensionality reduction via t-SNE and UMAP identified 21 distinct cell clusters manually annotated using established cell-type-specific markers (Figure S2). The spatial distribution of these clusters was visualized using t-SNE (Fig. [97]7A-B) and UMAP plots (Fig. [98]7C). To quantify ITH at the single-cell level, we employed the “AddModuleScore” function to score each cell based on a curated set of ITH-related signature genes. The results demonstrated that ITH scores were significantly elevated in tumor tissues (Fig. [99]7D), particularly in epithelial cells (Fig. [100]7E-G). Subsequently, we identified 10 distinct epithelial cell subpopulations (Fig. [101]7H). The first epithelial cluster exhibited the highest ITH score (Fig. [102]7I). To further explore the dynamic progression of epithelial cells, we constructed developmental trajectories to infer the differentiation states between tumor and adjacent normal tissues. Pseudotime analysis revealed three distinct cellular states across the epithelial subpopulations. Epi2–6 subtypes were predominantly enriched in states 2 and 3, whereas the Epi0 subtype was confined mainly to state 1. Based on pseudotime annotations, state four was considered a potential progenitor/root-like state (Fig. [103]7J). Fig. 7. [104]Fig. 7 [105]Open in a new tab Single-cell analysis of ITH score in PAAD (A) t-SNE plot of normal and tumor cells; (B) t-SNE plot of 21 cell clusters; (C) UMAP of nine cell types; (D) A comparative analysis of ITH score between PAAD and adjacent tissues; (E) Violin plots showing ITH score in different clusters of PAAD; (F) Violin plots showing ITH score in PAAD and adjacent tissues; (G) UMAP projection showing ITH score in different clusters; (H) Display clusters of endothelial cells by t-SNE plot; (I) Display the ITH scores of different endothelial cell clusters; (J) Differentiation trajectory colored for states, pseudotime, tissue types, and clusters Based on transcriptomic features, endothelial cells were categorized into three ITH levels: high, medium, and low. To investigate intercellular communication, we performed CellChat analysis to examine interaction differences among endothelial subgroups and other cell types. As shown in Fig. [106]8A, the number of receptor–ligand interactions and the cumulative strength of intercellular communication were quantified across cell types. Thicker lines represent stronger interactions; line colors match the source cell population, while the opposing nodes denote the target population. Fig. 8. [107]Fig. 8 [108]Open in a new tab Cell communication analysis of scRNA-seq data. (A) Number of interactions and intensity of action of different cell types in the normal and tumor groups; (B) Bubble diagram of receptor-mediated cellular communication; (C) The contribution of cell populations to all signaling pathways; (D) Visualize the primary signal sender. (source) and signal receiver (target); E. Display of GALECTIN SIGNALING PATHWAY among different cell types; F. ITH highepithelial cell communication networks with other cell types; ITH-high endothelial cells exhibited significantly enhanced interaction strength with most other cell types. Ligand–receptor correlation analysis suggested that ITH-associated epithelial cells may influence ductal cells and macrophages through signaling axes such as MIF-(CD74 + CXCR4) and MIF-(CD74 + CD44). (Fig. [109]8B) Further analyses highlighted Galectin signaling as the dominant outgoing communication pathway in ITH-high endothelial cells, whereas SPP1-related pathways were identified as major contributors to incoming signals(Fig. [110]8C). Scatterplots were used to visualize key sender–receiver cell types(Fig. [111]8D). Compared to other cell populations, ITH-high endothelial cells were identified as significant sources of GALECTIN signaling, which predominantly targeted macrophages. Additionally, GALECTIN pathways mediated interactions between ITH-high endothelial cells and monocytes, T cells, and ductal epithelial cells, suggesting a critical role of this signaling axis in promoting PAAD progression(Figs. [112]8E-F). Discussion This study explores ITH in PC, examining its prognostic significance, influence on the tumor immune microenvironment, and implications for treatment response. Our results demonstrate that ITH strongly correlates with survival outcomes in pancreatic cancer, supporting its potential clinical utility as a robust prognostic biomarker for patient stratification. Interestingly, in contrast to many other cancer types, our analysis revealed that lower transcriptomic ITH was associated with worse prognosis in pancreatic cancer. One possible explanation is that low ITH reflects a monoclonal, immune-evading phenotype that dominates in the hostile stromal and hypovascular environment characteristic of pancreatic tumors. Such tumors may have undergone clonal selection and immune editing, favoring aggressive behavior despite apparent transcriptomic uniformity. Furthermore, late-stage tumors may exhibit transcriptional convergence toward stress-adaptive pathways, reducing ITH scores. These findings align with recent spatial transcriptomic studies that report poor outcomes in phenotypically homogeneous yet biologically aggressive PDAC subregions [[113]17–[114]19]. Lower ITH scores were notably found in patients with higher tumor grade and advanced clinical stage, further corroborating the link between ITH and tumor aggressiveness. Comparative analysis revealed 1,848 ITH-associated DEGs, with functional annotation implicating oncogenic pathways - particularly PI3K-Akt signaling and ECM remodeling- as key molecular determinants of tumor heterogeneity-driven progression. These findings suggest that the molecular mechanisms underlying ITH in pancreatic cancer involve pathways critical to tumor invasion, metastasis, and immune evasion. We notably uncovered multiple genes comprising detrimental and beneficial factors that may represent viable targets for therapy or serve as biomarkers to assess the progression associated with intratumoral heterogeneity. We constructed a robust intratumoral heterogeneity-related signature (ITH-S) using an integrated machine learning framework to better stratify patients based on ITH-related risks. The risk model demonstrated strong prognostic performance, with significantly shorter overall survival in high-risk patients versus low-risk counterparts. Our model’s accuracy was confirmed through multiple validation cohorts, and the C-index consistently exceeded that of other clinical variables, reinforcing the model’s clinical applicability. After controlling for conventional clinical variables, the risk score exhibited strong prognostic significance, underscoring its value as an independent predictor of survival in PAAD. This comprehensive model integrates ITH with other clinical features, providing a refined tool for individualized prognosis prediction. Several genes included in our 11-gene prognostic model are supported by prior studies for their involvement in pancreatic cancer progression, immune modulation, or therapeutic resistance. For instance, LGALS1 (Galectin-1) is a well-characterized immunosuppressive mediator that promotes T cell dysfunction and stromal remodeling in PDAC. CTSB and S100A4 facilitate invasion and metastasis through ECM degradation and EMT induction, respectively, and are linked to chemoresistance. IFI16, an innate immune DNA sensor, modulates interferon pathways and may affect immune evasion. TSPAN1 is overexpressed in PC and promotes proliferation, cancer stemness, and drug resistance. These findings support the risk model’s biological plausibility while identifying novel candidate genes for further validation. The TME has emerged as a critical determinant of tumor behavior, with its composition and functional state strongly correlating with treatment efficacy and survival outcomes in cancer patients [[115]20, [116]21]. The observed immune infiltration disparities between risk groups imply a link between ITH and immune landscape remodeling. Specifically, low-ITH tumors may create an immunologically ‘cold’ environment that facilitates immune evasion rather than effective immune activation. Furthermore, significant differences in immune checkpoint expression between the two groups imply that ITH-related stratification could help predict response to immunotherapies, with the low-risk group potentially benefiting more from immune checkpoint blockade. Consistent with the immune profiling, TIDE analysis indicated that low-ITH patients might benefit more from immune checkpoint blockade due to their lower dysfunction and exclusion scores, underscoring the potential of ITH-guided immunotherapy stratification. The ITH-based stratification revealed distinct drug response profiles: low-risk patients demonstrated increased susceptibility to chemotherapeutics targeting the cell cycle and DNA repair mechanisms. At the same time, high-risk individuals responded more favorably to agents targeting immune checkpoint pathways and cellular metabolism. These findings indicate the potential for tailoring chemotherapy regimens based on ITH-related risk stratification. Moreover, the observed therapeutic sensitivity differences suggest that personalized treatments incorporating conventional and immune-targeted therapies could improve treatment outcomes in pancreatic cancer patients. Several limitations should be noted. The retrospective design based on publicly available datasets (TCGA and GEO) may introduce inherent biases in sample selection and processing. Although our model demonstrated good performance when validated using the GEO dataset, we acknowledge that the sample size of the validation cohort was relatively small, and no external multi-center datasets were included for further validation. This limitation may affect the generalizability and robustness of our findings across different populations and clinical settings. While our drug sensitivity analysis provided useful in silico insights, we acknowledge the limitation of not having matched clinical drug response data, such as progression-free survival after gemcitabine or erlotinib treatment. Future validation using patient-level treatment outcomes will be essential to confirm the predictive utility of the ITH-based stratification model and its potential application in guiding personalized chemotherapy in pancreatic cancer. While our exploration of immune infiltration and drug response shed light on the immunological landscape of pancreatic cancer, further research is essential to clarify the molecular drivers of immune dysregulation and treatment resistance in tumors with low intratumoral heterogeneity. Conclusion Our study highlights the clinical significance of intratumoral heterogeneity in pancreatic cancer, both as a prognostic marker and determinant of immune response and therapeutic sensitivity. By integrating ITH and machine learning techniques, we have created a robust prognostic signature capable of predicting patient outcomes and informing personalized treatment strategies. Further research into the molecular and immune mechanisms underlying ITH will be critical for developing more effective therapies for pancreatic cancer. Electronic supplementary material Below is the link to the electronic supplementary material. [117]Supplementary Material 1^ (1.2MB, docx) Acknowledgements