Abstract Our study aims to investigate the role of pyrimidine metabolism in prostate cancer and its associations with the immune microenvironment, drug sensitivity, and tumor mutation burden. Through transcriptomic and single-cell RNA sequencing analyses, we explored metabolic pathway enrichment, immune infiltration patterns, and differential gene expression in prostate cancer samples. The results showed that pyrimidine metabolism-related genes were significantly upregulated in the P2 subgroup compared to the P1 subgroup, with enhanced metabolic activity observed in basal and luminal epithelial cells. In addition, immune infiltration analysis revealed a strong correlation between pyrimidine metabolism and immune cell regulation, particularly involving T cell activity. Tumors in the P2 subgroup, characterized by higher pyrimidine metabolism, exhibited greater infiltration of activated CD4+ T cells and M2 macrophages, indicating a potential link between metabolic reprogramming and the immune response in prostate cancer. Drug sensitivity analysis further demonstrated that tumors with elevated pyrimidine metabolism displayed increased responsiveness to several chemotherapeutic agents, including BI-2536, JW-7-24-1, and PAC-1, suggesting that targeting pyrimidine metabolism may enhance treatment efficacy. Moreover, key genes involved in pyrimidine de novo synthesis, such as RRM2, were identified as potential drivers of tumor progression, providing new insights into the molecular mechanisms underlying aggressive prostate cancer phenotypes. In conclusion, pyrimidine metabolism plays a critical role in prostate cancer progression, influencing immune infiltration and drug sensitivity. Targeting this metabolic pathway offers a promising strategy for the development of new therapeutic approaches, particularly for overcoming drug resistance and improving outcomes in patients with advanced prostate cancer.
Supplementary Information The online version contains supplementary material available at 10.1038/s41598-025-86052-5. Keywords: Metabolic reprogramming, Androgen, Prostate Cancer, Pyrimidine metabolism, Tumor Microenvironment Subject terms: Cancer, Computational biology and bioinformatics Introduction Prostate cancer is one of the most common malignancies of the male urogenital system, affecting millions of men worldwide^1. It is a significant contributor to mortality among men globally^2. Patients with prostate cancer may present with either localized or advanced disease^3. While the incidence of prostate cancer in China remains relatively low, it is rising rapidly due to factors such as economic development and an aging population^4. The Lancet Commission on prostate cancer (planning for the surge in cases) projects that the annual number of new cases will roughly double, from 1.4 million in 2020 to 2.9 million in 2040, with low- and middle-income countries most affected. Over the same period, annual prostate cancer deaths are projected to increase by 85%, from 375,000 in 2020 to nearly 700,000 in 2040^5. Prostate cancer currently ranks sixth among male malignancies in China^6. Diagnostic approaches include digital rectal examination, prostate-specific antigen (PSA) analysis, and prostate biopsies^7,8. Treatment for localized prostate cancer includes active surveillance, radiation therapy, and radical prostatectomy^9. In advanced or metastatic cases, androgen deprivation therapy (ADT), salvage radiation, and chemotherapy are commonly employed^10. Despite available treatments, advanced prostate cancer remains incurable, underscoring the need for further research into novel therapeutic strategies, including targeted radioisotopes, immunotherapy, and alternative treatment approaches^11.
Androgen and its receptor play a pivotal role in the development and progression of prostate cancer^12. Testosterone, a key androgen, fuels prostate cancer growth, and reducing testosterone levels is a primary therapeutic strategy for locally advanced and metastatic prostate cancer^13. ADT, whether through surgical castration or pharmacological intervention using GnRH agonists, is a frontline treatment, often combined with chemotherapy for metastatic disease^14,15. Although initial response rates to ADT are high, many patients eventually progress to castration-resistant prostate cancer (CRPC), a stage where the disease becomes less responsive to conventional therapies^16,17. The androgen receptor (AR) remains crucial in CRPC progression, and newer AR-targeted therapies have been developed^18. However, despite these advancements, late-stage prostate cancer continues to be incurable, highlighting the persistent challenge of overcoming androgen signaling^19. Metabolic reprogramming is a hallmark of cancer cells and plays an essential role in tumor growth and progression^20. Cancer cells rewire their metabolism to meet the increased demands for nucleotides, lipids, and other building blocks essential for rapid proliferation^21–24. This metabolic shift involves multiple pathways, including glycolysis, lipid metabolism, and nucleotide synthesis, and is driven by oncogenes, tumor suppressors, and interactions with the tumor microenvironment^25,26. The reprogrammed metabolism not only supports tumor growth but also contributes to therapeutic resistance, making it a critical area of cancer research^27. As one of the primary metabolic pathways altered in cancer, nucleotide metabolism (particularly the purine and pyrimidine pathways) has attracted increasing attention as a potential therapeutic target^28,29.
In this study, we aim to investigate the role of pyrimidine metabolism in prostate cancer progression and its association with androgen. By classifying prostate cancer into subtypes based on androgen-related gene expression and performing a detailed analysis of metabolic pathways, we explore how pyrimidine metabolism contributes to the distinct molecular and clinical features of these subtypes. Using transcriptomic data, we perform comprehensive analyses, including differential gene expression, WGCNA, and immune infiltration, to uncover the key metabolic drivers of prostate cancer progression. Additionally, we evaluate drug sensitivity across pyrimidine metabolism subtypes to identify potential therapeutic interventions tailored to the metabolic profile of prostate cancer. Methods Data preparation Transcriptomic data and clinical information for prostate cancer and normal prostate tissue samples were obtained from The Cancer Genome Atlas (TCGA) database. A total of 554 samples were included in the analysis, consisting of 502 prostate cancer tissue samples and 52 normal prostate tissue samples. The dataset underwent quality control and normalization to ensure data integrity and comparability across samples. Additionally, the GSE70769 dataset, which includes transcriptomic data from 94 prostate cancer cases, was incorporated into the study for validation. In addition to the bulk transcriptomic data, the GSE245387 dataset was utilized for single-cell RNA sequencing data, specifically focusing on androgen receptor-positive (AR+) prostate cancer. This dataset provided high-resolution insights into the cellular heterogeneity within prostate cancer, enabling a more detailed exploration of cell-type-specific gene expression and signaling pathways associated with androgen receptor activity.
GSVA and GSEA analyses We employed Gene Set Variation Analysis (GSVA) to compare metabolic pathways between two prostate cancer subgroups (C1 and C2). Normalized gene expression data were used, and samples classified into C1 and C2 subgroups were extracted. KEGG metabolic gene sets were employed for pathway analysis. After performing GSVA, the resulting scores were normalized, and differential pathway activity between C1 and C2 was assessed using t-tests. Pathways with significant differences (p < 0.05) were identified and categorized as either upregulated or downregulated. The final results were visualized using a heatmap to illustrate the pathway activity patterns across the two subgroups. Pyrimidine metabolism gene clustering For the clustering analysis, 13 genes involved in the superpathway of pyrimidine deoxyribonucleotides de novo biosynthesis (CAD, CMPK1, CTPS1, CTPS2, DTYMK, DUT, NME1, NME1-NME2, NME2, NTPCR, RRM1, RRM2, TYMS) were selected. Consensus clustering was performed using the ConsensusClusterPlus package with k-means clustering and a Euclidean distance metric. The maximum number of clusters (k) was set to 9, with 50 repetitions, where 80% of the samples and 100% of the features were randomly subsampled for each run. Based on the consensus cumulative distribution function (CDF) plots, the optimal cluster number was determined to be 2. The clustering results were recorded, with each sample assigned to either the P1 or P2 cluster based on the consensus class. Drug sensitivity analysis The drug sensitivity analysis was performed by comparing the P1 and P2 groups based on gene expression data. Utilizing the pRRophetic package, drug sensitivity for a comprehensive panel of anticancer agents was predicted for both groups. The sensitivity results, quantified as IC50 values, were filtered to limit the influence of outliers by capping extreme values at the 99th percentile.
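The capping step described above can be illustrated with a minimal sketch. It is written in Python using only the standard library, whereas the analysis itself relied on R packages, so it should be read as an illustration of the idea rather than the authors' code. The linear-interpolation percentile rule, the function names, and the example IC50 values are all assumptions made for this sketch.

```python
import math


def percentile(values, q):
    """Return the q-th percentile (0 <= q <= 100) using linear interpolation.

    The interpolation rule is an assumption; the text does not state which
    percentile definition was used.
    """
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = math.floor(pos)
    hi = math.ceil(pos)
    frac = pos - lo
    return ordered[lo] + (ordered[hi] - ordered[lo]) * frac


def cap_at_percentile(values, q=99.0):
    """Replace every value above the q-th percentile with that percentile."""
    cap = percentile(values, q)
    return [min(v, cap) for v in values]


# Hypothetical predicted IC50 values for one drug (illustrative numbers only).
ic50 = [1.2, 0.8, 1.5, 0.9, 1.1, 25.0]
capped = cap_at_percentile(ic50, q=99.0)
print(capped)
```

Capping (rather than dropping) extreme predictions keeps every sample in the later group comparison while limiting the pull of a few very large values.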
A Wilcoxon rank-sum test was employed to evaluate differences in drug sensitivity between the P1 and P2 groups. For drugs exhibiting significant differences (p < 0.001), boxplots were generated to visually compare drug sensitivity between the groups, with statistical significance annotated. Tumor mutation burden analysis Mutation data from prostate cancer samples were obtained from the TCGA database to investigate key genetic alterations. The mutation analysis involved summarizing mutation types and frequencies using the maftools package. A gene cloud was generated to highlight the genes with the highest mutation frequencies across the dataset. Additionally, waterfall plots were constructed for two prostate cancer patient subgroups, P1 and P2, to visualize the top 20 most frequently mutated genes. Finally, a somatic interaction analysis was conducted to assess the relationships and co-occurrence between mutations in the top 20 genes. Immune infiltration analysis CIBERSORT was used to analyze immune cell infiltration in prostate cancer tissues from groups P1 and P2. Transcriptomic data from tumor samples were used as input, and a signature matrix containing reference profiles for 22 immune cell types was applied. After quantile normalization of the mixture file, the CIBERSORT algorithm employed support vector regression to estimate the relative proportions of immune cells in each sample. The analysis was performed with 100 permutations to calculate p-values, and results included immune cell fractions, p-values, correlation values, and RMSE. The output allowed for comparison of immune infiltration profiles between the P1 and P2 groups. Differential gene expression analysis The differential expression analysis between the P1 and P2 groups involved importing, formatting, and processing gene expression data. Log2 transformation and quantile normalization were applied to normalize expression values.
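As a rough illustration of the two normalization steps just mentioned, the following Python sketch (standard library only) applies a log2 transform and then quantile-normalizes a small genes-by-samples matrix. It is not the pipeline used in the study: the pseudocount of 1, the tie handling (ties are broken by row order rather than averaged), and the toy matrix are assumptions for this example.

```python
import math


def log2_transform(matrix, pseudocount=1.0):
    """Apply log2(x + pseudocount) to every entry (pseudocount is assumed)."""
    return [[math.log2(x + pseudocount) for x in row] for row in matrix]


def quantile_normalize(matrix):
    """Quantile-normalize a matrix with genes as rows and samples as columns.

    Each sample's values are replaced by the mean, across samples, of the
    values sharing the same within-sample rank. Ties are broken by row order.
    """
    n_genes = len(matrix)
    n_samples = len(matrix[0])
    columns = [[matrix[g][s] for g in range(n_genes)] for s in range(n_samples)]
    sorted_cols = [sorted(col) for col in columns]
    # Reference distribution: mean of the r-th smallest value over all samples.
    rank_means = [
        sum(col[r] for col in sorted_cols) / n_samples for r in range(n_genes)
    ]
    result = [[0.0] * n_samples for _ in range(n_genes)]
    for s, col in enumerate(columns):
        order = sorted(range(n_genes), key=lambda g: col[g])
        for rank, g in enumerate(order):
            result[g][s] = rank_means[rank]
    return result


# Toy expression matrix: 4 genes (rows) x 3 samples (columns), made-up values.
raw = [[5, 4, 3], [2, 1, 4], [3, 4, 6], [4, 2, 8]]
normalized = quantile_normalize(log2_transform(raw))
```

After this step every sample shares the same empirical distribution of values, which is what makes expression levels comparable across samples before model fitting.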
Samples were categorized into P1 and P2 groups, and a linear model was fitted to estimate differential gene expression. An empirical Bayes method was used to compute moderated statistics. Results were visualized using a volcano plot for significantly differentially expressed genes and a heatmap for the top differentially expressed genes across the groups. WGCNA analysis Weighted Gene Co-expression Network Analysis (WGCNA) was used to identify gene modules associated with clinical traits in prostate cancer samples. Initially, low-expression genes were filtered out, and the top 25% of genes with the highest standard deviation were selected. The data were normalized, and outliers were removed through sample clustering. An optimal soft-thresholding power (β) was chosen to achieve a scale-free topology for network construction. The adjacency matrix was computed and transformed into a topological overlap matrix (TOM) to measure gene connectivity. Gene modules were identified using hierarchical clustering and dynamic tree cut algorithms, with a minimum module size of 100 genes. Module eigengenes (MEs) were calculated to represent each module’s first principal component, and modules were clustered based on eigengene similarity. A heatmap of module-trait relationships was generated by correlating MEs with clinical traits, providing insights into significant gene modules associated with prostate cancer progression. Visualization was conducted using dendrograms, module heatmaps, and network plots to identify key gene modules involved in the disease. Network pharmacology analysis To further investigate the relationship between pyrimidine metabolism-related genes and androgen receptor inhibitors, we performed a network pharmacology analysis. We first identified potential targets of three clinically relevant AR inhibitors—Apalutamide, Enzalutamide, and Flutamide—using the SwissTargetPrediction database. 
To visualize the interactions between these AR inhibitors and their targets, we constructed a drug-target network using Cytoscape 3.8.2. Functional and pathway enrichment analyses To explore the biological significance of differentially expressed genes (DEGs) and gene modules identified through WGCNA, an intersection analysis was performed between DEGs and module genes. The genes common to both sets are referred to as the intersecting gene set. Gene Ontology (GO) enrichment analysis was then conducted on the intersecting gene set using the clusterProfiler package. Genes were mapped to their Entrez IDs, and GO enrichment was performed across biological process (BP), cellular component (CC), and molecular function (MF) categories using the org.Hs.eg.db database. GO terms with p < 0.05 were considered significant. In parallel, Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis was carried out to identify significantly enriched pathways^30–32. As in the GO analysis, genes were mapped to their Entrez IDs, and KEGG pathway enrichment was performed using the enrichKEGG function with human gene annotation (“hsa”). Pathways with p < 0.05 were considered significant. The results from both the GO and KEGG enrichment analyses were visualized using bar plots and bubble plots to highlight the key biological processes and pathways associated with the intersecting gene set. Machine learning-based gene identification To identify key genes associated with prostate cancer, three machine learning approaches were employed: LASSO regression, Random Forest, and SVM-RFE. LASSO regression was applied to the gene expression matrix using the glmnet package in R, with binary classification as the outcome variable (P1 vs. P2). Cross-validation with 10 folds optimized the model, and the lambda.min value was used to extract the most predictive genes.
A Random Forest model was built using the randomForest package to classify P1 and P2 groups, with 500 decision trees. The optimal number of trees was determined by minimizing the out-of-bag error rate, and gene importance scores ranked each gene’s contribution to classification performance, identifying top candidate genes. Support Vector Machine-Recursive Feature Elimination (SVM-RFE) was performed with 10-fold cross-validation to iteratively eliminate genes contributing least to classification performance. Model accuracy and error rates were assessed, and the optimal feature set was determined by minimizing the error rate. An SVM model was then built using the top 40 ranked features to predict sample classification, providing a powerful approach for distinguishing between P1 and P2 groups in prostate cancer research. Nomogram analysis In this study, a nomogram was constructed to assess clinical outcomes in prostate cancer using the expression levels of pyrimidine metabolism-related genes (CAD, CMPK1, CTPS1, CTPS2, DTYMK, DUT, NME1, NME2, NTPCR, RRM1, RRM2, TYMS). The data were processed using Cox proportional hazards regression, and a nomogram was developed to predict 1-, 2-, and 3-year overall survival. The model’s performance was evaluated using calibration curves to compare the predicted survival probabilities with actual survival rates, providing a visual and quantitative tool for clinical prognosis. Single-cell analysis In this study, single-cell RNA sequencing (scRNA-seq) data were analyzed to investigate cell-cell communication networks, including quality control, sample integration, cell annotation, and communication inference. First, quality control was performed on scRNA-seq data, filtering cells based on criteria of nFeature_RNA > 500, percent.mt < 20%, and nCount_RNA > 1000 to ensure high-quality data. 
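The three quality-control thresholds quoted above translate directly into a per-cell filter. The sketch below is a Python illustration using only the standard library, not the toolchain used in the study. The `CellQC` record, the `passes_qc` helper, and the example cells are hypothetical; the field names simply mirror the metadata names given in the text.

```python
from dataclasses import dataclass


@dataclass
class CellQC:
    """Per-cell QC metrics; field names mirror the metadata names in the text."""
    barcode: str
    n_feature_rna: int   # number of genes detected in the cell
    n_count_rna: int     # total RNA counts in the cell
    percent_mt: float    # percentage of counts from mitochondrial genes


def passes_qc(cell, min_features=500, min_counts=1000, max_percent_mt=20.0):
    """Keep a cell only if it satisfies all three thresholds from the text."""
    return (
        cell.n_feature_rna > min_features
        and cell.n_count_rna > min_counts
        and cell.percent_mt < max_percent_mt
    )


# Hypothetical cells used only to exercise the filter.
cells = [
    CellQC("cell_a", n_feature_rna=1800, n_count_rna=5200, percent_mt=4.1),
    CellQC("cell_b", n_feature_rna=320, n_count_rna=2100, percent_mt=3.0),
    CellQC("cell_c", n_feature_rna=900, n_count_rna=950, percent_mt=6.5),
    CellQC("cell_d", n_feature_rna=2500, n_count_rna=8000, percent_mt=35.0),
]
kept = [c.barcode for c in cells if passes_qc(c)]
print(kept)
```

The comparisons are strict, matching the inequalities as written, so a cell sitting exactly on a threshold is excluded.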
Sample integration and batch effect correction were conducted using the Harmony algorithm, followed by principal component analysis (PCA) for dimensionality reduction and clustering. Clusters were annotated based on known marker genes to identify distinct cell types. Cell-cell communication analysis was performed using the CellChat package. The normalized expression matrix was used to construct a CellChat object, with the human ligand-receptor interaction database (CellChatDB.human) providing interaction information. Overexpressed ligand-receptor pairs were identified within each cell group, and interactions were projected onto a protein-protein interaction (PPI) network to adjust expression values. Communication probabilities between cell types were inferred, and low-abundance cell populations were filtered to ensure robust results. Pathway-level communication networks were also inferred, and the overall interaction count and strength between cell types were aggregated and visualized through circular network plots. Cellular metabolism analysis Single-cell metabolic pathway activity was analyzed using the scMetabolism package. The analysis was performed on single-cell data, with KEGG metabolic pathways as the reference. Imputation was disabled to retain raw expression values, and two cores were used for computation. UMAP was utilized to visualize the metabolic activity of specific pathways, such as “Pyrimidine metabolism,” across different cell populations. To assess pathway activity, a dot plot was generated to visualize the expression of “Pyrimidine metabolism” across cell clusters defined by phenotypic identities. Additionally, a box plot was created to compare pathway activities across different groups, highlighting metabolic variations at the single-cell level. Finally, a metabolic activity matrix was computed for all cells to capture pathway-specific activities, enabling further exploration of cell metabolism dynamics. 
Results Enhanced pyrimidine metabolism in androgen-high prostate cancer subtypes In previous research, prostate cancer was classified into two subgroups, C1 and C2, based on androgen-related genes^33. Metabolic reprogramming plays a pivotal role in tumor progression and has garnered widespread attention. In this study, we performed GSVA enrichment analysis using KEGG metabolic gene sets to compare metabolic pathways between the C1 and C2 subgroups. The analysis revealed that pyrimidine metabolism, purine metabolism, and glyoxylate and dicarboxylate metabolism pathways were significantly upregulated in the C1 subgroup, while arachidonic acid metabolism and nicotinate and nicotinamide metabolism were downregulated (Fig. 1A). To further investigate the role of pyrimidine metabolism in prostate cancer progression, we performed GSEA analysis, which confirmed significant enrichment of the pyrimidine metabolism pathway in the C1 subgroup (Fig. 1B). Building upon these findings, we selected 13 genes associated with pyrimidine de novo biosynthesis from the GeneCards database, including CAD, CMPK1, CTPS1, CTPS2, DTYMK, DUT, NME1, NME1-NME2, NME2, NTPCR, RRM1, RRM2, and TYMS, to classify prostate cancer into additional subgroups. This classification yielded two distinct groups, P1 and P2 (Fig. 1C). PCA demonstrated clear separation between the P1 and P2 subgroups (Fig. 1D). Notably, the expression of pyrimidine metabolism-related genes was significantly higher in the P2 subgroup, leading us to define P2 as the subgroup with enhanced pyrimidine synthesis activity (Fig. 1E). This stratification underscores the potential metabolic differences between prostate cancer subtypes and highlights the relevance of pyrimidine metabolism in the progression of the disease. Fig. 1. Pyrimidine metabolism subtype classification and GSVA enrichment analysis in prostate cancer.
(A) Heatmap of KEGG metabolic pathway enrichment results using GSVA in C1 and C2 subgroups of prostate cancer. (B) GSEA results highlighting significant enrichment of the pyrimidine metabolism pathway in the C1 subgroup. (C) Consensus clustering analysis of prostate cancer samples based on the expression of 13 pyrimidine synthesis-related genes (CAD, CMPK1, CTPS1, CTPS2, DTYMK, DUT, NME1, NME1-NME2, NME2, NTPCR, RRM1, RRM2, TYMS), identifying two subgroups: P1 and P2. (D) PCA showing clear separation of P1 and P2 subgroups. (E) Boxplot of gene expression levels for pyrimidine metabolism-related genes, demonstrating higher expression in the P2 subgroup. Furthermore, we analyzed the differences in pyrimidine metabolism ssGSEA scores across various clinical and pathological parameters (Supplementary Fig. 1). We found that ssGSEA scores for pyrimidine metabolism were significantly higher in prostate cancer patients aged over 60 compared to those under 60 (Supplementary Fig. 1A). In terms of TNM staging, we observed significant differences in ssGSEA scores (Supplementary Fig. 1B–D), with T4-stage patients showing significantly higher scores than T2-stage patients. Additionally, patients with high Primary Gleason Grade and Secondary Gleason Grade patterns exhibited significantly higher scores compared to those with low-grade patterns (Supplementary Fig. 1E and F). These findings provide further evidence of the involvement of pyrimidine metabolism in prostate cancer progression and its association with clinical parameters. Drug sensitivity evaluation of pyrimidine metabolism subtypes in prostate cancer To evaluate the clinical differences between the P1 and P2 subtypes, we visualized the clinical characteristics of both subgroups using a heatmap (Fig. 2A).
The results revealed that the proportion of prostate cancer cases with high androgen was significantly higher in the P2 subgroup compared to the P1 subgroup, indicating a potential link between androgen signaling and pyrimidine metabolism. Additionally, the P2 subgroup exhibited a significantly higher proportion of patients with regional lymph node metastasis, suggesting a more aggressive disease phenotype in this group. Furthermore, the proportion of patients with a secondary Gleason pattern of 4 or 5 was also notably higher in the P2 subgroup, further supporting its association with a more advanced disease stage. The correlation between pyrimidine metabolism-related genes, including CAD, CMPK1, CTPS1, CTPS2, DTYMK, DUT, NME1, NME1-NME2, NME2, NTPCR, RRM1, RRM2, and TYMS, was visualized using a circos plot to illustrate their interactions in prostate cancer (Fig. 2B). To assess drug sensitivity between the two subgroups, we evaluated IC50 values for several drugs. The analysis revealed that drugs such as BI-2536, JW-7-24-1, MP470, PAC-1, QL-X-138, TL-2-105, WZ3105, XMD14-99, and YM201636 had significantly lower IC50 values in the P2 subgroup compared to the P1 subgroup, indicating greater sensitivity to these treatments in the P2 group (Fig. 2C). Fig. 2. Drug sensitivity and correlation analysis in pyrimidine metabolism subtypes. (A) Heatmap of clinical characteristics across P1 and P2 subgroups. (B) Circular plot showing the correlation between the 13 pyrimidine metabolism-related genes in prostate cancer. (C) Boxplot of IC50 values for drugs with significant sensitivity differences between P1 and P2 subgroups. Furthermore, we observed significant differences in androgen receptor expression levels between the P1 and P2 subgroups, as shown in Supplementary Fig. 2A. The AR expression level in the P2 subgroup was significantly higher than that in the P1 subgroup.
Additionally, ssGSEA analysis revealed that the “Regulation of Androgen Receptor Activity” score was significantly higher in the P2 subgroup compared to the P1 subgroup (Supplementary Fig. 2B). These findings suggest that the P2 subgroup may exhibit enhanced androgen receptor activity, contributing to its distinct molecular characteristics. Additionally, a nomogram was developed based on the expression levels of pyrimidine synthesis-related genes to predict prostate cancer prognosis (Supplementary Fig. 3A), with a calibration curve to assess the accuracy of the model (Supplementary Fig. 3B). To further evaluate the model, we used transcriptomic data of patients with biochemical recurrence (BCR) from the GSE70769 dataset to construct an additional nomogram (Supplementary Fig. 3C) and calibration curve (Supplementary Fig. 3D). Tumor mutation burden and immune infiltration characteristics of pyrimidine metabolism subtypes To explore the tumor mutation burden (TMB) phenotypes of the P1 and P2 subgroups, we analyzed the mutational profiles of both groups. Total TMB analysis revealed that the P2 subgroup had a significantly higher mutational burden compared to the P1 subgroup (Fig. 3A). Among the frequently mutated genes in prostate cancer, TP53, SPOP, and TTN were found to have higher mutation rates (Fig. 3B). The correlation between mutation frequencies in different genes was visualized using a heatmap (Fig. 3C), and the waterfall plots illustrated the mutational landscape in both subgroups, highlighting the most frequently mutated genes in each (Fig. 3D). Fig. 3. Tumor mutation burden and immune infiltration characteristics of pyrimidine metabolism subtypes. (A) Boxplot comparing total TMB between P1 and P2 subgroups, showing significantly higher TMB in the P2 subgroup. (B) Cloud plot displaying mutation frequency of high-frequency mutated genes in prostate cancer.
(C) Correlation heatmap illustrating the co-mutation patterns of genes in prostate cancer. (D) Waterfall plots representing the mutation profiles of P1 and P2 subgroups. (E) Boxplots showing the comparison of Stromal Score, Immune Score, and ESTIMATE Score between P1 and P2 subgroups. (F) CIBERSORT analysis of immune cell infiltration, showing significant differences in immune cell types. (G) Heatmap of the correlation between pyrimidine metabolism-related genes and immune cell infiltration levels in prostate cancer tissues. Immune infiltration characteristics were evaluated using the ESTIMATE algorithm, which revealed that the Stromal Score, Immune Score, and ESTIMATE Score were significantly higher in the P1 subgroup compared to the P2 subgroup (Fig. 3E), indicating greater immune and stromal cell infiltration in the P1 group. Further analysis using the CIBERSORT algorithm to assess the infiltration levels of 22 immune cell types in tumor tissues demonstrated that mast cells (resting) were significantly more abundant in the P1 subgroup, whereas activated CD4+ T cells and M2 macrophages were more prevalent in the P2 subgroup (Fig. 3F). The correlation between pyrimidine synthesis-related genes and the infiltration levels of various immune cell types was visualized through a heatmap (Fig. 3G), providing insights into the immune landscape of prostate cancer in relation to pyrimidine metabolism. The integration of these findings highlights the distinct molecular and immune features of the P1 and P2 subgroups, with the P2 subgroup showing a more aggressive clinical phenotype, higher mutational burden, and altered immune infiltration, particularly in relation to pyrimidine metabolism. Differential gene expression, WGCNA analysis, and network pharmacology To investigate the molecular differences between the P1 and P2 subgroups, we conducted a differential gene expression analysis between the two groups.
Using a threshold of |LogFC| > 0.8 and P < 0.05, the results identified 319 differentially expressed genes in the P2 subgroup compared to the P1 subgroup, with 123 genes upregulated and 196 genes downregulated (Fig. 4A, B). This differential expression highlights distinct gene regulation patterns between the subgroups, potentially contributing to their differing clinical and biological characteristics. To further explore the co-regulatory networks within these subgroups, we performed WGCNA. The scale-free fit index was set at 0.9, and we selected a soft threshold of 10 for subsequent analyses (Fig. 4C). Based on hierarchical clustering, the genes were grouped into distinct modules (Fig. 4D). The module-trait correlation heatmap revealed that the MEpink module had a strong positive correlation with the P2 subgroup phenotype, with a significant P-value of 1 × 10⁻⁸ (Fig. 4E). This suggests that genes within the MEpink module are closely associated with the molecular characteristics of the P2 subgroup. To narrow down key genes involved in the P2 phenotype, we intersected the differentially expressed genes with the MEpink module gene set from the WGCNA analysis. This resulted in the identification of 134 intersecting genes (Fig. 4F). These intersecting genes represent candidates that may play critical roles in driving the distinct molecular features observed in the P2 subgroup, providing potential targets for further investigation into their functional roles in prostate cancer progression. Fig. 4. Differentially expressed genes and WGCNA analysis between P1 and P2 subgroups. (A) Volcano plot showing DEGs between P1 and P2 subgroups. (B) Heatmap of DEGs across P1 and P2 subgroups. (C) Plot of the soft thresholding power and scale-free topology fitting index used for WGCNA, showing a soft-threshold power of 10 for downstream analysis. (D) Dendrogram of genes clustered into different modules based on WGCNA analysis.
(E) Module-trait heatmap illustrating the significant positive correlation between the MEpink module and the P2 subgroup phenotype. (F) Venn diagram showing the intersection of DEGs and MEpink module genes. In addition, to explore the relationship between pyrimidine metabolism-related genes and androgen receptor inhibitors, we included three clinically relevant AR inhibitors (Apalutamide, Enzalutamide, and Flutamide) in our analysis. Using SwissTargetPrediction, we identified the potential targets of these AR inhibitors and intersected them with the 134 pyrimidine metabolism-related genes. This analysis resulted in a subset of overlapping genes (Supplementary Fig. 4A). We further constructed a drug-target network to visualize the relationships between these AR inhibitors and their potential targets (Supplementary Fig. 4B). This approach highlights possible synergistic effects between AR inhibitors and pyrimidine metabolism-related pathways, offering insights into potential therapeutic strategies. GO and KEGG pathway enrichment results To investigate the biological functions and signaling pathways associated with the intersecting genes from DEGs and WGCNA modules, we performed GO and KEGG enrichment analyses. The GO analysis revealed that, in the BP category, the intersecting genes were primarily involved in chromosome segregation, nuclear division, and organelle fission (Fig. 5A). In the CC category, these genes were mainly localized to the chromosome, centromeric region, spindle, and chromosomal region (Fig. 5A). The MF analysis highlighted enrichment in functions such as microtubule binding, tubulin binding, and microtubule motor activity (Fig. 5A). KEGG pathway enrichment analysis further revealed that the intersecting genes were predominantly associated with key regulatory pathways, including Cell Cycle, Oocyte Meiosis, and Progesterone-mediated Oocyte Maturation (Fig. 5B).
These findings suggest that the genes shared between the DEGs and WGCNA modules are closely linked to essential processes of cellular division and replication, particularly through their roles in chromosome dynamics and cytoskeletal interactions. The involvement of pathways such as the cell cycle and oocyte meiosis indicates that these genes may be crucial regulators of cell proliferation, highlighting their potential importance in prostate cancer progression.

Fig. 5. Functional enrichment analysis of common DEGs and WGCNA module genes. (A) GO analysis bar plot showing the enrichment of biological processes, cellular components, and molecular functions. (B) KEGG analysis bubble plot illustrating the involvement of common genes in various signaling pathways.

Feature gene identification through machine learning

To further identify key genes associated with disease progression in the P1 and P2 subgroups, we conducted Lasso, Random Forest (RF), and SVM-RFE machine learning analyses. First, Lasso regression was performed on the intersecting genes from the DEGs and WGCNA modules, identifying 15 candidate genes (Fig. 6A). Next, RF analysis was applied to rank the genes by importance score, and we selected the top 11 genes with importance scores greater than 5 (Fig. 6B). Finally, SVM-RFE analysis was performed on the top 30 genes, and we selected 3 genes based on the lowest error rate and highest accuracy (Fig. 6C). Taking the intersection of the key genes identified by these three machine learning approaches, we found that RRM2 was the only gene common to all methods (Fig. 6D). Notably, RRM2 was significantly overexpressed in the P2 subgroup compared to the P1 subgroup (Fig. 6E).
This finding was further corroborated by analysis of the GEPIA database, which revealed that RRM2 is significantly upregulated in prostate cancer tissues compared to normal prostate tissues (Fig. 6F). Moreover, disease-free survival (DFS) analysis demonstrated that patients with high RRM2 expression had worse clinical outcomes than those with low RRM2 expression (Fig. 6G). These observations underscore the prognostic significance of RRM2 in prostate cancer. To elucidate the potential biological role of RRM2, we conducted single-gene GSEA, which indicated that RRM2 is primarily involved in the cell cycle pathway and is positively correlated with its activation in prostate cancer (Fig. 6H). This suggests that RRM2 may play a critical role in promoting cancer progression by driving cell cycle dysregulation, thereby contributing to the aggressive phenotype observed in the P2 subgroup. The consistently high expression of RRM2 across different analyses and its association with poor prognosis highlight its potential as a key therapeutic target in prostate cancer.

Fig. 6. Identification of key genes in prostate cancer using machine learning. (A) Lasso regression analysis of DEGs and WGCNA module genes, selecting 15 key genes. (B) Random forest (RF) analysis showing the top 20 genes ranked by importance score. (C) SVM-RFE analysis selecting 3 genes based on minimum error and maximum accuracy. (D) Venn diagram showing the intersection of key genes identified by Lasso, RF, and SVM-RFE, with RRM2 being the common gene across all methods. (E) Boxplot showing RRM2 expression levels in P1 and P2 subgroups. (F) GEPIA database analysis showing higher expression of RRM2 in prostate cancer tissues compared to normal prostate tissues. (G) Disease-free survival (DFS) curve indicating that high RRM2 expression is associated with worse prognosis in prostate cancer patients. (H) Single-gene GSEA results for RRM2.
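The consensus step described in this section, keeping only genes selected by Lasso, RF, and SVM-RFE alike, amounts to a set intersection. The following is a minimal Python sketch of that step, not the authors' code: the function name `consensus_genes` is hypothetical, and every gene symbol other than RRM2 is a placeholder, since the individual selector outputs are not listed in the text.

```python
def consensus_genes(selections):
    """Return the genes picked by every selector, sorted for stable output.

    `selections` maps a selector name to the genes it retained.
    """
    gene_sets = [set(genes) for genes in selections.values()]
    if not gene_sets:
        return []
    return sorted(set.intersection(*gene_sets))


# Placeholder selector outputs (assumption): only RRM2 is taken from the text;
# GENE_A..GENE_D are made-up names used purely to illustrate the overlap.
selections = {
    "lasso": ["RRM2", "GENE_A", "GENE_B"],
    "random_forest": ["RRM2", "GENE_B", "GENE_C"],
    "svm_rfe": ["RRM2", "GENE_C", "GENE_D"],
}

print(consensus_genes(selections))  # ['RRM2']
```

Requiring agreement across all three selectors is a deliberately strict criterion: it tends to leave very few genes but reduces dependence on the biases of any single method.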
Single-cell analysis in prostate cancer

To further investigate the relationships between different cell types and their metabolic activity within prostate cancer tissues, we analyzed single-cell transcriptomic data from AR+ prostate cancer samples. After quality control and normalization, dimensionality reduction and clustering identified distinct cell populations within the tissue, including cancer cells, basal cells, luminal epithelial cells, luminal cells, endothelial cells, and smooth muscle cells, as visualized by UMAP (Fig. 7A). The marker genes used to annotate these cell types are displayed in a bubble plot (Fig. 7B), which shows clear distinctions between the cell clusters. Notably, UMAP visualization revealed that RRM2 expression was predominantly localized to prostate cancer cells (Fig. 7C), suggesting a key role in the oncogenic processes of these cells. We also assessed the activity of specific signaling pathways across cell populations. UMAP plots of the AUC scores for the Regulation of Androgen Receptor Activity and Pyrimidine Metabolism pathways revealed differential pathway activation across the cell types (Fig. 7D and E). This analysis provides insights into how androgen receptor signaling and pyrimidine metabolism may drive distinct biological processes in prostate cancer. The interaction network among different cell types within the prostate cancer tissue is illustrated in Fig. 7F, highlighting complex intercellular communications that may contribute to tumor progression and microenvironment regulation. Furthermore, we visualized the metabolic activity of the cell types using a bubble plot (Fig. 7G), which showed that pyrimidine metabolism was most active in basal cells, luminal epithelial cells, and luminal cells. In contrast, purine metabolism was notably elevated in luminal cells, while porphyrin and chlorophyll metabolism was more prominent in cancer cells.
These findings underscore the metabolic heterogeneity within prostate cancer tissues and suggest that different cell populations may engage in distinct metabolic pathways, potentially influencing tumor behavior and therapeutic response. This single-cell analysis highlights the importance of cellular and metabolic diversity in shaping the tumor microenvironment and points to potential avenues for targeted therapies based on metabolic vulnerabilities in prostate cancer.

Fig. 7. Single-cell analysis of pyrimidine metabolism and cell-cell interactions in prostate cancer. (A) UMAP plot showing the clustering of different cell types in AR+ prostate cancer tissues, including cancer cells, basal cells, luminal epithelial cells, luminal cells, endothelial cells, and smooth muscle cells. (B) Bubble plot showing the expression of marker genes used for cell type annotation. (C) UMAP plot showing the expression of RRM2. (D) UMAP plot of AUC scores for the Regulation of Androgen Receptor Activity signaling pathway across different cell types. (E) UMAP plot of AUC scores for the Pyrimidine Metabolism pathway across different cell types. (F) Cell-cell interaction network illustrating the relationships between different cell types in prostate cancer tissues. (G) Bubble plot showing the metabolic activity levels in various cell types.

Discussion

Androgen receptor-positive prostate cancer remains one of the most challenging malignancies to treat, particularly in advanced stages^34. Androgen deprivation therapy is the cornerstone of treatment for advanced prostate cancer, aiming to lower testosterone levels and inhibit tumor growth^35. While initially effective, nearly all patients eventually progress to CRPC, which is characterized by continued disease progression despite low androgen levels^36.
The emergence of CRPC complicates treatment, as tumors adapt to low androgen environments through mechanisms such as AR mutations, overexpression, and alternative signaling pathways^37,38. Despite the development of second-generation AR-targeted therapies, the disease remains incurable in advanced stages, and the combination of ADT with immunotherapy and other modalities is still under exploration^39,40.

Pyrimidine metabolism plays a significant role in tumor progression, as rapidly proliferating cancer cells require increased nucleotide synthesis to support DNA and RNA production^41–43. Our study demonstrated that pyrimidine metabolism is significantly upregulated in the P2 subgroup of prostate cancer, which exhibits higher androgen activity and metabolic reprogramming. The findings suggest that cells in the P2 subgroup are more reliant on pyrimidine synthesis, potentially offering a metabolic vulnerability that could be therapeutically targeted. The analysis of single-cell data further revealed that specific cell populations, such as basal and luminal epithelial cells, show higher activity in pyrimidine metabolism. These results indicate that targeting pyrimidine metabolism in specific cellular contexts could disrupt the nucleotide supply essential for cancer cell growth, providing a rationale for developing metabolic inhibitors as a treatment strategy for prostate cancer.

Previous studies have highlighted the importance of metabolic reprogramming in cancer, particularly in the context of nucleotide biosynthesis^44–46. Our findings align with these studies, reinforcing the notion that cancer cells undergo profound metabolic changes to sustain their rapid proliferation. In prostate cancer, upregulated pyrimidine metabolism has been associated with aggressive tumor behavior and resistance to treatment^47.
High androgen levels are known to drive tumor progression through androgen receptor signaling, and our data suggest that this signaling may concurrently upregulate pyrimidine metabolism to meet the increased demand for nucleotide synthesis in rapidly proliferating cancer cells. This metabolic reprogramming likely supports the survival and growth of cancer cells under treatment pressure, where ADT alone is insufficient to suppress tumor progression. Moreover, the observed upregulation of pyrimidine metabolism under high androgen conditions may also facilitate DNA repair and replication processes, contributing to the tumor’s ability to evade treatment and develop resistance. These findings underscore the importance of further exploring the interplay between androgen signaling and pyrimidine metabolism in prostate cancer, as targeting this metabolic vulnerability could provide a novel therapeutic avenue for overcoming resistance in advanced stages of the disease.

Although pyrimidine metabolism appears to be a key pathway, it is important to consider the broader metabolic landscape in prostate cancer. Other metabolic pathways, such as purine metabolism, glycolysis, and lipid metabolism, may also play significant roles in driving cancer progression and could interact with pyrimidine metabolism to support tumor growth and adaptation. These pathways may act in concert with pyrimidine metabolism or serve as compensatory mechanisms under metabolic stress. For instance, the interplay between androgen signaling and lipid metabolism has been implicated in CRPC progression, further illustrating the complexity of metabolic reprogramming in prostate cancer. Future studies should investigate how pyrimidine metabolism interacts with these alternative metabolic pathways to develop a more comprehensive understanding of the metabolic vulnerabilities in prostate cancer.
The significance of this study lies in its pioneering systematic evaluation of pyrimidine metabolism in androgen-associated prostate cancer. Through transcriptomic and single-cell analyses, the research reveals the relationship between pyrimidine metabolism, the immune microenvironment, drug sensitivity, and tumor mutation burden. These findings provide a theoretical foundation for future prostate cancer therapies targeting pyrimidine metabolism, particularly in addressing CRPC and treatment resistance. However, the study is limited by its reliance on bioinformatics analysis alone, without validation in experimental models or large-scale clinical trials. To fully realize the therapeutic potential of targeting pyrimidine metabolism, further research incorporating in vitro and in vivo experiments, as well as clinical studies, is essential.

Conclusion

In conclusion, this study provides a comprehensive analysis of pyrimidine metabolism in androgen-associated prostate cancer, demonstrating its significant role in tumor progression, immune microenvironment modulation, and drug sensitivity. By integrating transcriptomic and single-cell sequencing data, we identified distinct metabolic profiles that correlate with therapeutic resistance and tumor mutation burden, offering potential avenues for targeted therapies. These insights contribute to a deeper understanding of metabolic reprogramming in prostate cancer and suggest that targeting pyrimidine metabolism could be a promising strategy for overcoming castration-resistant prostate cancer and enhancing treatment efficacy.

Electronic supplementary material

Below is the link to the electronic supplementary material. Supplementary Material 1 (527.3 KB, docx)

Author contributions

Liang Huang, Yu Xie, Shusuan Jiang, Kan Liu and Zhihao Ming conceived the article and wrote the manuscript. Hong Shan reviewed and integrated it with additional data and references. All the remaining authors