Abstract Background Ethylnitrosourea (ENU) is a potent mutagen that induces gliomas in experimental models. Understanding the molecular mechanisms underlying ENU-induced gliomagenesis can provide insights into glioma pathogenesis and potential therapeutic targets. Methods We analyzed gene expression data from [36]GSE16011 and [37]GSE4290 datasets to identify differentially expressed genes (DEGs) associated with gliomagenesis. Comparative Toxicogenomics Database (CTD) was used to identify potential ENU targets. Protein-protein interaction (PPI) network, enrichment analysis, and Cox regression analysis were employed to elucidate key genes and pathways. A risk model was constructed using the TCGA dataset by LASSO analysis, and nomogram and immuno-infiltration analyses were performed. Results We identified 71 common genes potentially in ENU-induced gliomas. Key hub genes, including TP53, MCL1, CCND1, and PTEN, were highlighted in the PPI network. Enrichment analysis revealed significant GO terms and KEGG pathways, such as “Neuroactive ligand-receptor interaction” and “Glioma.” A risk model based on 11 prognostic genes was constructed, effectively stratifying patients into low and high-risk groups, with significant differences in overall survival. The model demonstrated high predictive accuracy. The nomogram constructed from ENU-related risk scores showed good calibration and clinical utility. Immuno-infiltration analysis indicated higher immune cell infiltration in high-risk patients. Molecular docking suggested strong binding affinities of ENU with MGMT and CA12. Conclusion Our integrative analysis identified key genes and pathways implicated in ENU-induced gliomagenesis. The ENU-related risk model and nomogram provide significant prognostic value, offering potential tools for clinical assessment and targeted therapies in glioma patients. Supplementary Information The online version contains supplementary material available at 10.1186/s40360-025-00919-x. Keywords: Gliomagenesis, N-ethyl-N-nitrosourea, Immune cell infiltration, Bioinformatics, Molecular docking, Prognostic model Introduction Gliomas are among the most aggressive and deadly types of brain tumors, posing significant challenges to patients and healthcare systems worldwide [[38]1, [39]2]. Despite extensive research, the precise molecular mechanisms underlying gliomagenesis remain elusive, hindering the development of effective therapeutic strategies [[40]3, [41]4]. Ethylnitrosourea (ENU) is a potent alkylating agent widely used in experimental models to induce gliomas. ENU has been extensively studied for its ability to cause DNA damage and mutations, leading to the formation of gliomas that closely mimic human glioma development [[42]5]. Understanding how ENU induces gliomagenesis can offer crucial insights into the pathogenesis of gliomas and potentially unveil novel therapeutic targets. Bioinformatics techniques have emerged as powerful tools in disease prevention and diagnosis by enabling comprehensive analysis of large-scale genomic datasets. These methods facilitate the identification of differentially expressed genes (DEGs), key pathways, and potential biomarkers critical for understanding disease mechanisms and developing targeted therapies [[43]6–[44]8]. Previous studies have demonstrated that ENU exposure leads to the formation of gliomas in animal models, closely mimicking many aspects of human glioma development [[45]9, [46]10]. Research has primarily focused on identifying genetic mutations and alterations in signaling pathways that contribute to tumorigenesis. However, the comprehensive identification of differentially expressed genes (DEGs) and their functional roles in ENU-induced gliomas has been limited. Moreover, research leveraging extensive transcriptomic datasets and bioinformatics methodologies to develop prognostic prediction algorithms for gliomas demonstrates a broad spectrum of potential applications [[47]11]. Bioinformatics techniques such as protein-protein interaction (PPI) network analysis, enrichment analysis, molecular docking, and Cox regression analysis have shown promise in elucidating key genes and pathways involved in disease progression [[48]12–[49]14]. By leveraging these methods, researchers can gain deeper insights into the molecular mechanisms of ENU-induced gliomagenesis and develop more accurate prognostic models. This integrative approach not only enhances our understanding of the underlying biology but also provides valuable tools for clinical applications, such as risk stratification and targeted therapy development. The motivation behind this study stems from the need to fill the gaps in our understanding of the molecular mechanisms by which ENU induces gliomas. By leveraging high-throughput gene expression data and advanced bioinformatics analyses, this study aims to identify key genes and pathways involved in ENU-induced gliomagenesis. Additionally, constructing a robust prognostic model based on these findings could significantly enhance the clinical management of glioma patients by providing more accurate risk stratification and identifying potential therapeutic targets. Our study utilized a multi-step approach. Gene expression data from the [50]GSE16011 and [51]GSE4290 datasets were analyzed to identify DEGs. Comparative Toxicogenomics Database (CTD) and other public datasets were used to pinpoint potential ENU targets. A PPI network was constructed to highlight key hub genes, followed by enrichment analysis to identify significant biological processes and pathways. A prognostic risk model was developed using LASSO analysis on the TCGA dataset, and its predictive accuracy was assessed. A nomogram based on ENU-related risk scores was constructed and validated. Immuno-infiltration analysis was performed to explore the immune landscape in glioma patients, and molecular docking studies evaluated the interactions between ENU and key proteins. Figure [52]1 presents the flowchart of our study. By addressing the current gaps in glioma research, this study contributes to the development of more effective diagnostic and therapeutic strategies, ultimately improving patient outcomes. Fig. 1. [53]Fig. 1 [54]Open in a new tab Flowchart of the study Methods Data collection and preprocessing To investigate the molecular mechanisms underlying glioma, we analyzed two publicly available gene expression datasets: [55]GSE16011 and [56]GSE4290. The [57]GSE16011 dataset comprises RNA-seq data from 276 glioma samples alongside 8 samples of normal brain tissue, whereas the [58]GSE4290 dataset contains 157 glioma samples and 23 normal brain tissue samples. Both datasets’ raw data were obtained from the Gene Expression Omnibus (GEO) database ([59]https://www.ncbi.nlm.nih.gov/geo/). Following this, the raw matrix files were processed and standardized using the Affy software package. Differential gene expression analysis was carried out on each dataset using the limma R package, with a significance threshold of an adjusted p-value < 0.05. The Benjamini-Hochberg method was employed for multiple testing corrections to control the false discovery rate. Genes meeting these criteria were considered differentially expressed genes (DEGs). Identification of potential targets linked to ENU-induced gliomas To identify potential molecular targets involved in ENU-induced gliomagenesis, we utilized the Comparative Toxicogenomics Database (CTD) ([60]https://ctdbase.org/) and cross-referenced the identified DEGs from both [61]GSE16011 and [62]GSE4290 datasets with ENU-related targets available in CTD. The overlap of DEGs from both datasets with ENU-specific targets was visualized using a Venn diagram to highlight genes common between the datasets and ENU exposure. Protein-protein interaction (PPI) network construction Following the identification of common genes linked to ENU-induced glioma, we constructed a PPI network using the STRING database ([63]https://cn.string-db.org/). The PPI network was visualized with Cytoscape software to highlight key hub genes based on degree centrality. These hub genes were then analyzed for their biological significance, with a focus on their regulatory roles in gliomagenesis. Functional and pathway enrichment analysis To further explore the biological processes and pathways involved in ENU-induced glioma, Gene Ontology (GO) enrichment analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis were performed on the 71 common genes. GO enrichment analysis categorized the genes into biological processes (BP), cellular components (CC), and molecular functions (MF), while KEGG pathway analysis identified relevant biological pathways. For the enrichment analysis, the same Benjamini-Hochberg method was applied to adjust p-values, ensuring the robustness of the results. Enrichment analysis was conducted using the clusterProfiler R package, with a significance threshold set at an adjusted p-value < 0.05. Construction of ENU-related risk model Using data from The Cancer Genome Atlas (TCGA), we performed univariate Cox regression analysis on the 71 ENU-related genes to identify prognostic genes associated with overall survival in glioma patients. To address multiple testing corrections, we applied the Benjamini-Hochberg method to control the false discovery rate. A LASSO regression model was then applied to these genes to construct a risk model, with the optimal tuning parameter (lambda) determined using 10-fold cross-validation to prevent overfitting. The risk model was used to stratify patients into high and low-risk groups based on their risk scores, which were calculated as the linear combination of the expression levels of the prognostic genes weighted by their corresponding coefficients from the LASSO model. The performance of the risk model was assessed using Kaplan-Meier survival analysis and the area under the curve (AUC) for 1-year, 3-year, and 6-year survival predictions. Furthermore, we assessed the prognostic and diagnostic efficacy of this risk-scoring model in glioma patients by utilizing the CGGA-325 dataset, which comprises 325 glioma patients, as well as the [64]GSE83300 dataset, which includes 50 glioma patients. Statistical significance was defined as a p-value < 0.05 for all analyses. Nomogram construction and evaluation To provide a visual tool for predicting patient outcomes, a nomogram was constructed based on the ENU-related risk scores. The nomogram was designed to predict 1-year, 3-year, and 6-year survival probabilities. Calibration plots were generated to evaluate the accuracy of the nomogram, and decision curve analysis (DCA) was conducted to assess its clinical utility. Analysis of immune cell infiltration The GSVA tool was utilized to conduct ssGSEA to evaluate the distribution of 24 distinct immune cell subsets across groups characterized by low and high risk scores. The ESTIMATE algorithm was employed to assess the ESTIMATE Score, Immune Score, and Stromal Score. Spearman’s rank correlation analysis was performed using the ggplot2 package. All bioinformatics analyses were conducted using the Xiantao Academic platform ([65]https://www.xiantaozi.com/), which offers a comprehensive suite of tools for data processing and analysis [[66]15]. Molecular docking analysis Molecular docking analysis was performed to investigate the binding affinities between ENU and the identified key risk genes. The chemical structure of ENU was sourced from the PubChem database ([67]https://pubchem.ncbi.nlm.nih.gov/), while the target protein structures were retrieved from the Protein Data Bank (PDB) ([68]https://www.rcsb.org). Subsequently, molecular docking was conducted utilizing AutoDock Vina. The binding interactions were analyzed based on the calculated Vina scores and the specific residues involved in the binding process. Molecular dynamics simulation (MD) MD was conducted utilizing the Gromacs 2022 software, employing the GAFF force field tailored for small molecules, alongside the AMBER14SB force field and the TIP3P water model for protein representation. The simulation system for the complex was established by integrating files corresponding to both proteins and small molecule ligands. The simulations were carried out under conditions of constant temperature and pressure, incorporating periodic boundary conditions. Throughout the MD simulations, all hydrogen bonds were constrained utilizing the LINCS algorithm, with a time integration step set at 2 femtoseconds. Electrostatic interactions were determined via the Particle-Mesh Ewald (PME) method, with a cutoff distance established at 1.2 nanometers. The cutoff for non-bonded interactions was designated at 10 Å and was refreshed every ten steps. The V-rescale method was employed for temperature coupling, maintaining the simulated temperature at 298 K, while the Berendsen method was applied to regulate the pressure at 1 bar. At a temperature of 298 K, 100 picoseconds of NVT and NPT equilibrium simulations were performed, followed by 100 nanoseconds of MD simulations on the complex system, with conformational data recorded every 10 picoseconds. Upon completion of the simulation, the resultant trajectories were analyzed utilizing VMD and PyMOL software. Results Characterization of molecular targets linked to ENU-induced gliomas In the [69]GSE16011 dataset, 4306 genes were found to be significantly downregulated, while 3684 genes were significantly upregulated (Fig. [70]2A). Similarly, in the [71]GSE4290 dataset, 4078 genes were significantly downregulated, and 5921 genes were significantly upregulated (Fig. [72]2B). To identify potential targets of ENU, we utilized the Comparative Toxicogenomics Database. The overlap between DEGs from [73]GSE16011 and [74]GSE4290 datasets with ENU targets was visualized using a Venn diagram, revealing 71 common genes potentially implicated in ENU-induced glioma (Fig. [75]2C). The significance of constructing the PPI network lies in its ability to identify key interactions and regulatory relationships among these genes. The 71 ENU-induced glioma targets were further subjected to PPI analysis, as shown in Fig. [76]2D. The PPI network highlights key hub genes, including TP53, MCL1, CCND1, and PTEN, which exhibit central roles in the network and suggest critical regulatory functions in the context of ENU-induced gliomagenesis. This PPI network provides insights into potential molecular mechanisms and interactions that could be targeted for therapeutic intervention. In summary, our integrative analysis of DEGs from glioma datasets and ENU targets has identified 71 genes that may play a significant role in ENU-induced glioma pathogenesis. Fig. 2. [77]Fig. 2 [78]Open in a new tab Identification and analysis of DEGs and ENU-induced glioma targets. DEGs were identified from glioma datasets [79]GSE16011 (A) and [80]GSE4290 (B). The x-axis denotes log2-transformed fold changes and the y-axis denotes the negative log10-transformed adjusted p-values (Padj). Red dots represent upregulated genes, blue dots indicate downregulated genes and grey dots denote non-significant genes. (C) Venn diagram illustrating the overlap between glioma DEGs and ENU targets, identifying 71 common genes implicated in ENU-induced glioma. (D) PPI network of the 71 ENU-induced glioma targets. Red nodes indicate hub genes and yellow nodes represent other interacting genes Enrichment analysis of ENU-related toxicity targets Figure [81]3 presents the enrichment analysis results of the 71 identified ENU-related toxicity targets, revealing significant biological processes, cellular components, molecular functions, and pathways likely implicated in ENU-induced glioma. The significance of the results presented in the enrichment analysis lies in their ability to elucidate the biological context and functional implications of the identified targets. Figure [82]3A shows the GO enrichment analysis categorized into Biological Process (BP), Cellular Component (CC), and Molecular Function (MF). In the BP category, the top enriched terms include “response to xenobiotic stimulus,” “one-carbon metabolic process,” and “regulation of cytosolic calcium ion concentration.” The CC category is significantly enriched for terms such as “chromosomal region,” “neuronal cell body,” and “nuclear chromosome.” In the MF category, key enriched terms include “carbonate dehydratase activity,” “p53 binding,” and “lyase activity.” Fig. [83]3B highlights the KEGG pathway enrichment analysis for the 71 ENU targets. The analysis reveals significant pathways such as “Neuroactive ligand-receptor interaction,” “MicroRNAs in cancer,” and “Sphingolipid signaling pathway.” Other notable pathways include “Nitrogen metabolism,” “Neurotrophin signaling pathway,” “Central carbon metabolism in cancer,” and specific cancer-related pathways such as “Glioma,” “Melanoma,” and “Endometrial cancer.” These enrichment analyses indicate that the ENU-related toxicity targets play crucial roles in various biological processes and pathways, thereby providing insights into the potential mechanisms underlying ENU-induced glioma development. Fig. 3. [84]Fig. 3 [85]Open in a new tab Enrichment analysis of 71 ENU-related toxicity targets. (A) Gene Ontology (GO) enrichment analysis is categorized into Biological Process (BP), Cellular Component (CC), and Molecular Function (MF). The x-axis represents -Log10(P.adj), indicating the significance of enrichment, with bars colored according to ontology categories (BP in blue, CC in red, MF in green). (B) KEGG pathway enrichment analysis, where the x-axis represents the GeneRatio and the y-axis lists the enriched pathways. Dot size indicates the gene count in each pathway, while color intensity reflects the adjusted p-value (P.adj), with blue representing lower and red representing higher p-values Identification of ENU-related risk model in glioma patients In the TCGA dataset, COX regression analysis was performed on the 71 ENU-related toxicity targets to identify prognosis-related genes. The regression analysis successfully identified 16 prognostic genes (Table [86]1). Based on the 16 identified prognostic genes, a risk model related to ENU exposure was constructed by LASSO regression analysis (Fig. [87]4A). Glioma patients were classified into low and high-risk groups according to their risk scores. The heatmap in Fig. [88]4B illustrates the expression profiles of these 11 genes in both risk groups, with distinct expression patterns observed between low and high-risk patients. Survival analysis demonstrated that patients in the high-risk group had significantly poorer overall survival compared to those in the low-risk group (Fig. [89]4C, p < 0.001). This indicates that the risk model can effectively stratify patients based on their prognosis. Furthermore, the area under the curve (AUC) values were 0.872 for 1-year, 0.907 for 3-year, and 0.848 for 6-year survival predictions (Fig. [90]4D). Furthermore, we validated the prognostic and diagnostic performance of this risk model in glioma patients using the CGGA-325 (Figure [91]S1) and [92]GSE83300 (Figure [93]S2) datasets. These results suggest that the risk model has high sensitivity and specificity in predicting the survival of glioma patients exposed to ENU. In summary, the constructed ENU-related risk model, based on 11 prognostic genes, provides significant predictive value for the survival of glioma patients. Table 1. The results of COX regression analysis based on TCGA-glioma dataset Characteristics Total(N) HR(95% CI) Univariate analysis P value Univariate analysis HR(95% CI) Multivariate analysis P value Multivariate analysis TOP2A 698 Low 349 Reference Reference High 349 4.080 (3.076–5.411) < 0.001 1.970 (1.041–3.730) 0.037 SMO 698 Low 348 Reference Reference High 350 3.412 (2.614–4.454) < 0.001 1.582 (1.113–2.247) 0.010 CASP6 698 Low 348 Reference Reference High 350 5.963 (4.459–7.975) < 0.001 1.713 (1.043–2.813) 0.034 PSMB9 698 Low 348 Reference Reference High 350 4.487 (3.416–5.895) < 0.001 1.678 (1.161–2.426) 0.006 GRIN1 698 Low 349 Reference Reference High 349 0.362 (0.280–0.468) < 0.001 0.619 (0.398–0.963) 0.034 MCL1 698 Low 348 Reference Reference High 350 1.871 (1.465–2.388) < 0.001 1.467 (1.070–2.012) 0.017 BCHE 698 Low 349 Reference Reference High 349 0.588 (0.462–0.750) < 0.001 0.716 (0.532–0.964) 0.028 CA12 698 Low 348 Reference Reference High 350 2.586 (2.013–3.323) < 0.001 1.821 (1.350–2.456) < 0.001 NTRK3 698 Low 349 Reference Reference High 349 0.383 (0.295–0.496) < 0.001 0.651 (0.474–0.895) 0.008 AURKB 698 Low 349 Reference Reference High 349 4.446 (3.355–5.891) < 0.001 1.766 (1.022–3.053) 0.042 CA3 698 Low 347 Reference Reference High 351 5.008 (3.791–6.615) < 0.001 1.521 (1.044–2.214) 0.029 PTGS1 698 Low 348 Reference Reference High 350 2.998 (2.309–3.891) < 0.001 1.704 (1.085–2.678) 0.021 ADORA1 698 Low 348 Reference Reference High 350 2.127 (1.665–2.717) < 0.001 1.387 (1.021–1.884) 0.037 MGMT 698 Low 348 Reference Reference High 350 1.775 (1.392–2.264) < 0.001 1.522 (1.158–2.000) 0.003 NFE2L2 698 Low 349 Reference Reference High 349 2.247 (1.748–2.890) < 0.001 2.153 (1.532–3.027) < 0.001 AVPR1B 698 Low 349 Reference Reference High 349 1.273 (1.002–1.617) 0.048 1.471 (1.133–1.909) 0.004 [94]Open in a new tab Fig. 4. [95]Fig. 4 [96]Open in a new tab Identification of ENU-related risk model in glioma. (A) LASSO regression analysis of 71 ENU-related toxicity targets identified 11 prognostic genes in the TCGA dataset. The x-axis represents Log(λ) and the y-axis shows partial likelihood deviance with error bars indicating standard errors. The dashed lines correspond to two selected λ values. (B) Construction of the ENU-related risk model for glioma patients using the 11 identified prognostic genes. The risk score distribution, patient survival status, and a heatmap of relevant gene expressions, with the x-axis representing patients sorted by risk score into low (blue) and high (red) risk groups, while the y-axis shows survival time. (C) Kaplan-Meier survival curves showing the overall survival of glioma patients in the high-risk and low-risk groups, with the x-axis representing time in days and the y-axis showing survival probability; the hazard ratio (HR) and p-value are indicated. (D) ROC curve analysis of the risk model demonstrates the predictive performance for 1-year, 3-year, and 6-year survival, with the x-axis denoting false-positive rate (1-specificity) and the y-axis denoting true-positive rate (sensitivity). The area under the curve (AUC) values for each time point are provided, demonstrating the predictive performance of the risk score model Construction and evaluation of the nomogram A nomogram model was constructed for glioma patients based on the ENU-related risk scores (Fig. [97]5A). The nomogram predicts 1-year, 3-year, and 6-year survival probabilities, offering a visual and quantitative tool for individualized patient prognosis. Calibration plots were used to evaluate the predictive performance of the constructed nomogram. As shown in Fig. [98]5B, the observed survival probabilities align closely with the nomogram-predicted survival probabilities for 1-year, 3-year, and 6-year outcomes, indicating good calibration and predictive accuracy. DCA was employed to assess the clinical utility of the nomogram by quantifying the net benefit at different threshold probabilities. Figure [99]5C and D, and [100]5E illustrate the DCA plots for 1-year, 3-year, and 6-year survival predictions, respectively. The results demonstrate that using the risk score-derived nomogram provides a higher net benefit than either treating all patients or treating no patients across a range of threshold probabilities. In summary, the nomogram constructed from ENU-related risk scores shows excellent predictive performance and has substantial clinical utility in predicting the survival outcomes of glioma patients. Fig. 5. [101]Fig. 5 [102]Open in a new tab Construction and evaluation of the nomogram for predicting survival in glioma. (A) Nomogram model based on ENU-related risk scores for predicting 1-year, 3-year, and 6-year survival probabilities in glioma patients, where the total points assigned for each risk score sum up to predict the survival probability. (B) Calibration plots show the agreement between nomogram-predicted survival probabilities and observed outcomes for 1-year, 3-year, and 6-year survival, where the x-axis represents the nomogram-predicted probabilities and the y-axis shows the observed fraction survival probabilities. The diagonal line represents perfect prediction, and the points with error bars represent the model performance. DCA assesses the net benefit of the nomogram model at different threshold probabilities for 1-year (C), 3-year (D), and 6-year (E) survival predictions. The x-axes indicate threshold probabilities, and the y-axes show net benefits Comparative analysis of risk scores among clinical subgroups As shown in Fig. [103]6A, risk scores were significantly higher in tumor tissues compared to normal tissues (p < 0.05), suggesting a strong association between elevated risk scores and glioma presence. Among different histological types, glioblastomas exhibited the highest risk scores compared to astrocytomas, oligodendrogliomas, and oligoastrocytomas (p < 0.01), indicating a possible correlation between higher risk scores and glioblastoma aggressiveness (Fig. [104]6B). When stratified by WHO grade, risk scores showed a stepwise increase from Grade 2 to Grade 3, and from Grade 3 to Grade 4 (p < 0.001 and p < 0.01 respectively), reflecting a positive association between risk scores and tumor grade (Fig. [105]6C). Analysis of IDH mutation status revealed significantly higher risk scores in tumors with wild-type IDH compared to those with mutated IDH (p < 0.001), consistent with the known poorer prognosis of IDH wild-type gliomas (Fig. [106]6D). The age-based comparison showed significantly higher risk scores in patients older than 60 years compared to those 60 years or younger (p < 0.001), suggesting that older age may be associated with higher risk scores (Fig. [107]6E). Finally, patients who were dead at the last follow-up had significantly higher risk scores compared to those who were alive (p < 0.001), highlighting the prognostic value of the risk scores for overall survival (Fig. [108]6F). These findings underscore the clinical relevance of ENU-related risk scores and their potential utility in stratifying glioma patients based on key clinical and pathological features. Fig. 6. [109]Fig. 6 [110]Open in a new tab Comparison of ENU-related risk scores across clinical subgroups. (A) Comparison of risk scores between normal and tumor tissues. (B) Stratification of risk scores among different histological types. (C) Risk scores by WHO grade, showing an increasing trend from Grade 2 to Grade 3 and from Grade 3 to Grade 4. (D) Comparison of risk scores between IDH wild-type (WT) and mutant (Mut) status. (E) Age-based comparison of risk scores. (F) Comparison of risk scores between patients who are alive versus dead at last follow-up. * for p < 0.05, ** for p < 0.01, and *** for p < 0.001, with all comparisons conducted using the Wilcoxon rank sum test. The y-axis for all panels represents the risk score Immuno-infiltration analysis based on ENU-related risk scores As shown in Fig. [111]7A, we observed that the high-risk group had significantly higher StromalScore, ImmuneScore, and ESTIMATEScore compared to the low-risk group (p < 0.001 for all comparisons). This indicates a higher level of stromal and immune cell infiltration in the high-risk group. The ssGSEA analysis further revealed distinct differences in the infiltration levels of various immune cell types between the high-risk and low-risk groups (Fig. [112]7B). Significant increases in scores were noted for activated dendritic cells (aDC), B cells, cytotoxic cells, dendritic cells (DC), eosinophils, immature dendritic cells (iDC), macrophages, neutrophils, NK CD56 dim cells, T helper cells, Tcm, Th17 cells, and Th2 cells in the high-risk group (p < 0.05 for all comparisons). Conversely, NK CD56 bright cells and plasmacytoid dendritic cells (pDC) showed lower infiltration in the high-risk group (p < 0.001). The immune infiltration analysis suggests that the high-risk group is characterized by an enriched immune microenvironment, which may influence tumor behavior and patient prognosis. The correlation heatmap illustrating the association of risk scores with the various immune cell types. A strong positive correlation was observed between the risk score and the majority of immune cells (Fig. [113]7C), reinforcing the observation of enhanced immune infiltration in high-risk patients. These results underscore the differences in the immune microenvironment between high-risk and low-risk glioma patients, with high-risk patients exhibiting significantly elevated levels of immune cell infiltration. Fig. 7. [114]Fig. 7 [115]Open in a new tab Evaluation of immune infiltration levels in high and low-risk glioma patients. (A) Box plots comparing StromalScore, ImmuneScore, and ESTIMATEScore between high-risk and low-risk groups. (B) Box plots comparing ssGSEA scores of various immune cell types between high-risk and low-risk groups. (C) Heatmap showing the correlation between risk scores and ssGSEA scores for different immune cell types. * for p < 0.05, ** for p < 0.01, and *** for p < 0.001, with all comparisons conducted using the Wilcoxon rank sum test Expression analysis of ENU-related risk model genes The expression levels of the 11 ENU-related risk model genes were analyzed in the TCGA dataset, comparing normal tissues and glioma tumors (Fig. [116]8). Expression levels of AVPR1B and MGMT were significantly reduced in tumor tissues compared to normal tissues, while levels of NFE2L2, ADORA1, AURKB, NTRK3, CA12, BCHE, CASP6, and SMO were markedly elevated in tumor tissues. The differential expression of these genes between normal and tumor tissues highlights their potential role in glioma pathogenesis and their association with ENU exposure. Furthermore, the expression levels of these genes were validated through independent datasets [117]GSE16011 (Figure [118]S3) and [119]GSE4290 (Figure [120]S4). Fig. 8. [121]Fig. 8 [122]Open in a new tab Expression analysis of ENU-related risk model genes. Comparison of the expression levels of AVPR1B (A), NFE2L2 (B), MGMT (C), ADORA1 (D), CA3 (E), AURKB (F), NTRK3 (G), CA12 (H), BCHE (I), CASP6 (J), and SMO (K) genes between normal tissues and glioma tumors in the TCGA dataset. ** for p < 0.01, and *** for p < 0.001, with all comparisons conducted using the Wilcoxon rank sum test Molecular docking analysis The chemical structure of ENU (Fig. [123]9A). The Vina score, which estimates the binding affinity, is presented for each gene (Fig. [124]9B). The genes NTRK3, AVPR1B, and ADORA1 showed relatively lower binding affinities with Vina scores of -4.0, -4.1, and − 4.1, respectively. CASP6 and NFE2L2 exhibited moderate binding affinities with scores of -4.6 and − 4.7, respectively. CA12 had affinities around − 5.0, while the highest binding affinity was observed with MGMT, scoring − 5.1. Detailed docking interaction of ENU with MGMT is illustrated (Fig. [125]9C). The binding pocket shows critical interactions, including hydrogen bonding and hydrophobic interactions, contributing to the strong binding affinity. Key residues such as L142, K107, W65, and V106 are involved in the binding process. The binding of ENU to MGMT potentially disrupts its catalytic activity by interfering with the active site’s structural conformation. MGMT is known for its role in DNA repair. Therefore, ENU binding could impair this function, leading to increased DNA alkylation and genomic instability, which are key factors in glioma development [[126]16]. The interaction of ENU with CA12 is depicted (Fig. [127]9D). ENU forms several key interactions within the binding pocket, with residues H96, H94, W209, H119, E106, and T199 playing significant roles. The docking poses highlight extensive hydrogen bonding and hydrophobic interactions, indicating a robust binding affinity of ENU to CA12. CA12 is a carbonic anhydrase involved in pH regulation and tumor metabolism. ENU binding to CA12 might alter its enzymatic activity, thereby disrupting the acidic microenvironment of glioma cells, which is crucial for their proliferation and survival [[128]17, [129]18]. This interaction could hinder tumor growth by affecting cellular metabolism and pH homeostasis. The results of the molecular docking analysis suggest that ENU exhibits varying degrees of binding affinities with different risk model genes, potentially implicating these genes in ENU-induced gliomagenesis. The high binding affinities of ENU with MGMT and CA12 underscore their potential roles in mediating ENU’s toxic effects in glioma development. Fig. 9. [130]Fig. 9 [131]Open in a new tab Molecular docking interactions of ENU with risk model genes. (A) The chemical structure of ENU. (B) The bar graph shows the Vina docking scores for ENU with various proteins, with lower scores indicating stronger binding affinities. (C) Detailed molecular docking interaction between ENU and MGMT. (D) Molecular docking interaction between ENU and CA12. Hydrogen bonds and interaction sites are marked, with residues involved in binding interactions labeled (e.g., W65, H119) Molecular dynamics simulation of the MGMT-ENU complex In the molecular dynamics simulations of the MGMT-ENU complex, several parameters were analyzed to assess the stability and behavior of the complex over a 100 ns simulation period. The Root-Mean-Square Deviation (RMSD) analysis (Fig. [132]10A) of the protein, ligand, and complex reveals distinct stability profiles. The overall stability of the complex is maintained, as evidenced by an RMSD fluctuating around 0.2 nm throughout the simulation. This indicates that while the complex formation does induce some conformational changes, these remain within a stable range. The Radius of Gyration (Rg) of the complex (Fig. [133]10B) fluctuates minimally within the 1.78–1.80 nm range, indicating consistent compactness and no significant unfolding of the protein structure during the simulation period. This stable Rg suggests that the ENU binding does not substantially alter the global folding of MGMT. The Root-Mean-Square Fluctuation (RMSF) analysis (Fig. [134]10C) provides insights into residue-specific flexibility. Most residues exhibit fluctuations below 0.1 nm, indicating limited flexibility. However, certain residues display increased fluctuations up to 0.3 nm, possibly corresponding to the active site or interaction regions that accommodate binding-induced conformational changes. The distance analysis between docking site-ligand and protein-ligand (Fig. [135]10D) indicates stable interactions, with distances generally averaging around 0.5–0.6 nm. Similarly, MD analysis showed that the complex of small-molecule ENU with CA12 protein remained stable during 0–65 ns (Figure S5). The complex of small molecule ENU with AURKB protein was gradually stabilised after 80 ns (Figure S6). However, further experimental validation should be pursued to corroborate these in silico findings. Fig. 10. [136]Fig. 10 [137]Open in a new tab Molecular dynamics simulation analysis of the MGMT-ENU complex. (A) RMSD plot over 100 ns, showing the stability of the protein, ligand, and complex. (B) Rg of the complex, indicating the compactness of the protein throughout the simulation period. (C) RMSF of protein residues. (D) Distance analysis between the docking site-ligand and protein-ligand Discussion ENU is a well-established mutagen known for its potent carcinogenic effects, particularly in the induction of gliomas, a type of malignant brain tumor characterized by aggressive growth and poor prognosis [[138]19]. ENU induces mutations through the alkylation of DNA, leading to genetic alterations that drive gliomagenesis [[139]9]. Understanding the molecular mechanisms by which ENU induces gliomas is critical for improving the diagnosis and treatment of glioma patients. In this study, we performed an integrative analysis of gene expression data from ENU-induced glioma models and identified key molecular targets and pathways that may contribute to glioma pathogenesis. Our findings highlight the potential of these targets as prognostic markers and therapeutic candidates for glioma. One of the main findings of our study is the identification of 71 common genes potentially implicated in ENU-induced gliomagenesis. These genes were identified through an overlap analysis of DEGs from the [140]GSE16011 and [141]GSE4290 datasets with ENU targets available in the CTD. Among the 71 genes, key hub genes such as TP53, MCL1, CCND1, and PTEN were identified through PPI network analysis. These genes are well-known tumor suppressors and oncogenes involved in cell cycle regulation, apoptosis, and DNA repair mechanisms, and their dysregulation has been previously implicated in glioma development [[142]20–[143]23]. However, the specific roles of these hub genes in the context of ENU-induced gliomagenesis warrant further investigation. For instance, elucidating the functional implications of TP53 and MGMT in about lactate metabolism, immune regulation, and tumor microenvironment remodeling could enhance our understanding of their contributions to glioma progression. The presence of these genes in the context of ENU exposure further supports their role in ENU-induced gliomagenesis, making them potential candidates for targeted therapeutic interventions. In comparison to previous studies, our analysis provides novel insights into the molecular pathways involved in ENU-induced glioma. For example, GO and KEGG enrichment analyses revealed significant biological processes and pathways, such as “Neuroactive ligand-receptor interaction,” “Glioma,” and “MicroRNAs in cancer.” These findings are consistent with prior reports linking these pathways to glioma development [[144]24–[145]26]. However, our study extends this knowledge by identifying specific ENU-induced targets within these pathways, highlighting the role of ENU in altering cellular communication and tumor-promoting microenvironments. Further pathway enrichment analyses could provide deeper insights into how these pathways interact with the identified hub genes, particularly in the context of immune modulation and metabolic alterations in glioma. The identification of these pathways suggests that ENU not only induces mutations but also triggers downstream signaling events that promote glioma progression, which may be explored for future therapeutic strategies. A significant contribution of our work is the development of an ENU-related risk model based on 11 prognostic genes. Using LASSO regression analysis, we constructed a model that effectively stratified glioma patients into low and high-risk groups, with distinct survival outcomes. This model demonstrated high sensitivity and specificity in predicting patient survival, as reflected in the AUC values for 1-year, 3-year, and 6-year survival predictions. Notably, the prognostic genes included in our model, such as MGMT, NFE2L2, CASP6, and AURKB, have been previously linked to glioma outcomes, particularly in terms of chemoresistance [[146]27–[147]31]. Importantly, our risk model’s translational relevance extends beyond prognosis. The distinct immune microenvironment observed in high-risk patients, marked by elevated infiltration of cytotoxic T cells, macrophages, and dendritic cells, suggests potential therapeutic vulnerabilities. For instance, the heightened immune activity in high-risk patients may indicate a more immunosuppressive microenvironment, as excessive infiltration of macrophages and exhausted T cells could promote immune evasion [[148]32, [149]33]. This raises the possibility that high-risk patients, as stratified by our model, might benefit from immunotherapies targeting checkpoint inhibitors (e.g., PD-1/PD-L1 or CTLA-4 blockers) to counteract immune suppression. Conversely, the lower infiltration of NK CD56 bright cells and plasmacytoid dendritic cells in high-risk patients highlights potential deficits in innate anti-tumor immunity, which could be addressed through adoptive cell therapies or cytokine-based interventions to enhance NK cell activity [[150]34]. To enhance clinical applicability, we propose that this risk model could be integrated into routine clinical workflows by enabling clinicians to identify patients who may benefit from more aggressive treatment strategies or closer monitoring based on their risk scores. For instance, high-risk patients with elevated stromal and immune scores might be prioritized for combination therapies that target both the tumor (e.g., temozolomide) and the immunosuppressive microenvironment (e.g., CSF-1R inhibitors to deplete tumor-associated macrophages). Additionally, the strong positive correlation between risk scores and immune cell infiltration suggests that our model could serve as a biomarker for predicting response to immunotherapy, guiding personalized treatment regimens. Furthermore, the nomogram constructed from ENU-related risk scores provides an intuitive and clinically useful tool for predicting individualized patient outcomes. The good calibration and significant net benefit demonstrated by the nomogram highlight its potential clinical utility in guiding therapeutic decisions and monitoring patient prognosis. By visualizing the cumulative effect of multiple prognostic factors, the nomogram can assist clinicians in making informed decisions regarding treatment plans tailored to individual patient profiles. An example of its application could involve using the nomogram to guide discussions with patients about the expected outcomes of different treatment modalities based on their specific risk profiles. In addition, our immuno-infiltration analysis revealed higher levels of stromal and immune cell infiltration in the high-risk group, indicating a distinct immune microenvironment in these patients. This finding suggests that ENU exposure may not only induce genetic alterations but also significantly influence the immune landscape of gliomas. Specifically, the increased immune cell infiltration (T cells and macrophages) observed in the high-risk group may contribute to an inflammatory tumor microenvironment that promotes tumor aggressiveness and could affect treatment responses [[151]32, [152]33, [153]35]. The molecular and immune heterogeneity captured by our model underscores the need for tailored therapeutic strategies. For example, high-risk patients with overexpression of AURKB (a mitotic kinase linked to therapy resistance) might benefit from targeted inhibitors like barasertib, while those with elevated cytotoxic cell infiltration could be candidates for vaccines or oncolytic viruses to amplify anti-tumor immunity. Further studies are warranted to explore the functional implications of this immune infiltration, particularly how it interacts with the tumor cells and influences glioblastoma progression and therapeutic outcomes. This observation is supported by previous studies showing that immune cell infiltration is associated with tumor aggressiveness and poor prognosis in gliomas [[154]36–[155]38]. However, it is important to note that our study does not address the temporal dynamics of ENU-induced gliomagenesis, such as the progression from early-stage mutations to tumor development. The TCGA glioma dataset utilized in our analysis lacks information on the timeline of these processes, which limits our ability to provide a longitudinal perspective on ENU’s role in glioma biology. Future research incorporating longitudinal studies would be beneficial to track the progression of ENU-induced mutations and their impact on tumor development over time. One of the most novel aspects of our study is the molecular docking analysis, which demonstrated strong binding affinities of ENU with key risk genes such as MGMT and CA12. MGMT, a DNA repair enzyme, has been extensively studied in the context of glioma, where its methylation status is a known predictor of response to alkylating agents [[156]39, [157]40]. Our docking results suggest that ENU may interact directly with MGMT, potentially interfering with its DNA repair function and contributing to glioma pathogenesis. Similarly, CA12, a carbonic anhydrase implicated in tumor growth, showed strong binding affinity with ENU, highlighting its potential role in ENU-induced gliomagenesis [[158]41]. These findings provide a mechanistic basis for ENU’s toxicity and may inform future studies aimed at targeting these interactions for therapeutic purposes. In conclusion, our integrative analysis reveals key molecular mechanisms and pathways involved in ENU-induced glioma. The identification of 71 ENU-related genes, the development of a robust prognostic model, and the exploration of ENU’s binding affinities with critical genes provide valuable insights into glioma pathogenesis and potential therapeutic targets. Future studies should focus on validating these findings in experimental models and clinical settings to explore their translational potential in glioma diagnosis and treatment. Electronic supplementary material Below is the link to the electronic supplementary material. [159]Supplementary Material 1^ (751.7KB, docx) Acknowledgements