Abstract Background Palmitoylation, a post-translational lipid modification, has garnered increasing attention for its role in inflammatory processes and tumorigenesis. Emerging evidence suggests a potential association between palmitoylation and inflammatory responses in the pathogenesis of endometriosis. However, the precise mechanistic interplay remains elusive, necessitating further investigation. Methods This study integrated transcriptomic analysis and Mendelian randomization (MR) to identify a causal gene set implicated in endometriosis. Differentially expressed genes (DEGs) were first identified in the training dataset using the limma package in R. Weighted gene co-expression network analysis (WGCNA) was subsequently performed, leveraging Single Sample Gene Set Enrichment Analysis (ssGSEA)-derived scores of palmitoylation-related genes (PRGs) as phenotypic traits to identify key modular genes. The intersection of these key modular genes with DEGs yielded a refined gene set. Machine learning algorithms were then applied to further optimize gene selection, followed by external validation, immune infiltration analysis, RNA network construction, and exploration of potential targeted drug candidates. Results Through a rigorous screening process, VRK1, GALNT12, and RMI1 emerged as key genes associated with palmitoylation, exhibiting significant downregulation in endometriosis samples (P < 0.05), indicative of a potential protective role. Immune infiltration analysis further revealed strong correlations between these genes and M2 macrophages as well as resting Natural Killer (NK) cells. Additionally, investigations into the targeted RNA network and drug association profiling provided novel insights, laying the groundwork for future high-quality validation studies. Conclusions This study employed a comprehensive analytical framework to identify palmitoylation-associated key genes in endometriosis. The integration of immunoinfiltration analysis, RNA network construction, and drug association profiling offers valuable insights for advancing clinical diagnostics, disease monitoring, and therapeutic development in endometriosis. Supplementary Information The online version contains supplementary material available at 10.1186/s12905-025-03697-0. Keywords: Palmitoylation, Endometriosis, Transcriptomics, Mendelian Randomization Introduction Endometriosis is an estrogen-dependent chronic inflammatory condition characterized by the ectopic proliferation of endometrial-like tissue [[36]1, [37]2]. Despite its histologically benign classification, the condition exhibits invasive, metastatic, and recurrent behaviors reminiscent of malignancies, affecting approximately 10% of women of reproductive age [[38]3, [39]4]. Common clinical manifestations, including chronic pelvic pain, dysmenorrhea, dysuria, and infertility, substantially diminish patients’ quality of life [[40]4]. As the pathophysiology of the disease becomes increasingly understood, attention has shifted toward the roles of hormonal dysregulation, inflammatory mediators, and genetic susceptibility [[41]5, [42]6]. However, definitive conclusions regarding their precise contributions remain elusive. Current therapeutic strategies primarily rely on hormonal suppression therapy and minimally invasive surgical interventions [[43]7]. Nevertheless, these approaches are often associated with high recurrence rates and treatment-related complications [[44]8, [45]9]. Given these limitations, elucidating the underlying molecular mechanisms and identifying reliable diagnostic and prognostic biomarkers represent crucial avenues for advancing precision medicine, optimizing clinical management, and mitigating disease recurrence. Protein palmitoylation, a reversible post-translational lipid modification, is dynamically regulated by a cohort of palmitoyl S-acyltransferases characterized by the Asp-His-His-Cys (DHHC) motif [[46]10, [47]11]. This process is counterbalanced by acylprotein thioesterases, which modulate protein localization and function in a highly dynamic manner [[48]10, [49]11]. Recent studies have highlighted the intricate role of palmitoylation in inflammatory regulation, with ZDHHC12 promoting the degradation of NOD-like receptor family pyrin domain-containing 3 (NLRP3) through chaperone-mediated autophagy [[50]12]. In autoinflammatory disorders, the NOD2 variant NOD2 s-R444 C demonstrates an increased affinity for ZDHHC5, leading to excessive palmitoylation and exacerbated inflammatory responses [[51]13]. Inflammation is recognized as a central etiological factor in endometriosis. Recent findings suggest that NLRP3-mediated pyroptosis contributes to the pathogenesis of inflammatory endometriosis by driving ectopic endometrial cell proliferation and angiogenesis [[52]14]. Targeted anti-inflammatory interventions, such as long-acting anti-IL8 antibody administration, have shown promise in mitigating disease progression [[53]15]. Despite extensive research on palmitoylation across various biological processes [[54]16–[55]18], its specific role in endometriosis remains largely unexplored. Notably, ZDHHC12 has been implicated in modulating NLRP3 palmitoylation, thereby influencing its activation status and regulating myocardial inflammation, oxidative stress, and associated cellular damage [[56]19]. Furthermore, loss of palmitoyl protein thioesterase 1 (Ppt1) impairs depalmitoylation, leading to aberrant synaptic protein trafficking and neuroinflammation through mechanisms involving A-kinase anchor protein 5 (Akap5) and nuclear factor of activated T cells (NFAT) [[57]20]. Building on these insights, this study aims to elucidate the regulatory role of palmitoylation in the pathogenesis of endometriosis, assessing its potential as a critical modulatory mechanism. By uncovering previously unrecognized pathogenic pathways and identifying novel therapeutic targets, these findings are anticipated to advance the development of more effective treatment strategies. By integrating transcriptomic and genomic data from the Gene Expression Omnibus (GEO) and Genome-Wide Association Studies (GWAS) databases, key palmitoylation-associated genes in endometriosis were identified through differential expression analysis, Mendelian randomization (MR), machine learning, and expression validation. Further exploration of the interplay between palmitoylation and endometriosis was conducted via immune infiltration analysis, chromosomal localization, and regulatory network reconstruction, providing a theoretical foundation for the precise diagnosis, surveillance, and therapeutic intervention of endometriosis. Materials and methods Data collection and extraction Endometriosis-related datasets ([58]GSE51981 and [59]GSE25628) were obtained from the GEO database for transcriptomic analysis. The [60]GSE51981 dataset, designated as the training set, comprised 77 pelvic endometriosis (PE) samples and 34 control samples, with genomic profiling conducted on the [61]GPL570 platform. To enhance data specificity, 37 samples with uterine or pelvic pathology were excluded from the analysis. The [62]GSE25628 dataset, serving as the independent validation set, included 16 endometriosis and 6 control endometrial tissue samples, sequenced on the [63]GPL571 platform. Additionally, Mendelian randomization (MR) data on endometriosis were retrieved from the publicly available Integrative Epidemiology Unit (IEU) Open GWAS database. The selected dataset (ukb-b- 9668) comprised genomic data from 463,010 European individuals, including 1,121 cases and 461,889 controls, encompassing a total of 9,851,867 single nucleotide polymorphisms (SNPs). A curated list of 23 palmitoylation-related genes (PRGs) was extracted from relevant literature [[64]21]. PRGs score and weighted gene co-expression network analysis In the [65]GSE51981 dataset, PRG scores were computed using single-sample gene set enrichment analysis (ssGSEA) from the GSVA package (v1.46.0; data of use: 2024.11.20) [[66]22], based on the differential expression of PRGs in PE and control samples. Statistical comparisons of PRG scores between PE and control groups were performed using the Wilcoxon test, with significance set at P < 0.05. Weighted gene co-expression network analysis (WGCNA) was subsequently applied to identify key module genes in [67]GSE51981, utilizing ssGSEA-derived PRG scores as trait variables via the WGCNA package (v1.7.1; data of use: 2024.11.20) [[68]23]. Initial sample clustering was conducted to detect and eliminate outliers. The optimal soft threshold power was determined by achieving an R^2 exceeding 0.8 while maintaining near-zero mean connectivity. A co-expression matrix was then constructed using the selected soft threshold, with a minimum module size of 30 genes, a dynamic tree cut parameter of 2, and a module merging threshold of 0.25. Distinct gene modules were assigned unique color labels. Correlation coefficients between endometriosis samples, control samples, and PRG scores were computed for each module, and the associations were visualized in a heatmap. Modules demonstrating a significant correlation with PRG scores (|r|> 0.5, P < 0.05) were designated as key modules, with their constituent genes identified as key module genes. Differential expression analysis Differentially expressed genes (DEGs) between PE and control samples in [69]GSE51981 were identified using the limma package (v3.54.0; data of use: 2024.11.20) [[70]24], applying selection criteria of |log[2] fold change (FC)|> 1 and P < 0.05. The distribution of DEGs was illustrated via a volcano plot and heatmap, generated using the ggplot2 package (v3.4.3; data of use: 2024.11.20) [[71]25] and ComplexHeatmap package (v2.14.0; data of use: 2024.11.20) [[72]26], respectively. Function analysis The intersecting genes were identified by overlapping DEGs with key module genes. To elucidate their functional significance, Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses were performed using the clusterProfiler package (v4.7.1.003; data of use: 2024.11.21) [[73]27], with a significance threshold of P < 0.05. GO enrichment analysis categorized functional annotations into biological processes (BP), cellular components (CC), and molecular functions (MF). MR study Based on these intersecting genes, an MR analysis was conducted using the TwoSampleMR package (v0.5.6; data of use: 2024.11.21) [[74]28], treating these genes as exposure factors and endometriosis as the outcome variable. Stringent adherence to classical MR assumptions was maintained throughout the analysis: (i) the independence assumption ensured that instrumental variables (IVs) were not confounded by external factors, (ii) the association assumption confirmed a direct influence of IVs on the exposure, and (iii) the exclusivity assumption verified that IVs affected the outcome solely through the exposure, without alternative causal pathways. GWAS data for the intersecting genes (expression Quantitative Trait Loci, eQTL) and endometriosis (ukb-b- 9668) were retrieved from the IEU Open GWAS database. Initial IV screening was performed using the VariantAnnotation (v1.44.0; data of use: 2024.11.21) [[75]29] and ieugwasr (v1.0.1; data of use: 2024.11.21) [[76]30] packages, with a significance threshold of P < 5 × 10^–6. Linkage disequilibrium (LD) filtering was applied (clump = TRUE, R^2 = 0.001, kb = 10), and genes with at least three SNPs (nSNP ≥ 3) were retained, ensuring harmonization of effect alleles and effect sizes. Weak IVs were identified based on the F-statistic, with IVs excluded when F < 10. MR analysis was conducted using five complementary methods: MR Egger [[77]31], Weighted Median [[78]32], Inverse Variance Weighted (IVW) [[79]33], Simple Mode [[80]34], and Weighted Mode [[81]35], with IVW serving as the primary statistical approach (P < 0.05). Results were visualized through scatter plots, forest plots, and funnel plots. To assess the robustness of the MR findings, sensitivity analyses were performed, including heterogeneity testing (Cochran’s Q test, P > 0.05), horizontal pleiotropy evaluation (P > 0.05), and leave-one-out (LOO) analysis using the mr heterogeneity [[82]36], mr pleiotropy test [[83]37], and mr leaveoneout [[84]38] functions, respectively. The causal direction was further validated using the Steiger test, with a correct causal direction indicated by Steiger P < 0.05. Following these analyses, genes demonstrating significant causal relationships with endometriosis were identified as candidate genes for further investigation. Machine learning and gene expression analysis To further refine the selection of feature genes, four machine learning algorithms—Random Forest (RF), Support Vector Machine (SVM), Generalized Linear Model (GLM), and k-Nearest Neighbor (KNN)—were employed to construct predictive models based on the expression profiles of candidate genes in the [85]GSE51981 dataset. The core reason for selecting these machine learning methods was that their advantages complemented each other, providing more comprehensive and accurate feature gene selection results. Each method had its strengths in data processing, feature importance evaluation, high-dimensional data handling, and model interpretability. Therefore, using multiple methods for comparison improved the reliability and accuracy of the results. Model training and validation were conducted using the caret package (v6.0–93; data of use: 2024.11.21) [[86]39]. To evaluate model performance, receiver operating characteristic (ROC) curves and residual box plots were generated using the DALEX package (v1.1.0; data of use: 2024.11.21) [[87]40]. Additionally, gene importance scores derived from each machine learning model were visualized using bar plots. The top 10 most important genes from each model were intersected to identify a refined set of feature genes. Expression validation of the identified feature genes was subsequently performed in both the [88]GSE51981 and [89]GSE25628 datasets. Wilcoxon tests were applied to compare gene expression between endometriosis and control samples, with significance set at P < 0.05. Genes exhibiting significant differential expression in both datasets, with a consistent expression trend, were designated as key genes. Immune infiltration analysis To assess immune cell infiltration in endometriosis and control samples from [90]GSE51981, the CIBERSORT algorithm (v1.0.3; data of use: 2024.11.22) [[91]41] was applied to estimate immune scores for 22 immune cell types. Samples with P > 0.05 were excluded to ensure reliable deconvolution results. Wilcoxon tests were then used to compare immune cell composition between endometriosis and control samples, and immune cell types exhibiting significant differential infiltration (P < 0.05) were selected for further analysis. Spearman correlation analysis was subsequently conducted to explore relationships among the 22 immune cell types and to assess associations between key genes and differentially infiltrated immune cells, with correlation thresholds set at |r|> 0.3 and P < 0.05. Chromosomal localization and functional similarity analyses To determine the genomic distribution of key genes across the 23 pairs of human chromosomes, the University of California Santa Cruz (UCSC) Genome Browser ([92]http://genome.ucsc.edu/) was utilized to retrieve their chromosomal start and stop positions. The RCircos package (v1.2.2; data of use: 2024.11.22) [[93]42] was then employed to generate a genome-wide visualization of key gene loci. Additionally, functional relationships among the key genes were further explored using the GoSemSim package (v6.5–0; data of use: 2024.11.22) [[94]43]. Regulation network analysis The miRwalk ([95]http://mirwalk.umm.uni-heidelberg.de/) and miRDB ([96]https://mirdb.org/) databases were utilized to predict MicroRNAs (miRNAs) targeting the identified key genes. The intersection of miRNAs derived from both databases was considered the final set of key miRNAs. An mRNA-miRNA regulatory network was subsequently constructed and visualized using Cytoscape (v3.10.2; data of use: 2024.11.22) [[97]44]. Similarly, transcription factors (TFs) associated with key genes were predicted using the hTFtarget ([98]https://guolab.wchscu.cn/hTFtarget/#!/) and miRNet ([99]https://www.mirnet.ca/) databases. Key TFs were identified by overlapping the predictions from both sources, and an mRNA-TF regulatory network was constructed and visualized in Cytoscape. To further explore potential therapeutic targets, drug-gene interactions were analyzed using the Comparative Toxicogenomics Database (CTD) ([100]http://ctdbase.org/) and the Enrichr database ([101]https://maayanlab.cloud/Enrichr/). Drugs targeting endometriosis-associated key genes were extracted from both databases, with duplicate entries removed. An mRNA-drug interaction network was then established and visualized in Cytoscape. Immunohistochemistry For experimental validation, three paraffin-embedded sections of ectopic endometrial and normal endometrial tissue were collected from the Pathology Department of Shanghai General Hospital. Immunohistochemistry (IHC) was performed using primary antibodies against GALNT12 (Solarbio, K108365P, 1:100), VRK1 (Proteintech, 28,018–1-AP, 1:100), and RMI1 (Proteintech, 14630–1-AP, 1:100), diluted in Phosphate-Buffered Saline (PBS). Five-micrometer-thick paraffin sections were deparaffinized and rehydrated, followed by incubation with 0.3% H[2]O[2] in methanol to inhibit endogenous peroxidase activity. After antigen retrieval and cooling, sections were blocked with 1% Bovine Serum Albumin (BSA) and incubated with primary antibodies overnight at 4 °C. The following day, sections were treated with HRP-conjugated secondary antibodies (Shanghai Long Island Biotech, Shanghai, China) for 1 h at room temperature, followed by diaminobenzidine (DAB) staining and hematoxylin counterstaining. Slides were examined and imaged under a Leica SP5 light microscope (Leica, China) at 100 × and 200 × magnification. Statistical analysis Statistical analyses were conducted using R (v4.2.2), with inter-group differences assessed via the Wilcoxon test (P < 0.05). Regulatory networks were generated and visualized using Cytoscape (v3.10.2). Results Screening of palmitoacylation related gene modules in endometriosis PRG score analysis in the [102]GSE51981 dataset revealed significantly elevated scores in PE samples compared to controls (Fig. [103]1A). To further explore gene co-expression patterns, WGCNA was performed on the [104]GSE51981 dataset. Clustering analysis confirmed the absence of outlier samples (Fig. [105]1B). An optimal soft-thresholding power of 19 was determined based on scale-free topology criteria (R^2 = 0.8) while maintaining mean connectivity near zero (Fig. [106]1C). Subsequently, a co-expression matrix was constructed, identifying 18 distinct modules, each represented by a unique color, with the Grey module excluded as it contained unassigned genes (Fig. [107]1D). Pearson correlation analysis revealed significant associations between PRG scores and two key modules: MEgreenyellow (r = 0.68, P < 0.001) and MEbrown (r = − 0.55, P < 0.001) (Fig. [108]1E). These modules were designated as key modules, collectively encompassing 307 genes, referred to as key module genes. Fig. 1. [109]Fig. 1 [110]Open in a new tab Results of Screening palmitoacylation related gene modules. A The PRGs score of PE and control samples. B The result of cluster analysis. C The scale-free fit index for soft threshold power and mean connectivity. D Gene and trait clustering dendrograms. Each branch represents an expression module of a highly interconnected groups of genes; each color indicates a corresponding co-expression module. E Heatmap of 18 gene co-expression modules. The numbers in each cell means the correlation coefficient and p value Identification and functional exploration of the intersection genes Differential expression analysis in the [111]GSE51981 dataset identified 3,376 DEGs between endometriosis and control samples, with 1,267 genes exhibiting upregulation and 2,109 showing downregulation (Fig. [112]2A and B). By intersecting the 3,376 DEGs with the 307 key module genes, 204 intersection genes were identified (Fig. [113]2C). Functional enrichment analysis of these 204 genes revealed significant enrichment in 368 GO terms and 31 KEGG pathways. GO enrichment analysis, categorized into BP, CC, and MF, identified key terms such as"nuclear division,""chromosomal region,"and"tubulin binding"(Fig. [114]2D). KEGG pathway enrichment analysis highlighted pathways including"cell cycle,""mineral absorption,"and"progesterone-mediated oocyte maturation"(Fig. [115]2E). Fig. 2. [116]Fig. 2 [117]Open in a new tab Identification and functional exploration of the intersection genes. A Volcano plot. We set the criteria of |log2fold-change (FC)|> 1 and P < 0.05 as the difference genes. Red dots are upregulated genes, and blue dots are downregulated genes. B Heatmap plot. The heatmap reflects the distribution of gene expression density and gene expression differences in each sample. C Venn diagram. The key module genes obtained from WGCNA were intersected with DEGS genes. D GO enrichment analysis results. E KEGG enrichment analysis results Candidate genes with a significant causal relationship with endometriosis The causal association between the 204 intersecting genes and endometriosis was further examined. Following the IV screening, 126 genes remained as exposure factors for further investigation. The MR analysis identified 17 genes with a statistically significant causal relationship with endometriosis (P < 0.05) (Table [118]1). Among them, seven genes (e.g., CFD, ECT2, HMMR) were classified as risk factors (Odds Ratio [OR] > 1), whereas ten genes (e.g., GALNT12, RMI1, VRK1) exhibited a protective effect (OR < 1). To visualize these associations, scatter plots, forest plots, and funnel plots were generated. Specifically, scatter plots for GALNT12, RMI1, and VRK1 (Fig. [119]3A) displayed a negative slope in their fitted regression lines, consistent with a protective association. Forest plots (Fig. [120]3B) further illustrated the MR effect sizes, all of which were negative under the IVW method, reinforcing their protective role. Funnel plots (Fig. [121]3C) demonstrated a symmetrical distribution of IVs around the IVW line, indicating adherence to Mendel's second law. Scatter plots, forest plots, and funnel plots for the remaining genes are provided in Figures S1–S3. Additionally, heterogeneity and horizontal pleiotropy tests across all 17 genes yielded P values exceeding 0.05 (Tables [122]2 and [123]3), suggesting the absence of significant heterogeneity or confounding influences in the MR study. LOO analysis (Fig. [124]3D and Fig. S4) further corroborated the robustness of the MR results, as no substantial deviations were observed upon sequential exclusion of individual IVs. Finally, Steiger directionality tests (Table [125]4) confirmed the correct causal direction for all 17 genes, with P values below 0.05, reinforcing the validity of the findings. Collectively, these 17 genes emerge as potential causal candidates implicated in endometriosis. Table 1. Mendelian randomization analysis unveils 17 causal genes in endometriosis NO exposure outcome method nsnp pval or 1 CENPE PE Inverse variance weighted 25 0.000458026 0.99892761 2 CFD PE Inverse variance weighted 7 0.013505093 1.000750339 3 ECT2 PE Inverse variance weighted 11 0.049775503 1.000690854 4 FBXO5 PE Inverse variance weighted 8 0.01256748 0.999008793 5 GALNT12 PE Inverse variance weighted 9 0.011019426 0.998513982 6 HMMR PE Inverse variance weighted 5 0.008649861 1.002027803 7 IER3 PE Inverse variance weighted 17 0.012138092 0.999591358 8 MKI67 PE Inverse variance weighted 3 0.020648678 0.997693863 9 NDC80 PE Inverse variance weighted 15 0.014084781 0.999332329 10 PARPBP PE Inverse variance weighted 4 0.003391087 0.997696177 11 PRIM1 PE Inverse variance weighted 9 0.008881558 1.000353618 12 RLN2 PE Inverse variance weighted 4 0.045332653 0.998816069 13 RMI1 PE Inverse variance weighted 11 0.048692518 0.999587813 14 STIL PE Inverse variance weighted 16 0.009312601 1.000757596 15 STMN1 PE Inverse variance weighted 19 0.00385043 1.000811607 16 TYMS PE Inverse variance weighted 12 0.002906875 1.00065104 17 VRK1 PE Inverse variance weighted 10 0.002029756 0.999426071 [126]Open in a new tab Abbreviation: PE pelvic endometriosis Fig. 3. [127]Fig. 3 [128]Open in a new tab Identification of candidate genes through MR study. A Scatter plots for GALNT12, RMI1, and VRK1. B Forest plots for GALNT12, RMI1, and VRK1. C Funnel plots for GALNT12, RMI1, and VRK1. D LOO analysis for GALNT12, RMI1, and VRK1 Table 2. Results of Mendelian randomization heterogeneity test NO exposure outcome heterogeneity_pval 1 CENPE PE 0.994236723 2 CFD PE 0.904994304 3 ECT2 PE 0.974875149 4 FBXO5 PE 0.856902477 5 GALNT12 PE 0.999813501 6 HMMR PE 0.987296141 7 IER3 PE 0.414274668 8 MKI67 PE 0.935968918 9 NDC80 PE 0.999792808 10 PARPBP PE 0.988250764 11 PRIM1 PE 0.946071659 12 RLN2 PE 0.925913532 13 RMI1 PE 0.659333301 14 STIL PE 0.999978701 15 STMN1 PE 0.99963283 16 TYMS PE 0.999579046 17 VRK1 PE 0.99775246 [129]Open in a new tab Abbreviation: PE pelvic endometriosis Table 3. Results of Mendelian randomization level pleiotropy test NO exposure outcome pleiotropy_pval 1 CENPE PE 0.159937324 2 CFD PE 0.353849828 3 ECT2 PE 0.170419632 4 FBXO5 PE 0.612553883 5 GALNT12 PE 0.831829309 6 HMMR PE 0.873546239 7 IER3 PE 0.053648145 8 MKI67 PE 0.791493743 9 NDC80 PE 0.975018594 10 PARPBP PE 0.773482392 11 PRIM1 PE 0.362264287 12 RLN2 PE 0.808950553 13 RMI1 PE 0.070378789 14 STIL PE 0.220005997 15 STMN1 PE 0.781020923 16 TYMS PE 0.894010125 17 VRK1 PE 0.947229403 [130]Open in a new tab Abbreviation: PE pelvic endometriosis Table 4. Mendelian randomization Steiger directivity analysis NO exposure outcome correct_causal_direction steiger_pval 1 CENPE PE TRUE 1.8803E- 163 2 CFD PE TRUE  < 0.001 3 ECT2 PE TRUE 6.3846E- 150 4 FBXO5 PE TRUE 8.09018E- 92 5 GALNT12 PE TRUE 2.53764E- 52 6 HMMR PE TRUE 2.45247E- 33 7 IER3 PE TRUE  < 0.001 8 MKI67 PE TRUE 7.05518E- 18 9 NDC80 PE TRUE 8.5987E- 214 10 PARPBP PE TRUE 4.02091E- 22 11 PRIM1 PE TRUE  < 0.001 12 RLN2 PE TRUE 3.27508E- 44 13 RMI1 PE TRUE  < 0.001 14 STIL PE TRUE 2.468E- 171 15 STMN1 PE TRUE 1.9595E- 242 16 TYMS PE TRUE  < 0.001 17 VRK1 PE TRUE  < 0.001 [131]Open in a new tab Abbreviation: PE pelvic endometriosis VRK1, GALNT12, and RMI1 were deemed as key genes for endometriosis Building on the 17 candidate genes identified through the MR study, machine learning algorithms were employed to further refine the selection of feature genes. Four distinct models were constructed, with their predictive performance assessed via ROC curves. All models achieved an area under the curve (AUC) exceeding 0.7, indicative of high classification accuracy (Fig. [132]4A). Additionally, residual box plots compared true observed values with model-predicted outcomes, further validating model reliability (Fig. [133]4B). To prioritize genes with the greatest potential relevance to endometriosis treatment, gene importance scores were derived from each model (Fig. [134]4C). By selecting the top 10 genes from each model and determining their intersection, six feature genes were identified: TYMS, VRK1, MK167, GALNT12, CFD, and RMI1 (Fig. [135]4D). Fig. 4. [136]Fig. 4 [137]Open in a new tab Obtained key genes via machine learning and external validation. A ROC curves constructed by four machine learning models. B Residual box diagram of four machine learning models. C Feature importance of four machine learning models. D Venn diagram of the top 10 feature importance genes across four machine learning models. E Expression of feature genes in [138]GSE51981. F Expression of feature genes in [139]GSE25628 Subsequent gene expression analysis in the [140]GSE51981 and [141]GSE25628 datasets revealed significantly lower expression levels of VRK1, GALNT12, and RMI1 in both datasets (P < 0.05) (Fig. [142]4E and F). Consequently, these three genes were designated as key genes implicated in endometriosis. Immune cell infiltration analysis Immune infiltration analysis (Fig. [143]5A) characterized the distribution of 22 immune cell types in endometriosis and control samples from [144]GSE51981. The Wilcoxon test identified 11 differentially abundant immune cells. Notably, M2 macrophages and resting mast cells exhibited significantly higher proportions in control samples, whereas monocytes and resting natural killer (NK) cells were significantly enriched in endometriosis samples (Fig. [145]5B). Correlation analysis among immune cell populations demonstrated a strong positive association between resting mast cells and M0 macrophages (r = 0.51, P < 0.05), while regulatory T cells (Tregs) displayed the strongest negative correlation with activated memory CD4 T cells (r = − 0.53, P < 0.05) (Fig. [146]5C). Further correlation analysis between key genes and differentially abundant immune cells revealed a consistent positive association between all key genes and M2 macrophages, alongside a strong negative correlation with resting NK cells (|r|> 0.3, P < 0.001) (Fig. [147]5D). Fig. 5. [148]Fig. 5 [149]Open in a new tab Immune cell infiltration analysis. A Proportions of 22 immune cell types in PE and controls. B Expression differences of 22 immune cell types in PE and controls. C Relationships among immune cells. D Associations between immune cells and key genes Chromosome localization and functional similarity analysis of key genes Chromosomal localization analysis provided further insights into the genomic context of the key genes. Specifically, GALNT12 and RMI1 were mapped to chromosome 9, whereas VRK1 was located on chromosome 14 (Fig. [150]6A). Functional similarity analysis revealed that VRK1 exhibited the highest similarity with the other key genes, suggesting its potential central role in the pathogenesis of endometriosis (Fig. [151]6B). Fig. 6. [152]Fig. 6 [153]Open in a new tab Chromosome localization and functional similarity analysis of key genes. A The chromosome localization of key genes. B The functional similarity analysis of key genes Analysis of regulatory networks associated with key genes Prediction of miRNA interactions with key genes identified nine key miRNAs through overlapping results from the miRWalk and miRDB databases, enabling the construction of an mRNA-miRNA regulatory network comprising 12 nodes and 9 edges. Notable interactions included VRK1- ‘hsa-mir- 4428’, GALNT12- ‘hsa-mir- 202 - 3p’, and RMI1- ‘hsa-mir- 3190 - 3p’ (Fig. [154]7A). Furthermore, 61 TFs targeting the three key genes were identified through overlapping predictions from the hTFtarget and miRNet databases. These interactions were visualized in an mRNA-TF network consisting of 64 nodes (3 key genes and 61 TFs) and 76 edges, with SPI1 identified as a common regulator of all three key genes (Fig. [155]7B). Additionally, drug-gene interaction analysis identified 195 drugs targeting the three key genes, leading to the construction of a key gene-drug network (Fig. [156]7C). Notably, enterolactone was found to co-target RMI1 and VRK1, while retinoic acid co-targeted GALNT12 and VRK1. These regulatory networks provide valuable insights into the molecular mechanisms underlying endometriosis and potential therapeutic targets. Fig. 7. [157]Fig. 7 [158]Open in a new tab The regulatory networks associated with key genes. A The mRNA-miRNA network of key genes. B The mRNA-TF network of key genes. C The key genes-drugs network Validation of key genes by immunohistochemistry To validate the expression patterns of the key genes, three cases of ectopic endometrial tissues and three cases of normal endometrial tissues were collected from the pathology department. Immunohistochemical staining was performed using antibodies against VRK1, RMI1, and GALNT12. The results demonstrated significantly higher positive staining rates for all three proteins in normal endometrial tissues compared to endometriotic tissues, further corroborating their potential involvement in endometriosis pathophysiology (Fig. [159]8). Fig. 8. [160]Fig. 8 [161]Open in a new tab Immunohistochemical Validation of GALNT12, RMI1 and VRK1 in Normal and Endometriosis Discussion Endometriosis is an inflammatory disease characterized by invasiveness and recurrence, and currently lacks reliable diagnostic and monitoring indicators [[162]2, [163]4]. Palmitoylation stands as a pivotal mechanism of protein post-translational modification, exerting a significant influence on inflammatory responses, lipid metabolism, and the genesis of tumors [[164]12, [165]45]. Research indicates that palmitoylation plays a significant role in the migration and adhesion of neutrophils by regulating the function of CRACR2 A protein, thereby affecting inflammatory responses and associated tissue damage [[166]46]. Additionally, palmitoylation plays an important role in inflammatory responses by modulating the functions of immune proteins and the metabolism of gut microbiota [[167]47]. Although the specific role of palmitoylation in endometriosis remains unclear, its close association with inflammation suggests that it may play a key role in the inflammatory process of this disease. This study employed bioinformatics approaches to identify DEGs associated with palmitoylation in endometriosis and further elucidated their functional relevance. Using the IEU OpenGWAS database, 17 genes were identified with statistically significant associations, establishing a causal relationship between these genes and endometriosis. Subsequently, machine learning algorithms, combined with external dataset validation, refined this selection to three key genes—VRK1, GALNT12, and RMI1—each exhibiting reduced expression in endometriotic tissues and demonstrating a negative correlation with disease occurrence. The VRK1 (vaccinia-related kinase 1) gene, which encodes a serine/threonine protein kinase, is localized on chromosome 14 and exhibits broad expression across human tissues, with predominant nuclear localization [[168]48]. The VRK1-encoded protein regulates cell cycle progression and genomic stability through phosphorylation and is implicated in apoptosis, thus contributing to cellular proliferation and tissue regeneration [[169]49]. Previous studies have demonstrated that VRK1 modulates p53 stability and activity via phosphorylation, thereby influencing lung cancer cell proliferation [[170]50]. Additionally, VRK1 promotes cell cycle progression by phosphorylating VREB, thereby enhancing cAMP-responsive element-binding protein activity at the CCND1 promoter, leading to CCND1 upregulation [[171]51]. Furthermore, VRK1 plays a pivotal role in DNA damage repair by stabilizing histone H2 AX-H3 interactions, neutralizing ionizing radiation-induced H2 AX phosphorylation, and participating in early DNA repair mechanisms [[172]52]. The GALNT12 (N-Acetylgalactosaminyltransferase 12) gene, located on chromosome 9, belongs to the polypeptide N-acetylgalactosaminyltransferase family and is primarily involved in protein post-translational modification. It catalyzes the transfer of N-acetylgalactosamine to serine or threonine residues of target proteins, thereby influencing protein conformation, functional properties, and genomic stability [[173]53]. Aberrant expression or dysregulation of GALNT12 has been implicated in various pathological conditions. For instance, mutations in GALNT12 leading to abnormal glycosylation play a critical role in the pathogenesis of colorectal cancer [[174]54]. Moreover, elevated GALNT12 expression is significantly associated with poor prognosis in patients with glioblastoma, where it enhances tumor cell proliferation and invasiveness via modulation of the PI3 K/AKT/mTOR signaling pathway [[175]55]. Additionally, GALNT12 has been closely linked to IgA1 galactose deficiency, with significantly lower mRNA expression levels observed in affected individuals compared to healthy controls [[176]56]. The RMI1 (RecQ Mediated Genome Instability 1) gene, also localized on chromosome 9, encodes a key protein involved in DNA repair and recombination. As an integral component of the BLM/RMI1/Top3α complex, RMI1 plays a pivotal role in maintaining genomic stability and facilitating DNA damage repair [[177]57]. Loss of RMI1 function leads to increased DNA damage accumulation, cell cycle arrest, and impaired homologous recombination repair, particularly following ionizing radiation exposure [[178]58]. Beyond its role in genomic maintenance, RMI1 is involved in metabolic regulation, with its expression in adipocytes being modulated by glucose through the E2 F pathway [[179]59]. RMI1-deficient mice exhibit resistance to diet- and genetically induced obesity, highlighting its involvement in metabolic homeostasis [[180]60]. Furthermore, mutations in RMI1 contribute to the pathogenesis of Bloom syndrome, a genetic disorder characterized by primary microcephaly, intrauterine growth restriction, and short stature [[181]61]. Although direct evidence linking GALNT12, RMI1, and VRK1 to endometriosis remains limited, their well-documented roles in gene expression regulation, cell signaling, DNA repair, cell cycle progression, and apoptosis suggest potential involvement in the disease’s pathogenesis. For instance, mutations in GALNT12 or RMI1 leading to aberrant protein function may compromise the stability and proliferative capacity of endometrial cells. Simultaneously, dysregulated VRK1 activity could disrupt normal cell cycle control, potentially contributing to the onset and progression of endometriosis. Immunofiltration analysis identified 11 distinct immune cell types exhibiting differential infiltration patterns in endometriosis. Notably, M2 macrophages demonstrated reduced abundance in endometriotic tissues, whereas resting NK cells were significantly enriched. M2 macrophages are recognized for their role in tissue repair, angiogenesis, and tumor progression [[182]62]. Prior studies have reported a marked decline in M2 macrophage proportions across all stages of endometriosis in affected individuals [[183]63]. Consistent with these findings, the key genes identified in this study were downregulated in ectopic endometrial tissues and exhibited a positive correlation with M2 macrophage infiltration. NK cells, as critical components of the innate immune system, contribute to immune surveillance and tissue homeostasis. Within the endometrium, a specialized subset known as uterine NK (uNK) cells has been identified [[184]64]. Research indicates that CD16^+ uNK cells produce cytotoxic factors capable of affecting trophoblast function, potentially leading to infertility, miscarriage, or placental abnormalities [[185]65]. While this study observed a negative correlation between key genes and resting NK cells, alongside increased NK cell infiltration in ectopic endometrial tissues, the precise role of NK cells in endometriosis remains inconclusive [[186]66]. Further investigation with larger sample cohorts and additional functional validation is required. MicroRNAs (miRNAs), a class of short non-coding RNAs, regulate gene expression post-transcriptionally by modulating mRNA stability and translation efficiency. Aberrant miRNA expression has been extensively documented in endometriosis. This study predicted miRNA interactions with the three key genes, highlighting miR- 202, which has been reported to be upregulated in ectopic endometrial tissue. Notably, miR- 202 suppresses SOX6 expression, thereby enhancing the invasive capacity of ectopic endometrial cells [[187]67]. Although the miRNAs identified in this study have not been directly investigated in endometriosis, their involvement in other pathological conditions has been documented. In cervical cancer, RGMB-AS1 promotes tumor proliferation and invasiveness via the miR- 4428/PBX1 axis [[188]68], while in ovarian cancer, miR- 6086 suppresses angiogenesis by downregulating the OC2/VEGFA/EGFL6 signaling pathway [[189]69]. Within the mRNA-TF regulatory network, SPI1 was identified as a shared transcriptional regulator of the three key genes. Notably, SPI1 is upregulated in ectopic endometrial tissues, contributing to the aggressive phenotype of endometriotic lesions [[190]70]. Furthermore, drug repurposing analysis using the CTD and Enrichr databases identified 195 drug candidates targeting the key genes. This network encompasses a diverse range of therapeutic agents, including retinoic acid, which has demonstrated potential for endometriosis treatment by inhibiting estradiol secretion in ovarian endometriotic cysts and attenuating disease progression [[191]71]. Additionally, while enterolactones have not been studied in the context of endometriosis, their therapeutic potential in other malignancies has been explored. Specifically, enterolactones have been shown to enhance radiotherapy efficacy in breast cancer by inhibiting DNA repair mechanisms and promoting apoptotic pathways [[192]72]. The mechanisms underlying targeted drug actions are highly intricate, with potential impacts on disease progression mediated through diverse pathways. While most studies suggest that targeted therapies exert their effects primarily by downregulating the expression of target genes [[193]73, [194]74], their functional scope extends beyond mere gene suppression. For example, TP53 serves as a pivotal tumor suppressor gene, and its functional loss is implicated in the pathogenesis of numerous malignancies. Restoring TP53 activity via targeted therapies can reestablish its antitumor function, thereby inhibiting tumor progression [[195]75]. Similarly, FoxP3, a key transcription factor essential for the development and function of Tregs, plays a critical role in immune modulation. Upregulation of FoxP3 enhances the immunosuppressive capacity of Tregs, influencing the onset and progression of esophageal cancer [[196]76]. Given the multifaceted mechanisms of targeted drugs, identifying effective therapeutic targets is imperative for advancing treatment strategies, improving clinical outcomes, and improving patient prognosis. This study has inherent limitations stemming from its reliance on data sourced from multiple public databases, which may introduce potential biases. Furthermore, the analysis is predominantly bioinformatics-driven, lacking extensive experimental validation. Although preliminary immunohistochemical analysis corroborated the computational findings regarding the expression of key genes in tissue samples, additional validation is required. Future work will focus on quantifying gene expression using Western blot and quantitative polymerase chain reaction (qPCR) methodologies. Moreover, functional assays will be conducted at the cellular level, including gene overexpression experiments to assess the impact of these genetic alterations on cell proliferation, migration, invasion, and apoptosis. Conclusions Overall, this study employed an integrative approach combining differential gene expression analysis, WGCNA, MR analysis, and machine learning to identify three key genes associated with palmitoylation in endometriosis. Subsequent analyses explored immune infiltration dynamics, gene functional similarity, and pharmacological correlations. These findings provide novel insights that may inform clinical diagnostics, disease surveillance, and therapeutic development for endometriosis. Supplementary Information [197]Supplementary Material 1.^ (299KB, zip) Acknowledgements