Abstract

Background

Early diagnosis and early delivery are the main strategies for managing premature ovarian insufficiency (POI). However, POI warning markers, especially those that can be detected through noninvasive methods, are very limited; therefore, the identification of noninvasive markers for POI is urgent.

Methods

We acquired POI GWAS summary statistics from the FinnGen database. Summary data for the metabolome, circulating plasma proteins, gut microbiota, immunophenotypes, circulating microRNAs (miRNAs), and two proteomes were obtained as exposures for two-sample Mendelian randomization (MR). Inverse variance weighted (IVW) was the main method used to calculate the MR effect estimates. eQTL data (from the eQTLGen Consortium) were used for summary-data-based MR (SMR). Hub genes were identified using the STRING database and Cytoscape software. Potential mechanisms of POI were identified via pathway enrichment analysis of the identified genes and miRNAs.

Results

Three metabolites (sphinganine-1-phosphate levels, X-23636 levels, 4-methyl-2-oxopentanoate levels), two circulating plasma proteins (fibroblast growth factor 23 levels, neurotrophin-3 levels), one gut microbial taxon (Faecalibacterium abundance), one immunophenotype (HVEM on naive CD8+ T cells), 23 miRNAs (miR-500a-3p, miR-555, miR-584-5p, miR-642a-5p, miR-671-3p, miR-1324, miR-6870-3p, miR-1468-5p, miR-146a-3p, miR-221-3p, miR-3121-5p, miR-3184-3p, miR-3185, miR-335-5p, miR-4302, miR-4506, miR-6808-5p, miR-6894-5p, miR-145-5p, miR-149-3p, miR-23a-3p, miR-3141, and miR-374b-5p), and three hub genes (ESR1, ERBB2, and GART) were identified as warning markers for POI. Enrichment analysis indicated that pathways such as glutathione metabolism and the PI3 kinase pathway may be involved in the mechanisms regulating POI.

Conclusion

To our knowledge, this study is the first to identify noninvasive predictors of POI via MR, contributing to early warning and fertility guidance for patients with POI.
Supplementary Information

The online version contains supplementary material available at 10.1186/s13048-025-01696-1.

Keywords: Premature ovarian insufficiency, Genome-wide association study, Mendelian randomization, Non-invasive, Biomarkers

Introduction

Premature ovarian insufficiency (POI), which refers to a significant decline or loss of ovarian function in women before the age of 40 years, is a reproductive system disease that affects the physical and mental health of women [1]. POI is characterized by infertility, menstrual abnormalities (irregular or infrequent menstruation, or amenorrhea), and perimenopausal symptoms, with high levels of gonadotropins (in particular, follicle-stimulating hormone [FSH]) and low levels of oestradiol (E2) [2]. The global prevalence of POI is approximately 3.7% and is increasing [3]. POI lacks effective treatment; therefore, clinicians advocate early diagnosis and recommend early delivery [4]. However, later marriage and late childbearing are becoming increasingly common, and the number of POI patients who have never given birth is increasing yearly. In addition, POI warning markers, especially markers that can be detected through noninvasive sampling (such as blood, urine, and faeces), are very limited; therefore, many patients are already infertile at the initial diagnosis. As such, the identification of noninvasive markers for POI is urgently needed.

The mechanisms driving POI are highly complex and involve genetic factors, immune dysregulation, inflammation, medications, and psychological factors, which makes it difficult to identify POI biomarkers through a single omics approach. The integration of multiple omics data, such as genomics, epigenomics, transcriptomics, proteomics, and metabolomics data, may improve biomarker and drug target discovery capabilities [5]. For example, Zhaoyang Yu et al.
identified eight metabolic markers significantly correlated with ovarian reserve function by integrating metabolomics and transcriptomics [6]. The necessity of integrating multiple omics methods to study POI has gradually been recognized; however, multiomics research on POI is very limited.

With the increasing availability of genome-wide association studies (GWASs), Mendelian randomization (MR) has emerged as an efficient way to estimate the causal relationships between exposures (such as biomarkers) and outcomes (such as phenotypes) [7]. Summary-data-based Mendelian randomization (SMR) is a method similar to MR that uses data from GWASs and expression quantitative trait loci (eQTLs) to investigate whether the impact of SNPs on a phenotype is mediated by gene expression [8]. In recent years, the integration of omics data with MR has emerged as a prominent strategy for the discovery of disease biomarkers. For example, Qianhan Lin et al. conducted a proteome-wide MR analysis to identify potential plasma biomarkers for ovarian cancer [9]. Shaoxuan Liu et al. conducted a two-sample MR analysis to identify metabolites that drive ovarian cancer and thus could be candidate biomarkers [10].

In the present study, we integrated POI GWAS summary statistics with metabolome, gut microbiota, immunophenotype, plasma inflammatory protein, miRNA, eQTL and proteome data to identify noninvasive markers for POI by MR analysis. The overall study design is shown in Fig. 1.

Fig. 1. The overall study design. MR, Mendelian randomization; SMR, summary data-based Mendelian randomization; eQTL, expression quantitative trait loci

Materials and methods

Data sources

1. POI: GWAS summary results for POI (consortium definition) were obtained from the FinnGen R11 release (https://r11.finngen.fi/), which comprises 542 cases and 241,998 controls from Finland.
2. Immunophenotypes: A total of 731 immunophenotypes were obtained from the GWAS catalog (https://www.ebi.ac.uk/gwas/) (GCST90001391 to GCST90002121), encompassing comprehensive data collected from 3,757 Europeans [11].
3. Metabolome: A total of 1,091 blood metabolites were obtained from the GWAS catalog (GCST90199621 to GCST90201020), encompassing data collected from 50,000 Europeans [12].
4. Gut microbiota: Gut microbiota species were derived from the Germany Microbiome Project, which includes a total of 430 taxa in 8,956 individuals [13].
5. eQTL data: eQTL data were obtained from the eQTLGen Consortium, including summary data from 37 datasets involving 31,684 individuals [14].
6. Proteomes: Proteomic data were obtained from 35,559 Icelanders, encompassing 4,907 proteins [15], and from 54,306 UK Biobank participants, encompassing 2,904 proteins [16].
7. miRNAs: miRNA data were obtained from 710 unrelated Europeans, encompassing 2,083 miRNAs [17].
8. Plasma inflammatory proteins: Ninety-one plasma inflammatory proteins were collected from 14,824 Europeans [18].

To avoid sample overlap, samples from different countries were selected for the exposures and the outcome in the MR analysis.

Selection of instrumental variables

SNPs selected as instrumental variables (IVs) satisfied the three assumptions of MR: association with the exposure, independence from confounding factors, and an effect on the outcome solely via the exposure [19]. We set a threshold of P < 1 × 10^−5 for SNP selection. We also computed the F statistic to assess IV strength and retained IVs with F > 10 [20]. The linkage disequilibrium threshold was set at R^2 < 0.001 within a window of 10,000 kb.

MR analysis

Two-sample MR analyses were employed.
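The instrument filter described under "Selection of instrumental variables" (P < 1 × 10^−5 and F > 10) can be illustrated with a minimal Python sketch. The text does not state which F formula was used, so the sketch assumes a common approximation, R^2 = 2 × EAF × (1 − EAF) × beta^2 and F = R^2 (N − 2) / (1 − R^2), for a standardized exposure. The class name, function names, and toy SNP values are hypothetical, and linkage disequilibrium clumping is not shown because it requires a reference panel.

```python
from dataclasses import dataclass


@dataclass
class Snp:
    rsid: str
    beta_exposure: float  # per-allele effect on the (standardized) exposure
    eaf: float            # effect allele frequency
    p_exposure: float     # P value of the SNP-exposure association


def f_statistic(snp: Snp, n_samples: int) -> float:
    """Approximate single-SNP F statistic (assumed formula, not from the paper)."""
    r2 = 2.0 * snp.eaf * (1.0 - snp.eaf) * snp.beta_exposure ** 2
    return r2 * (n_samples - 2) / (1.0 - r2)


def select_instruments(snps, n_samples, p_threshold=1e-5, f_min=10.0):
    """Keep SNPs that pass both the P-value threshold and the F > f_min filter."""
    return [
        s for s in snps
        if s.p_exposure < p_threshold and f_statistic(s, n_samples) > f_min
    ]


# Toy values, invented purely for illustration.
toy_snps = [
    Snp("rs_toy_1", beta_exposure=0.5, eaf=0.30, p_exposure=1e-7),  # strong
    Snp("rs_toy_2", beta_exposure=0.3, eaf=0.01, p_exposure=1e-6),  # weak (low F)
    Snp("rs_toy_3", beta_exposure=0.5, eaf=0.30, p_exposure=1e-4),  # fails P filter
]
kept = select_instruments(toy_snps, n_samples=710)
print([s.rsid for s in kept])
```

In this toy input only the first SNP survives: the second has a small P value but explains too little exposure variance to reach F > 10, and the third misses the P-value threshold.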
Generally, the methods used to assess causality included inverse-variance weighted (IVW), MR-Egger (MRE), weighted median (WME), and weighted mode (WMO) [14]. Specifically, we employed the IVW method to calculate the MR effect estimates for traits with more than one instrument. For traits with more than three instruments, MRE, WME and WMO were further applied. An FDR-adjusted P value < 0.05 combined with an odds ratio (OR) > 1.5 or < 0.5 was considered statistically significant.

Sensitivity analysis

To perform sensitivity analyses, the intercept of the MR-Egger regression was first examined to assess the presence of directional horizontal pleiotropy; a P value less than 0.05 indicated the presence of pleiotropy. Cochran's Q statistic was used to evaluate heterogeneity among SNPs in the IVW estimates; a Q P value (Q_pval) less than 0.05 indicated the presence of heterogeneity.

Summary-data-based MR analysis

Putative functional genes involved in POI were identified via summary data-based Mendelian randomization (SMR). The SMR method was used to assess the causal association between gene expression and the phenotype by combining summary statistics obtained from GWASs and expression quantitative trait loci (eQTL) analyses. Here, we used eQTL data from the eQTLGen database. We used genome-wide significant SNPs as instrumental variables and conducted a heterogeneity in dependent instruments (HEIDI) test to distinguish causality or pleiotropy from linkage [8]. An FDR-adjusted P_SMR < 0.05 together with P_HEIDI > 0.05 was considered statistically significant.

Construction of protein-protein interaction network and identification of hub genes

We imported upregulated and downregulated functional genes (obtained through SMR analysis based on eQTLGen and MR analysis based on the proteomes) into the STRING database (https://cn.string-db.org/) to obtain protein interaction networks (confidence score: 0.4).
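To make the IVW estimator and Cochran's Q statistic from the MR analysis and sensitivity analysis descriptions above more concrete, the following is a minimal Python sketch under stated assumptions: a fixed-effect IVW model with first-order weights (beta_exposure^2 / se_outcome^2), computed from per-SNP summary statistics. The function names and toy numbers are hypothetical and do not reproduce the software or settings used in the study; the Q P value (a chi-square test with k − 1 degrees of freedom) is omitted to keep the sketch dependency-free.

```python
import math


def ivw_estimate(beta_exp, beta_out, se_out):
    """Fixed-effect IVW estimate, its standard error, and Cochran's Q."""
    # First-order inverse-variance weights for each SNP's Wald ratio.
    weights = [bx ** 2 / se ** 2 for bx, se in zip(beta_exp, se_out)]
    # Wald ratio per SNP: effect on outcome divided by effect on exposure.
    ratios = [by / bx for bx, by in zip(beta_exp, beta_out)]
    total_weight = sum(weights)
    beta = sum(w * r for w, r in zip(weights, ratios)) / total_weight
    se = math.sqrt(1.0 / total_weight)
    # Cochran's Q: weighted squared deviation of each ratio from the pooled estimate.
    q = sum(w * (r - beta) ** 2 for w, r in zip(weights, ratios))
    return beta, se, q


def passes_effect_filter(beta, fdr_p):
    """Significance rule from the text: FDR P < 0.05 and OR > 1.5 or OR < 0.5."""
    odds_ratio = math.exp(beta)  # binary outcome, so beta is a log odds ratio
    return fdr_p < 0.05 and (odds_ratio > 1.5 or odds_ratio < 0.5)


# Toy summary statistics, invented for illustration: every SNP has ratio 0.5.
toy_beta_exp = [0.10, 0.20, 0.30]
toy_beta_out = [0.05, 0.10, 0.15]
toy_se_out = [0.02, 0.02, 0.02]

beta_ivw, se_ivw, q_stat = ivw_estimate(toy_beta_exp, toy_beta_out, toy_se_out)
print(f"beta={beta_ivw:.3f} se={se_ivw:.4f} OR={math.exp(beta_ivw):.3f} Q={q_stat:.3g}")
```

Because all three toy SNPs give the same Wald ratio, the pooled estimate equals that ratio and Q is essentially zero; instruments with diverging ratios would inflate Q, which is the signal the heterogeneity test looks for.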
Cytoscape software (version 3.10.3) was used to identify the hub genes on the basis of connectivity. Specifically, "Analyze Network" in Cytoscape was employed with default parameters, and hub genes with degrees higher than 20 were identified. In addition, we used MCC (Maximal Clique Centrality), degree and betweenness in cytoHubba (version 0.1) to verify the hub genes.

Functional pathway enrichment analysis

Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis of upregulated and downregulated functional genes was conducted using Sangerbox (http://sangerbox.com) (version 3.0) [21]. Pathway enrichment analysis of miRNAs was conducted using miEAA (https://ccb-compute2.cs.uni-saarland.de/mieaa2/). An FDR-adjusted P value < 0.05 was considered significant.

Results

Causal effect of metabolites on POI

Since the outcome (POI) was a binary variable, we used the odds ratio (OR) to quantify the association between metabolites and POI [22]. The IVW estimates suggested that five metabolites had positive and one metabolite had negative causal effects on POI (Table S1 and Table S2, respectively). MRE, WME, and WMO verified the positive effects of sphinganine-1-phosphate levels and X-23636 levels and the negative effect of 4-methyl-2-oxopentanoate levels on POI (Fig. S1A and Fig. 2A-C).

Fig. 2. Scatter plots of the causal associations between metabolites (A-C)/plasma inflammatory proteins (D-E)/gut microbiota (F)/immunophenotype (G) and POI. This graph presents the MR analysis results. Each point represents an instrumental variable (SNP), and the lines at each point reflect the 95% confidence interval. The x-axis and the y-axis represent the effects of the SNPs on the exposure and the outcome, respectively. The regression line estimates the causal effect of the exposure (X) on the outcome (Y).
Different MR methods estimate the regression line differently, and different coloured lines represent different algorithms. A slope greater than or less than 0 indicates that the exposure factor is positively or negatively correlated with the outcome, respectively

Causal effect of plasma inflammatory proteins on POI

The IVW estimates suggested that three plasma inflammatory proteins had positive causal effects and none had negative causal effects on POI (Table S3 and Table S4, respectively). MRE, WME, and WMO verified the positive effects of fibroblast growth factor 23 levels and neurotrophin-3 levels on POI (Fig. S1B and Fig. 2D-E).

Causal effect of gut microbiota on POI

The IVW estimates suggested that one gut microbial taxon had a positive causal effect and none had a negative causal effect on POI (Table S5 and Table S6, respectively). MRE, WME, and WMO verified the positive effect of OTU99_16 (Faecalibacterium) abundance on POI (Fig. S1C and Fig. 2F).

Causal effect of immunophenotypes on POI

The IVW estimates suggested that no immunophenotype had a positive causal effect and one immunophenotype had a negative causal effect on POI (Table S7 and Table S8, respectively). MRE, WME, and WMO could not be used to verify the negative effect of HVEM on naive CD8+ T cells on POI, as there were fewer than three GWAS-significant SNPs (Fig. 2G).

Causal effect of miRNAs on POI and miRNA pathway enrichment analysis

The IVW estimates suggested that 19 miRNAs had positive and eight miRNAs had negative causal effects on POI (Table S9 and Table S10, respectively). MRE, WME, and WMO verified the positive effects of 16 miRNAs (Fig. 3A) and the negative effects of 7 miRNAs (Fig. 3B) on POI (Fig. S2A and Fig. S2B). miRNA pathway enrichment analysis revealed marked enrichment of miRNAs related to androgen-oestrogen-progesterone biosynthesis and the PI3 kinase pathway. Figure 4A and B present the noteworthy miRNA pathways.

Fig. 3. Scatter plots of the causal associations between miRNAs and POI

Fig. 4. Pathway enrichment analysis and protein-protein interactions. (A) Pathway enrichment analysis of the upregulated miRNAs. (B) Pathway enrichment analysis of the downregulated miRNAs. (C) Pathway enrichment analysis of the upregulated genes. (D) Pathway enrichment analysis of the downregulated genes. (E) Protein-protein interaction network for the upregulated genes. (F) Protein-protein interaction network for the downregulated genes

Functional genes for POI and KEGG pathway enrichment analysis

On the basis of 4,907 proteins obtained from 35,559 Icelanders, the IVW estimates suggested that 16 proteins had positive causal effects and none had negative causal effects on POI (Table S11 and Table S12, respectively). On the basis of 2,904 proteins obtained from 54,306 UK Biobank participants, the IVW estimates suggested that 52 genes had positive and 15 genes had negative causal effects on POI (Table S13 and Table S14, respectively). By combining GWAS summary data for POI with eQTL summary data obtained from eQTLGen, SMR and HEIDI analyses identified 469 potentially causal genes, including 226 genes that were positively (Table S15) and 243 genes that were negatively (Table S16) associated with POI. Overall, we identified 294 genes positively and 258 genes negatively associated with POI. KEGG pathway enrichment analysis revealed marked enrichment of genes related to glutathione metabolism and valine, leucine and isoleucine degradation. Figure 4C and D present the genes enriched in noteworthy pathways.

In addition to pathway enrichment, we performed protein-protein interaction analysis to identify hub genes. On the basis of the interaction connectivity, we identified three genes, ESR1, ERBB2 and GART (Fig. 4E and F), as hub genes for predicting POI.
Degree (Table S17 and Table S18), betweenness (Table S19 and Table S20) and MCC (Table S21 and Table S22) in cytoHubba further verified ESR1, ERBB2, and GART as hub genes, as each ranked first or second.

Sensitivity analysis

Genetic pleiotropy did not significantly influence the results (Table 1). In addition, Cochran's Q test revealed no significant heterogeneity under either the MR-Egger method or the IVW method. HVEM on naive CD8+ T cells was excluded from the Mendelian randomization sensitivity analysis, as there were fewer than three GWAS-significant SNPs.

Table 1. Heterogeneity and pleiotropy analyses

Columns 3 to 6 report the heterogeneity test; columns 7 to 9 report the pleiotropy test.

Exposure | Outcome | IVW Q | IVW Q p-value | MR-Egger Q | MR-Egger Q p-value | MR-Egger regression intercept | SE | p-value
Sphinganine-1-phosphate levels | POI | 7.75 | 0.96 | 7.73 | 0.93 | 0 | 0.06 | 0.91
X-23636 levels | POI | 19.41 | 0.20 | 17.53 | 0.23 | 0 | 0.07 | 0.24
4-methyl-2-oxopentanoate levels | POI | 13.48 | 0.14 | 11.50 | 0.17 | 0.13 | 0.10 | 0.27
Fibroblast growth factor 23 levels | POI | 16.24 | 0.64 | 16.75 | 0.67 | -0.03 | 0.04 | 0.48
Neurotrophin-3 levels | POI | 34.00 | 0.08 | 34.01 | 0.10 | 0 | 0.04 | 0.97
OTU99_16 (Faecalibacterium) abundance | POI | 6.74 | 0.47 | 6.86 | 0.55 | 0.04 | 0.12 | 0.73
HVEM on naive CD8+ T cell | POI | 8.68 | 0.99 | - | - | - | - | -
miR-500a-3p | POI | 5.25 | 0.73 | 5.47 | 0.79 | 0 | 0.07 | 0.65
miR-555 | POI | 1.36 | 0.716 | 1.42 | 0.84 | 0.02 | 0.09 | 0.82
miR-584-5p | POI | 1.76 | 0.88 | 2.77 | 0.84 | 0.07 | 0.07 | 0.36
miR-642a-5p | POI | 3.29 | 0.35 | 3.31 | 0.51 | -0.01 | 0.10 | 0.90
miR-671-3p | POI | 8.41 | 0.49 | 9.03 | 0.53 | -0.04 | 0.06 | 0.45
miR-1324 | POI | 2.120 | 0.71 | 2.65 | 0.75 | 0.07 | 0.09 | 0.51
miR-6870-3p | POI | 4.57 | 0.337 | 5.85 | 0.32 | 0.09 | 0.0923 | 0.34
miR-1468-5p | POI | 4.35 | 0.82 | 4.6 | 0.87 | -0.04 | 0.08 | 0.62
miR-146a-3p | POI | 9.53 | 0.39 | 10.29 | 0.42 | -0.05 | 0.06 | 0.41
miR-221-3p | POI | 3.68 | 0.45 | 3.74 | 0.59 | -0.02 | 0.09 | 0.81
miR-3121-5p | POI | 3.70 | 0.72 | 3.72 | 0.81 | 0.02 | 0.11 | 0.87
miR-3184-3p | POI | 4.00 | 0.86 | 7.01 | 0.64 | -0.09 | 0.05 | 0.12
miR-3185 | POI | 0.57 | 0.99 | 0.72 | 0.99 | -0.02 | 0.06 | 0.73
miR-335-5p | POI | 8.09 | 0.53 | 9.17 | 0.52 | -0.07 | 0.07 | 0.33
miR-4302 | POI | 6.26 | 0.39 | 6.26 | 0.51 | 0 | 0.10 | 0.99
miR-4506 | POI | 1.54 | 0.99 | 1.84 | 0.99 | -0.04 | 0.07 | 0.60
miR-6808-5p | POI | 1.60 | 0.90 | 2.55 | 0.86 | -0.08 | 0.09 | 0.37
miR-6894-5p | POI | 2.10 | 0.72 | 2.16 | 0.83 | 0.05 | 0.20 | 0.81
miR-145-5p | POI | 4.64 | 0.33 | 5.79 | 0.33 | 0.11 | 0.1· | 0.38
miR-149-3p | POI | 2.84 | 0.24 | 3.00 | 0.39 | 0.08 | 0.22 | 0.77
miR-23a-3p | POI | 6.11 | 0.64 | 6.14 | 0.73 | -0.01 | 0.07 | 0.86
miR-3141 | POI | 14.78 | 0.32 | 16.09 | 0.31 | -0.08 | 0.07 | 0.30
miR-374b-5p | POI | 8.04 | 0.71 | 9.91 | 0.62 | -0.05 | 0.04 | 0.20
ESR1 | POI | 21.08 | 0.74 | 25.67 | 0.54 | 0 | 0.03 | 0.04
ERBB2 | POI | 47.04 | 0.59 | 48.99 | 0.55 | 0.03 | 0.02 | 0.17

Discussion

Markers, especially noninvasive warning markers, for POI are limited. To the best of our knowledge, this is the first study to combine MR analysis with multiomics to identify noninvasive markers for POI.

In terms of metabolites, we are the first to propose that the upregulation of sphinganine-1-phosphate levels and the downregulation of 4-methyl-2-oxopentanoate levels may indicate POI. The potential mechanisms are currently unclear. Sphinganine-1-phosphate is a signalling molecule that may regulate cell growth, immune reactions, and apoptosis [23]. Therefore, we hypothesize that sphinganine-1-phosphate may also regulate the apoptosis of oocytes and granulosa cells during folliculogenesis.

For plasma inflammatory proteins, we revealed that the upregulation of fibroblast growth factor 23 and neurotrophin-3 may indicate POI. Mechanistically, the involvement of fibroblast growth factor 23 in fibrosis has been suggested [24]. Therefore, we assume that fibroblast growth factor 23 may also induce ovarian stromal fibrosis, thereby lowering the quantity and quality of ovarian follicles. Neurotrophin-3, a neuroprotective growth factor, is involved mainly in neuronal regeneration and the restoration of the physiological activity of neurons [25]. The relationship between neurotrophin-3 and POI is currently unclear. Our results suggest that neurotrophin-3 may be involved in oogenesis and promote POI.
OTU99_16 (Faecalibacterium) abundance and HVEM on naive CD8+ T cells were identified as potential gut microbiota and immunocyte predictors of POI, respectively. Faecalibacterium is associated with the production of short-chain fatty acids, which have immunoregulatory effects that maintain host immune and metabolic homeostasis for disease prevention and treatment [26]. Min Liu et al. proposed that short-chain fatty acids affect the development of female reproductive disorders [27]. Therefore, Faecalibacterium may be involved in POI through the production of short-chain fatty acids. Herpesvirus entry mediator (HVEM), which belongs to the tumour necrosis factor receptor (TNFR) superfamily, is recognized as a new immune target. HVEM is widely expressed on a variety of immune cells, including naive T and B cells, dendritic cells, and natural killer (NK) cells. By binding with BTLA (B- and T-lymphocyte attenuator) or LIGHT (lymphotoxin-like inducible protein that competes with glycoprotein D for herpes virus entry into T cells), HVEM inhibits the activation and proliferation of T cells and B cells [28]. Consistent with the results of our study, Lingbin Qi et al. found a significant reduction in naive CD8+ T cells in patients with ovulation dysfunction, such as POI [29].

Twenty-three miRNAs were identified as potential markers for POI. Some miRNAs have been reported to be closely associated with POI. For example, in line with a previous study suggesting that miR-642a-5p inhibition may upregulate FOXO1 expression and thus protect ovarian function [30], we hypothesize that the upregulation of miR-642a-5p expression may promote POI. In addition, Yanyang Lu et al. found that miR-145-5p attenuates granulosa cell oxidative injury and apoptosis and thus alleviates ovarian injury and improves ovarian function in POI model rats [31]. The results of our study also suggested that miR-145-5p is negatively correlated with POI.
The role of other miRNAs in POI is currently unclear.

Alterations in gene expression levels can indicate the occurrence and development of diseases. In our study, we hypothesized that the upregulation of ESR1 and ERBB2 expression in plasma may indicate POI. ESR1, which contributes to ovarian functions such as folliculogenesis and steroidogenesis [32], has been reported to be a therapeutic target for POI [33, 34]. Jeong Yong Lee et al. reported that ESR1 rs9340799 and rs2234693 are associated with POI incidence [35]. ERBB2 is overexpressed in numerous cancers, including breast and ovarian tumours. We are the first to propose that the upregulation of ERBB2 expression may also be involved in POI through a mechanism mediated by tyrosine kinases [36]. In addition, we revealed that the downregulation of GART expression may promote POI; however, the mechanisms involved need to be explored further.

On the basis of our findings, we believe that ovarian function should be monitored more closely in females with alterations in the identified metabolites, plasma inflammatory proteins, gut microbiota, immunophenotypes, miRNAs and genes, as these individuals are at increased risk of POI.

Our research has several limitations. First, the data sources used in our study were predominantly European cohorts. Data from different ethnic populations may yield different results, which limits the applicability of our results to other populations, such as Han Chinese and African cohorts. In future research, we will collect more omics data to explore the biomarkers in other populations. In addition, clinical cohort studies and functional investigations (such as in vitro or in vivo studies) of the newly discovered markers of POI are needed.

Conclusion

Our study identified noninvasive warning markers for POI through MR analysis. The findings of this study contribute to female fertility guidance.
Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary Material 1 (117.9KB, xlsx)
Supplementary Material 2 (12.1KB, docx)
Supplementary Material 3 (791.9KB, png)
Supplementary Material 4 (1.2MB, png)

Acknowledgements