Abstract

Preeclampsia (PE) is a serious pregnancy complication that contributes to maternal and perinatal morbidity and mortality. Understanding its pathogenesis and revealing predictive biomarkers are essential for guiding treatment decisions. To explore the global changes of serum metabolites in PE patients and identify potential predictive biomarkers for suspected PE patients (pregnant women who had already shown PE-related symptoms in the middle to late stages of pregnancy but had not yet received a confirmed diagnosis of PE), a large-scale serum metabolomic analysis was conducted with a prospective cohort of 328 suspected PE patients in the middle or late stages of pregnancy, as well as a retrospective cohort of 30 healthy pregnant women and 30 PE patients. Serum metabolomic profiling by liquid chromatography-mass spectrometry (LC-MS) revealed that the development of PE was closely associated with disturbed amino acid metabolism. Moreover, a panel of seven predictive biomarkers for PE development, comprising 2-methyl-3-hydroxy-5-formylpyridine-4-carboxylate, gamma-glutamyl-leucine, 2-hydroxyvaleric acid, LysoPC(16:1(9Z)/0:0), PC(DiMe(13,5)/MonoMe(13,5)), ADP-D-glycero-beta-D-manno-heptose and phenylalanyl-tryptophan, was identified through multiple statistical analyses and LASSO regression analysis. The combination of these biomarkers showed promise in predicting PE development in suspected PE patients, with AUCs of 0.753 and 0.885 in the discovery and validation cohorts, respectively. These findings highlight the potential of large-scale prospective metabolomic studies combined with machine learning algorithms in identifying key biomarkers for predicting PE development, while retrospective metabolomics studies provide insights into the pathogenesis of PE.

Supplementary Information

The online version contains supplementary material available at 10.1038/s41598-025-87905-9.
Keywords: Preeclampsia, Serum metabolomic study, Predictive biomarkers, Altered metabolic pathways, Machine learning algorithms

Subject terms: Diseases, Biomarkers, Predictive markers

Introduction

Preeclampsia (PE) is a common and serious complication during pregnancy and postpartum, characterized by new-onset hypertension (> 140/90 mmHg) and proteinuria (> 300 mg/24 h), typically after 20 weeks of pregnancy. It can also be superimposed on preexisting hypertension or renal disease. PE is now recognized as a public health concern for several reasons: (1) it is diagnosed in 2–5% of pregnancies, with a higher incidence for other forms of pregnancy-related hypertension^1; (2) there is an urgent need for more effective diagnostic biomarkers to assist clinical decision-making for suspected PE patients, because the traditional clinical indicators have limited predictive value for PE or its adverse pregnancy outcomes^2,3; (3) PE is a complex multisystem disease that triggers problems in the liver, kidney, brain, and clotting system^4 and, worse, increases the risk of long-term complications, including metabolic disease and cardiovascular disease, in both mothers and offspring^5; and (4) there is currently no cure for PE, and treatments focus mainly on relieving the symptoms and minimizing complications^6.

Recent research has focused on elucidating how PE develops and on identifying biomarkers for its prediction. The current strategy for predicting preeclampsia is based on a combination of baseline maternal factors, biophysical parameters, and placental-associated proteins^7. Although this combination is effective in predicting PE in the early stage of pregnancy, it falls short in predicting PE development in women who have already shown PE-related symptoms^8.
Studies suggested that PE might stem from an altered maternal pattern of circulating placentally derived proteins that regulate angiogenesis^9, and a specific sFlt-1:PlGF ratio cutoff of 38 has shown promise in predicting PE development in women with clinical suspicion of the condition^10. However, evidence for the diagnostic effectiveness of the ratio in screening women without clinical suspicion of the disease is poor^11.

The metabolome, as the final manifestation of integrated upstream biological information flow, determines the eventual phenotype. Compared with targeted metabolomics, which focuses on well-defined metabolites, high-throughput untargeted metabolomics aims at monitoring all low-molecular-weight metabolites in a biological fluid and has been widely used to discover specific metabolic patterns of diseases^12. In addition, recent work has highlighted the potential of machine learning (ML) algorithms for processing large amounts of data and screening candidate biomarkers effectively. The combination of untargeted metabolomics and ML provides comprehensive insights into metabolic alterations, leading to improved predictions^13–15.

Despite advancements in metabolomics for PE, challenges remain for clinical application. The heterogeneity of metabolites contributed to poor specificity and low positive predictive values (PPV, 8–33%)^16, leading to unnecessary tests and interventions for false-positive patients. Additionally, previous PE predictive models were established between healthy controls (HC) and PE patients, so the differences between suspected-PE^+ (patients with PE-related symptoms who are later diagnosed with PE) and suspected-PE^− (patients with PE-related symptoms who do not develop PE) were not revealed. Therefore, more diverse populations should be recruited to validate the biomarkers.
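As a brief illustration of how a PPV relates to the other quantities of a binary test, the sketch below applies Bayes' rule to sensitivity, specificity and prevalence. It is not part of the original study: the function name `predictive_values` and all numerical inputs are illustrative assumptions, chosen only to show that the same test yields a much lower PPV when the condition is rare.

```python
def predictive_values(sensitivity: float, specificity: float, prevalence: float):
    """Return (PPV, NPV) for a binary test via Bayes' rule.

    All three inputs are proportions in (0, 1). The values used below
    are illustrative assumptions, not results from the study.
    """
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


if __name__ == "__main__":
    # Same hypothetical test, evaluated at three hypothetical prevalences.
    for prevalence in (0.02, 0.05, 0.20):
        ppv, npv = predictive_values(0.80, 0.80, prevalence)
        print(f"prevalence={prevalence:.2f}  PPV={ppv:.3f}  NPV={npv:.3f}")
```

Under these assumptions, PPV rises with prevalence while sensitivity and specificity stay fixed, which is one reason a predictor evaluated in a symptomatic (higher-prevalence) population can report a higher PPV than the same predictor used for general screening.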
In this work, an integrated workflow for metabolome profiling of maternal serum was carried out to explore the global changes of serum metabolites in PE and identify potential predictive biomarkers for PE development. First, HC, PE patients and suspected PE participants were recruited in independent experiments for an untargeted serum metabolomics study. Second, multivariate statistical analysis was employed to reveal the metabolic differences between HC vs. PE and suspected-PE^− vs. suspected-PE^+. Third, predictive biomarkers for suspected PE patients were identified and validated in two cohorts. To our knowledge, this work is the first to reveal candidate predictive biomarkers for suspected PE patients based on large-scale prospective serum metabolomics. We envision that this integrated research could deepen our understanding of PE.

Results

Characteristics of serum metabolomic profiles in all groups based on untargeted metabolomics

Studies on determinants of PE in pregnant women suggested parity, age, pre-pregnancy BMI and gestational week at sampling as potential covariates of PE during pregnancy^16–18. As shown in Fig. 1, retrospective and prospective cohorts were constructed and the baseline characteristics of the study population, including these potential covariates, were summarized; no statistically significant differences were observed (Table 1). Following appropriate pre-treatment, serum samples were injected into UPLC-QE-MS for untargeted metabolomic analysis. A total of 8033 features were detected in the positive mode, and 6291 features were detected in the negative mode (Figure S1). After signal de-noising and dataset normalization, 1569 metabolites in the positive mode and 825 metabolites in the negative mode were annotated using the public database. A total of 187 compounds were identified in both positive and negative ion modes (Fig. 2A).
Among these annotated metabolites, lipids and lipid-like molecules were the most abundant, accounting for 41.31% of the total metabolites. They were followed by organic acids and derivatives, organoheterocyclic compounds, benzenoids, organic oxygen compounds, phenylpropanoids and polyketides, and organic nitrogen compounds, among others (Fig. 2B).

Fig. 1. The schematic diagram represents the strategy for discovery and validation of the serum candidate biomarkers for PE prediction.

Table 1. The baseline characteristics of the study population.

| Group | Age, mean ± SD | Pre-pregnancy BMI | Sampling week | Diagnosis week |
| --- | --- | --- | --- | --- |
| Retrospective cohort | | | | |
| HC (n = 30) | 32.83 ± 4.39 | 23.84 ± 3.17 | 28.73 ± 4.47 | |
| PE (n = 30) | 33.33 ± 4.86 | 25.35 ± 4.02 | 29.23 ± 3.79 | 35.90 ± 2.95 |
| p^a value | 0.677 | 0.112 | 0.642 | |
| Discovery cohort | | | | |
| Suspected-PE^− (n = 215) | 32.75 ± 4.63 | 24.18 ± 4.29 | 30.45 ± 4.19 | |
| Suspected-PE^+ (n = 47) | 33.91 ± 4.47 | 25.11 ± 4.77 | 28.93 ± 5.27 | 36.17 ± 2.91 |
| p^a value | 0.117 | 0.095 | 0.067 | |
| Validation cohort | | | | |
| Suspected-PE^− (n = 56) | 34.00 ± 4.88 | 23.92 ± 3.81 | 28.41 ± 4.52 | |
| Suspected-PE^+ (n = 10) | 31.90 ± 4.04 | 24.43 ± 4.81 | 31.40 ± 3.53 | 35.40 ± 5.19 |
| p^a value | 0.204 | 0.705 | 0.052 | |
| p^b value | 0.413 | 0.307 | 0.193 | 0.779 |

^a Compared inter-groups; ^b compared intra-groups.

Fig. 2. Venn diagram of metabolites detected in the positive ion mode and negative ion mode (A); types and proportions of metabolites detected in untargeted metabolomics (B); score scatter plot from partial least squares discriminant analysis (PLS-DA) of the quantitative metabolome data of serum samples from HC, PE, suspected-PE^− and suspected-PE^+ groups (C).

The metabolites collected in the positive and negative ion modes were combined for multivariate statistical analysis. However, the unsupervised pattern recognition method was not sufficient to differentiate the four groups (Figure S2).
Subsequent partial least squares discriminant analysis (PLS-DA) revealed that HC samples were mainly located in the first quadrant, with mild variations observed for suspected-PE^− samples. In contrast, substantial intra-group variations were observed for suspected-PE^+ samples and, in particular, for PE samples. Notably, the samples deviated progressively from the normal physiological pattern as PE developed (Fig. 2C), suggesting that certain candidate biomarkers could be used to reflect the pathophysiological progression of PE. To validate the reliability of the PLS-DA model, a permutation test was carried out and no overfitting was observed (Figure S3). Additionally, an unsupervised principal component analysis (PCA) was performed to evaluate the variance of QC samples (Figure S4). The QC samples clustered closely within a range of no more than two standard deviations, demonstrating the stability of the analysis system and the high quality of the data.

Differential metabolites and dysregulated metabolic pathways between healthy pregnant women and PE patients

To enhance the separation between groups and identify key differential metabolites, supervised orthogonal partial least squares discriminant analysis (OPLS-DA) was performed. The score scatter plot showed a clear separation between HC and PE (Fig. 3A). To check for overfitting, a permutation test with 200 permutations was carried out, and the results suggested that the supervised model was able to provide an objective comparison between HC and PE (Figure S5). Metabolites with a VIP value above 1.5 were dispersed from the origin in the loading plot (Fig. 3B). Additionally, using the univariate nonparametric Wilcoxon test, 98 differential metabolites with VIP > 1.5, p < 0.05 and fold change (FC) > 1.2 or FC < 0.8 were recognized as the primary contributors to group classification (Fig. 3C).

Fig. 3. Orthogonal partial least squares discriminant analysis (OPLS-DA) of serum samples from HC and PE groups: score scatter plots (A) and S-plot loading plots (B); volcano plot analysis of significantly altered metabolites in the serum samples of HC versus PE (C); ROC analysis of the candidate biomarkers for PE prediction in the retrospective cohort (D).

Among the 98 metabolites, 73 were up-regulated and 25 were down-regulated in the PE group. These significantly regulated metabolites fell into diverse categories of structural identities, including organic compounds, bile acids, steroid hormones, amino acids, nucleotides, and purine metabolites. The five up-regulated and five down-regulated metabolites with the largest FC are highlighted in Fig. 3C. Notably, citric acid, taurocholic acid, 3-dehydrosphinganine, glycocholic acid and 3-phosphonooxypyruvate were significantly elevated in PE, while cortisol, estrone sulfate, dehydroepiandrosterone sulfate, L-phenylalanine and L-cysteine were significantly reduced compared to HC.

Afterwards, a genetic algorithm-based evolutionary optimization method was employed to identify discriminant metabolites, with the top metabolites selected from 200 repeated 10-fold cross-validations as potential biomarkers to enhance the reliability and clinical applicability of the prediction models. A LASSO machine learning model was then used for variable selection to yield a concise panel (Figure S6). Ultimately, a total of 10 metabolites, including dehydroepiandrosterone sulfate, cortisol, valproic acid glucuronide, L-proline, 3-hydroxybutyric acid, L-histidine, 2-hydroxybutyric acid, L-glutamic acid, citric acid and 3-phosphonooxypyruvate, were identified (Table S1). As depicted in Fig. 3D, these selected biomarkers exhibited an area under the curve (AUC) ranging from 0.635 to 0.910 in the retrospective cohort. Unexpectedly, the combination of biomarkers showed outstanding accuracy (AUC = 1).
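For readers less familiar with the metric, the following minimal sketch shows one common way to read an AUC: as the probability that a randomly chosen case receives a higher score than a randomly chosen control, with ties counted as one half. This is not the analysis code used in the study; the function name `roc_auc` and the toy scores are assumptions made purely for illustration.

```python
def roc_auc(case_scores, control_scores):
    """AUC as P(case score > control score) + 0.5 * P(tie).

    Pairwise (Mann-Whitney style) definition; fine for small samples,
    quadratic in the number of samples.
    """
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


if __name__ == "__main__":
    # Hypothetical scores: every case outranks every control.
    print(roc_auc([0.8, 0.9, 0.7], [0.1, 0.3, 0.2]))
    # Hypothetical scores with overlap between the groups.
    print(roc_auc([0.2, 0.6, 0.9], [0.1, 0.4, 0.7]))
```

On this reading, an AUC of 1 means that the combined score ranked every PE sample above every HC sample in the data on which it was evaluated; how well that ranking carries over to independent samples is a separate question that only further validation can answer.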
Subsequently, the up-regulated and down-regulated metabolites were analyzed separately using MetaboAnalyst 5.0 to identify enriched pathways. The down-regulated metabolites were mainly enriched in steroid hormone biosynthesis (p < 0.05, FDR < 0.25) (Fig. 4A, Table S2). The up-regulated metabolites, in turn, were associated with pathways such as phenylalanine, tyrosine and tryptophan biosynthesis, nitrogen metabolism, phenylalanine metabolism, glyoxylate and dicarboxylate metabolism, glycine, serine and threonine metabolism, biosynthesis of unsaturated fatty acids, arginine biosynthesis, butanoate metabolism and histidine metabolism (Fig. 4B, Table S3). However, some metabolites can participate in multiple metabolic pathways. To explore the potential functional relationships between these key metabolites, a metabolite-metabolite interaction network was built. As shown in Fig. 4C, key metabolites such as L-glutamic acid, citric acid, L-phenylalanine, L-cysteine and L-proline, which had high degrees in the network, appeared to play critical roles in connecting different pathways, suggesting that dysregulated amino acid metabolism underlies the various dysregulated metabolic pathways in PE.

Fig. 4. Enrichment pathway analysis of the metabolites that were differentially downregulated (A) or upregulated (B) in PE versus HC; metabolite-metabolite interaction network analysis of the differential metabolites between PE and HC (C).

Biomarkers discovery between suspected-PE^− and suspected-PE^+ groups

Predicting the development of PE in patients with PE-related symptoms is crucial in clinical practice. Therefore, a prospective study was conducted with a large sample of suspected-PE pregnant women (n = 336), who were randomly divided into a discovery cohort and a validation cohort.
The discovery cohort aimed to reveal the metabolic differences between suspected-PE^− and suspected-PE^+ groups; however, initial analysis using PCA did not show clear distinctions between the groups (Figure S7). Subsequently, a supervised OPLS-DA model was employed to highlight the differences more effectively. The model demonstrated a slight separation in metabolite profiles between the two groups (Fig. 5A), supported by satisfactory permutation results between two similar groups (Figure S8). Suspected-PE^− samples were predominantly distributed on the positive half of the X-axis, while suspected-PE^+ samples were mainly on the negative half. According to the corresponding loading plots, 117 variables with VIP > 1.5 were retained and are presented in Fig. 5B. Moreover, volcano plots were used to visualize statistical significance and fold change values, leading to the identification of 19 differential metabolites meeting the criteria VIP > 1.5, p < 0.05, and FC > 1.2 or FC < 0.8 (Fig. 5C, Table S4). Based on the significantly altered metabolites, metabolite set enrichment analysis (MSEA) and KEGG pathway analysis were used to determine the altered metabolic pathways in suspected-PE^+. The enriched pathways were mainly involved in vitamin B6 metabolism, glutathione metabolism, arginine and proline metabolism and the Warburg effect (Fig. 5D).

Fig. 5. Orthogonal partial least squares discriminant analysis (OPLS-DA) of serum samples from suspected-PE^− and suspected-PE^+ groups: score scatter plots (A) and S-plot loading plots (B); volcano plot analysis of metabolites significantly altered in the serum samples of suspected-PE^− versus suspected-PE^+ (C); enrichment pathway analysis of the metabolites that were differentially altered in suspected-PE^− versus suspected-PE^+ (D).
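The three-part selection rule used for differential metabolites (VIP, p value and fold change) can be sketched as a simple filter. The sketch assumes that VIP scores, p values and fold changes have already been computed elsewhere; the class `MetaboliteStat`, the function `select_differential`, the direction of the fold change (case mean divided by reference mean) and the example values are all hypothetical and serve only to make the thresholds concrete.

```python
from dataclasses import dataclass


@dataclass
class MetaboliteStat:
    name: str
    vip: float          # VIP score from the supervised model
    p_value: float      # p value from the univariate test
    fold_change: float  # assumed here: mean(case) / mean(reference)


def select_differential(stats, vip_min=1.5, p_max=0.05, fc_up=1.2, fc_down=0.8):
    """Keep metabolites with VIP > vip_min, p < p_max and FC outside [fc_down, fc_up]."""
    selected = []
    for s in stats:
        if s.vip <= vip_min or s.p_value >= p_max:
            continue
        if s.fold_change > fc_up:
            selected.append((s.name, "up"))
        elif s.fold_change < fc_down:
            selected.append((s.name, "down"))
    return selected


if __name__ == "__main__":
    # Hypothetical statistics for five placeholder metabolites.
    example = [
        MetaboliteStat("m1", 2.0, 0.010, 1.5),  # passes, up-regulated
        MetaboliteStat("m2", 1.8, 0.030, 0.6),  # passes, down-regulated
        MetaboliteStat("m3", 1.2, 0.001, 2.0),  # fails the VIP threshold
        MetaboliteStat("m4", 1.9, 0.200, 1.4),  # fails the p-value threshold
        MetaboliteStat("m5", 1.7, 0.010, 1.0),  # fails the fold-change threshold
    ]
    print(select_differential(example))
```

All three conditions must hold at once, and the fold-change condition is two-sided: a metabolite qualifies when it is either sufficiently increased or sufficiently decreased.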
Development and validation of the diagnostic model for PE in suspected patients

Systematic metabolomic investigations revealed metabolic alterations in serum between suspected-PE^− and suspected-PE^+ individuals. Differential metabolites were analyzed using logistic regression models with LASSO-type parameter constraints to establish the prediction models. A panel of seven potential biomarkers, comprising 2-methyl-3-hydroxy-5-formylpyridine-4-carboxylate, gamma-glutamyl-leucine, 2-hydroxyvaleric acid, LysoPC(16:1(9Z)/0:0), PC(DiMe(13,5)/MonoMe(13,5)), ADP-D-glycero-beta-D-manno-heptose and phenylalanyl-tryptophan, was selected for predicting PE diagnosis (Fig. 6A). These differentially altered metabolites fell into diverse categories of structural identities: pyridinecarboxylic acids and derivatives; amino acids, peptides, and analogues; fatty acids and conjugates; glycerophosphocholines; and purine nucleotide sugars. The corresponding intercept and coefficients in the LASSO model are summarized in Table 2. As illustrated in Fig. 6B, the level of 2-methyl-3-hydroxy-5-formylpyridine-4-carboxylate was significantly down-regulated from suspected-PE^− to suspected-PE^+, while the levels of gamma-glutamyl-leucine, 2-hydroxyvaleric acid, LysoPC(16:1(9Z)/0:0), PC(DiMe(13,5)/MonoMe(13,5)), ADP-D-glycero-beta-D-manno-heptose and phenylalanyl-tryptophan were significantly elevated in suspected-PE^+.

Fig. 6. The minimum penalty coefficient model constructed using the LASSO regression model (A); comparison of the relative abundance of candidate biomarkers between suspected-PE^− (n = 215) and suspected-PE^+ (n = 47) groups (B).

Table 2. The metabolites with the highest potential prognostic significance identified by LASSO regression analysis.
| Metabolites | Coefficient | HMDB ID | Super class | Sub class |
| --- | --- | --- | --- | --- |
| (Intercept) | 4.41 | | | |
| 2-methyl-3-hydroxy-5-formylpyridine-4-carboxylate | 2.4e−7 | HMDB0006954 | Organoheterocyclic compounds | Pyridinecarboxylic acids and derivatives |
| gamma-glutamyl leucine | −7.2e−8 | HMDB0011171 | Organic acids and derivatives | Amino acids, peptides, and analogues |
| 2-hydroxyvaleric acid | −3.5e−9 | HMDB0001863 | Lipids and lipid-like molecules | Fatty acids and conjugates |
| LysoPC(16:1(9Z)/0:0) | −1.6e−9 | HMDB0010383 | Lipids and lipid-like molecules | Glycerophosphocholines |
| PC(DiMe(13,5)/MonoMe(13,5)) | −9.1e−8 | HMDB0061455 | Lipids and lipid-like molecules | Glycerophosphocholines |
| ADP-D-glycero-beta-D-manno-heptose | −2.4e−7 | HMDB0029952 | Nucleosides, nucleotides, and analogues | Purine nucleotide sugars |
| Phenylalanyl-tryptophan | −8.6e−8 | HMDB0029006 | Organic acids and derivatives | Amino acids, peptides, and analogues |

ROC analysis and the area under the ROC curve (AUC) were used to assess the diagnostic performance of all parameters. In the ROC analysis of the discovery cohort, the seven biomarker candidates ranked in descending order of AUC as follows: phenylalanyl-tryptophan, gamma-glutamyl-leucine, 2-methyl-3-hydroxy-5-formylpyridine-4-carboxylate, PC(DiMe(13,5)/MonoMe(13,5)), lysoPC(16:1(9Z)/0:0), 2-hydroxyvaleric acid and ADP-D-glycero-beta-D-manno-heptose, with the best AUC of 0.673 (95% CI, 0.597–0.750). Because multivariate ROC curve analysis may offer a more effective approach for creating and evaluating predictive biomarker models than univariate ROC curve analysis, the seven candidates were combined. The combination yielded an AUC of 0.753 (95% CI, 0.683–0.824) and achieved a sensitivity of 67.91%, a specificity of 70.21%, a PPV of 80.86% and an NPV of 65.44% at the cutoff determined by Youden's index (Fig. 7A; Table 3). In the independent validation cohort, the panel of biomarker candidates showed the best predictive power with an AUC of 0.885 (95% CI, 0.789–0.982).
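Since the reported sensitivity and specificity depend on the cutoff chosen by Youden's index, a minimal sketch of that selection rule may be helpful. Youden's index is J = sensitivity + specificity − 1, and the chosen cutoff is the one that maximizes J. The code below is an illustration under stated assumptions rather than the study's implementation: the function name `youden_cutoff`, the convention that a score at or above the cutoff is called positive, the tie-breaking rule and the toy data are all hypothetical.

```python
def youden_cutoff(scores, labels):
    """Pick the cutoff maximizing J = sensitivity + specificity - 1.

    labels use 1 for the positive class and 0 for the negative class;
    a sample is called positive when its score is >= cutoff.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best = None
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cutoff)
        sensitivity = tp / n_pos
        specificity = tn / n_neg
        j = sensitivity + specificity - 1.0
        # Strict ">" keeps the lowest cutoff when several cutoffs tie on J.
        if best is None or j > best["J"]:
            best = {"cutoff": cutoff, "sensitivity": sensitivity,
                    "specificity": specificity, "J": j}
    return best


if __name__ == "__main__":
    # Hypothetical model scores and true labels for eight samples.
    scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
    labels = [0, 0, 0, 1, 0, 1, 1, 1]
    print(youden_cutoff(scores, labels))
```

Because the cutoff is tuned on the same data used to report sensitivity and specificity, the resulting values describe that dataset; applying a cutoff fixed in one cohort to another cohort is a stricter test.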
Using the cutoff determined by Youden's index, the biomarker panel showed a sensitivity of 76.79%, a specificity of 80.00%, a PPV of 93.26% and an NPV of 75.77% (Fig. 7B; Table 3).

Fig. 7. The ROC analysis of the candidate biomarkers for PE prediction in the discovery (A) (suspected-PE^+ n = 47, suspected-PE^− n = 215) and validation (B) (suspected-PE^+ n = 10, suspected-PE^− n = 56) cohorts.

Table 3. ROC analysis of the potential biomarkers in PE prediction.

| Biomarker | AUC (95% CI) | Sensitivity, % | Specificity, % | PPV, % | NPV, % |
| --- | --- | --- | --- | --- | --- |
| Discovery cohort (n = 262) | | | | | |
| Gamma-glutamyl-leucine | 0.6639 (0.5771–0.7508) | 66.05 | 63.83 | 74.83 | 56.65 |
| 2-hydroxyvaleric acid | 0.6113 (0.5181–0.7045) | 57.21 | 57.45 | 59.47 | 55.50 |
| lysoPC(16:1(9Z)/0:0) | 0.6363 (0.5453–0.7273) | 60.47 | 61.70 | 82.77 | 52.16 |
| PC(DiMe(13,5)/MonoMe(13,5)) | 0.6391 (0.5530–0.7252) | 61.86 | 61.70 | 83.40 | 52.29 |
| Phenylalanyl-tryptophan | 0.6737 (0.5972–0.7503) | 63.26 | 63.83 | 90.69 | 55.60 |
| 2-Methyl-3-hydroxy-5-formylpyridine-4-carboxylate | 0.6582 (0.5715–0.7448) | 68.69 | 61.70 | 76.31 | 54.09 |
| ADP-D-glycero-beta-D-manno-heptose | 0.6038 (0.5172–0.6903) | 56.28 | 57.45 | 56.94 | 56.78 |
| Combination | 0.7539 (0.6833–0.8245) | 67.91 | 70.21 | 80.86 | 65.44 |
| Validation cohort (n = 66) | | | | | |
| Gamma-glutamyl-leucine | 0.7482 (0.5925–0.9039) | 85.71 | 60.00 | 81.70 | 61.91 |
| 2-hydroxyvaleric acid | 0.6286 (0.4035–0.8536) | 91.07 | 50.00 | 64.55 | 84.83 |
| lysoPC(16:1(9Z)/0:0) | 0.7929 (0.6272–0.9585) | 75.00 | 70.00 | 80.42 | 60.43 |
| PC(DiMe(13,5)/MonoMe(13,5)) | 0.7339 (0.5825–0.8853) | 66.07 | 60.00 | 81.7 | 61.91 |
| Phenylalanyl-tryptophan | 0.6214 (0.4518–0.7910) | 64.29 | 50.00 | 79.69 | 61.17 |
| 2-Methyl-3-hydroxy-5-formylpyridine-4-carboxylate | 0.7268 (0.4518–0.7910) | 87.50 | 60.00 | 68.63 | 82.76 |
| ADP-D-glycero-beta-D-manno-heptose | 0.6393 (0.4353–0.8433) | 53.57 | 60.00 | 76.27 | 57.01 |
| Combination | 0.8857 (0.7894–0.9820) | 76.79 | 80.00 | 93.26 | 75.77 |

PPV, positive predictive value; NPV, negative predictive value.

Discussion

Hypertension and proteinuria are critical clinical symptoms of PE, as well as major objective diagnostic indicators for this condition.
However, the presence of proteinuria or hypertension does not always lead to the development of PE, and the metabolic alterations in suspected-PE^− and suspected-PE^+ remain poorly defined. In this untargeted UHPLC-MS metabolomics study, both retrospective and prospective cohorts were included and strict inclusion criteria were used to recruit healthy controls, PE and suspected-PE patients. After metabolomic profiling and systematic comparisons, significantly different serum metabolomic patterns were observed between HC and PE groups, while the metabolic patterns were similar between suspected-PE^− and suspected-PE^+. Pathway enrichment analysis combined with metabolite-metabolite interaction network analysis revealed that altered amino acids bridged the various dysregulated metabolites during PE, and the endpoint of the interaction network mainly manifested as significant downregulation of steroid hormone metabolic pathways. Additionally, we constructed predictive models based on differential metabolites between suspected-PE^− and suspected-PE^+ using machine learning algorithms, and then developed a consensus model with satisfactory predictive ability.

In the perturbed metabolic network of PE patients, amino acids played a central role and connected the entire network. Among them, L-glutamic acid, a key regulator of glutathione metabolism, exhibited the highest degree in the metabolic network. According to previous studies^19,20, aberrant glutathione homeostasis also contributes to complications in PE. Therefore, glutathione metabolism is one of the key pathways to target in PE. In addition to amino acids, endogenous molecules including arachidonate and linoleate also served as important communication points. Over-accumulation of linoleate and arachidonate in PE indicated excessive biosynthesis of unsaturated fatty acids, leading to endothelial dysfunction due to oxidative stress and inflammation^21.
Moreover, metabolites such as taurocholic acid, glycocholic acid, estrone sulfate and dehydroepiandrosterone sulfate, which are related to cholesterol metabolism, showed the highest FC between HC and PE. Interestingly, bile acids and steroid hormones varied in opposite directions, which warrants further investigation.

Metabolomics combined with ML algorithms has been increasingly implemented for developing diagnostic models for various human diseases in recent years^22–24. Systematic metabolomic investigations revealed metabolic alterations in serum between HC and PE, as well as between suspected-PE^− and suspected-PE^+ individuals. Diagnostic models based on differential metabolites were established for PE using machine learning algorithms. For the comparison between HC and PE, the selected biomarkers showed promising results with an AUC ranging from 0.635 to 0.910 in the retrospective cohort, but further validation in a larger, multicenter population is necessary for clinical application. Currently, the sFlt-1/PlGF ratio cutoff of 38 has been employed for the short-term prediction of PE in Asian women with suspected PE with good performance, which has also been validated in our previous research (training cohort AUC = 0.637; validation cohort AUC = 0.733). Although similar metabolic profiles were found between suspected-PE^− and suspected-PE^+ in this research, a panel of seven features was screened as predictive biomarkers with higher diagnostic efficiency (training cohort AUC = 0.753; validation cohort AUC = 0.885).

Interestingly, although these 7 biomarkers have not been previously reported in metabolomics studies on PE, they are all linked to the pathogenesis of PE. Inflammation, particularly in the preeclamptic placenta, has been highlighted as a key process.
Molecules such as ADP-d-glycero-β-d-manno-heptose (ADP-heptose) and certain glycerophospholipids, including lysoPC(16:1(9Z)/0:0) and PC(DiMe(13,5)/MonoMe(13,5)), have been implicated in modulating inflammation^25–27. Meanwhile, lipid over-accumulation in maternal serum contributes to endothelial dysfunction secondary to oxidative stress, and abnormal glucose metabolism and lipid metabolism often occur in parallel^28. Moreover, increased 2-hydroxyvaleric acid, gamma-glutamyl-leucine and phenylalanyl-tryptophan were detected in suspected-PE^+ samples. 2-Hydroxyvaleric acid, a known organic acid in human fluids, has shown potential as a predictor of type 1 diabetes^29,30. Additionally, gamma-glutamyl-leucine and phenylalanyl-tryptophan are bioactive peptides associated with obesity and type 2 diabetes^31–33. As an intermediate metabolite in vitamin B6 production, 2-methyl-3-hydroxy-5-formylpyridine-4-carboxylate plays a role in regulating cellular homocysteine levels in the transsulfuration pathway and in acetylcholine-induced endothelium-dependent relaxation, thereby helping to prevent insulin resistance and vascular dysfunction^34. Lower levels of 2-methyl-3-hydroxy-5-formylpyridine-4-carboxylate in the suspected-PE^+ group indicated an increase in insulin resistance.

This study has several inevitable limitations that warrant acknowledgment. Firstly, inadequate clinical information was available