Abstract

Gestational diabetes mellitus (GDM) is a pregnancy disorder associated with short‐ and long‐term adverse outcomes in both mothers and infants. The current clinical test of blood glucose levels late in the second trimester is inadequate for early detection of GDM. Here we show the utility of Raman spectroscopy (RS) for rapid and highly sensitive maternal metabolome screening for GDM in the first trimester. Key metabolites, including phospholipids, carbohydrates, and major amino acids, were identified with RS and validated with mass spectrometry, enabling insights into associated metabolic pathway enrichment. Using classical machine learning (ML) approaches, we showed the performance of the RS metabolic model (cross‐validation AUC 0.97) surpassed that achieved with patients' clinical data alone (cross‐validation AUC 0.59) or prior studies with single biomarkers. Further, we analyzed novel proteins and identified fetuin‐A as a promising candidate for early GDM prediction. A correlation analysis showed a moderate to strong correlation between multiple metabolites and proteins, suggesting a combined protein‐metabolic analysis integrated with ML would enable a powerful screening platform for first trimester diagnosis. Our study underscores RS metabolic profiling as a cost‐effective tool that can be integrated into the current clinical workflow for accurate risk stratification of GDM and to improve both maternal and neonatal outcomes.

Keywords: first trimester, gestational diabetes, mass spectrometry, metabolism, pregnancy, Raman spectroscopy

Translational Impact Statement

The utility of Raman spectroscopy (RS) as a clinically relevant tool is demonstrated for early screening of gestational diabetes during the first trimester.
By metabolic profiling of patient plasma and combining RS data with machine learning (ML), we achieved an unprecedented accuracy of >90%, surpassing current clinical tests and prediction models. In addition to metabolites, we also identified new protein markers and showed correlations of metabolites to proteins and to patient clinical factors. Our findings suggest RS combined with protein analysis and ML may enable a transformative shift in the early and rapid detection of pregnancy disorders.

1. INTRODUCTION

Gestational diabetes mellitus (GDM), characterized by the spontaneous development of hyperglycemia during pregnancy, impacts ~14% of pregnancies globally.^1 GDM is exacerbated by risk factors including obesity, familial diabetes, advanced maternal age, and nutrient‐deficient diet.^2 Population‐wide studies report a steady increase in GDM cases annually in the United States. Further, there are significant racial and ethnic disparities in both the occurrence of GDM and the resulting adverse outcomes, which include preterm birth, small or large neonate for gestational age, and gestational hypertension.^3,4 Although lifestyle interventions such as diet and exercise, insulin therapy, and oral hypoglycemic medications are widely used to manage GDM, they have not been effective in mitigating maternal and neonatal risks.^1,5,6 The physiological risks of GDM extend beyond the gestational period and persist into later life for both mothers and children, leading to morbidities such as type 2 diabetes, obesity, and metabolic syndrome.^6,7,8

In the United States, the current clinical approach screens for GDM in the late second trimester (24–28 weeks); for GDM to be ruled out, patients must meet glycemic targets of fasting plasma glucose <92 mg/dL and either a 1‐h glucose threshold <180 mg/dL or a 2‐h threshold <153 mg/dL set by the International Association of Diabetes and Pregnancy Study Groups.^9 But this
approach is inadequate for identifying, early in the pregnancy, those at risk of developing GDM. Many risk‐scoring systems have been established based on a patient's age, body mass index (BMI), and family history of diabetes. However, the accuracy of these models has been ≤60% and improved only slightly when the history of GDM in prior pregnancies was included.^10,11 Therefore, current prediction models are often inconclusive and may not be applicable to nulliparous women.

Advances in omics approaches, including metabolomics, have led to a paradigm shift in the early diagnosis of many disorders. Since GDM reflects an extreme manifestation of metabolic alterations, recent studies have leveraged metabolomics to identify key metabolites contributing to the pathophysiology of GDM.^12 Yet the majority of these clinical studies have focused on late second and third trimester metabolites,^13,14,15 and a few on the early second trimester.^16,17 Metabolic assessment in the first trimester remains underdeveloped. This knowledge gap may arise in part because few metabolites are enriched in maternal circulation very early in the pregnancy, so their concentrations fall below the sensitivity level of current gold standard techniques (such as nuclear magnetic resonance and mass spectrometry [MS]). Further, metabolites are extremely sensitive to their environment, and complex metabolite extraction processes in poorly enriched samples can often lead to measurement inconsistencies. MS approaches are also often limited by matrix effects, in which the ionization of target analytes is altered, leading to signal suppression.^18,19 Finally, while expansive metabolomics databases are available for metabolite identification, confounding overlap with unrelated molecules can often influence the results.

Raman spectroscopy (RS) has emerged as a powerful optical technique offering a label‐free, extraction‐free, low‐cost, and rapid approach for metabolic profiling in cells, tissues, and patient biofluids.^20,21 RS measures the inelastic scattering of incident light via the molecular vibrations of the various biochemical species in a sample.^22 The unique spectra resulting from this light scattering include many classes of metabolites, such as nucleic acids, amino acids, fatty acids, and sugars, among others, with well‐established peak positions.^23 Indeed, RS has enabled early detection and treatment monitoring in multiple disease models by identifying specific spectral biomarkers. These applications include the identification of tumor margins for resection,^24,25 distinguishing various pathogen strains and assessing antimicrobial susceptibility,^26,27 diagnosis of metabolic disorders by allowing a non‐invasive alternative for glucose monitoring,^28,29,30 and identification of inflammatory processes and tissue damage characteristic of gut disorders.^31,32 Our group and others have shown that RS is highly sensitive, even at low metabolite concentrations, in the early diagnosis and prognosis of metabolically active diseases such as cancer.^33,34,35 This work also builds upon our prior studies in pregnancy disorders, in which RS data combined with clinical information were used to predict preterm birth in the first trimester^36 and for the longitudinal screening of the maternal metabolome of preeclamptic patients.^37

In this work, the strengths of RS were leveraged for maternal metabolome screening of first‐trimester plasma from pregnant patients who were later diagnosed with GDM, compared to plasma from those who had normal blood sugar levels throughout pregnancy (healthy).
Raman spectral results were integrated with unsupervised machine learning (ML) for data visualization to distinguish the two patient cohorts. We identified key metabolites that were correlated with the late second trimester blood sugar levels. Raman findings were validated with MS metabolomics, showing synergy between the two approaches in the metabolites measured, and the related enriched metabolic pathways were identified using the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. We leveraged a support vector machine (SVM) model and achieved an area under the curve (AUC) of 0.59 ± 0.12 for the clinical data alone, which include the demographics and obstetric data shown in Table 1. The Raman metabolites achieved an AUC of 0.97 ± 0.06, highlighting the ability of RS to make highly accurate predictions even with a small sample size. Protein analysis with an enzyme‐linked immunosorbent assay (ELISA) identified fetuin‐A as a promising marker that would complement metabolic profiling early in pregnancy. A correlation analysis of metabolites to clinical data and metabolites to proteins showed moderate to strong correlation, suggesting that a combined metabolic and protein screening may enable GDM risk stratification. We demonstrate that RS has the potential to complement the current clinical workflow for early assessment of the risk of pregnancy disorders.

TABLE 1. Average demographic, obstetric history, and clinical information of patients.
| Clinical parameter of patients | Healthy (mean ± SD) | GDM (mean ± SD) | p‐Value |
| Number of patients | 34 | 34 | – |
| Maternal age (years) | 31.8 ± 4.7 | 31.9 ± 5.6 | 0.463 |
| BMI (kg/m^2) | 29.5 ± 10.1 | 31.9 ± 7.8 | 0.138 |
| Gestational age (weeks) | 38.5 ± 1.8 | 38.8 ± 1.3 | 0.281 |
| Gravida | 2.5 ± 1.5 | 2.7 ± 1.7 | 0.306 |
| Parity | 1.1 ± 0.9 | 1.2 ± 1.4 | 0.46 |
| Pregnancy loss | 0.4 ± 0.7 | 0.5 ± 0.8 | 0.322 |
| Blood sugar concentration (mg/dL) | 116.0 ± 20.1 | 185.4 ± 29.2 | 1.07E−16 |
| Comorbidities (%) | 17.6 | 2.9 | 0.0234 |

Abbreviations: BMI, body mass index; GDM, gestational diabetes mellitus.

2. RESULTS AND DISCUSSIONS

In this study, first‐trimester plasma samples from n = 34 pregnant patients who were diagnosed with GDM later in the pregnancy were evaluated and compared to samples from n = 34 patients with healthy pregnancies. This research was reviewed and approved by the Institutional Review Board (IRB#200910784). The coded samples were received from the Perinatal Family Tissue Bank (PFTB) at the University of Iowa (UI), an academic biobank where biofluids from patients who provided informed consent are continuously banked for research and education purposes. Patients were not specifically recruited for this study but had previously consented to have their samples stored and made available for research purposes. First‐trimester samples are in high demand at this tissue bank, and thus we were limited in the number of samples that were available to us. Therefore, for the healthy patients' data, we used our group's previously published Raman data set^37 that was released to the Iowa State University's open data repository^38 and supplemented it with additional healthy patient samples. It is also important to note that the majority of our subjects are White (85%), which is representative of the Iowa population.
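
As a rough illustration of how a two‐group comparison like those in Table 1 can be computed from summary statistics alone, the sketch below evaluates Welch's t statistic and the Welch‐Satterthwaite degrees of freedom for the BMI row. The source does not state which statistical test produced the p‐values in Table 1, so the choice of Welch's test, the one‐sided alternative, and the normal approximation used for the p‐value are all assumptions made here for illustration; the function names are hypothetical.

```python
import math
from statistics import NormalDist


def welch_from_summary(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom,
    computed from group means, standard deviations, and group sizes."""
    var_a = sd_a ** 2 / n_a  # squared standard error of group A mean
    var_b = sd_b ** 2 / n_b  # squared standard error of group B mean
    t_stat = (mean_b - mean_a) / math.sqrt(var_a + var_b)
    dof = (var_a + var_b) ** 2 / (
        var_a ** 2 / (n_a - 1) + var_b ** 2 / (n_b - 1)
    )
    return t_stat, dof


def approx_one_sided_p(t_stat):
    """Upper-tail p-value from a normal approximation to the t distribution.
    ASSUMPTION: adequate only when the degrees of freedom are fairly large."""
    return 1.0 - NormalDist().cdf(t_stat)


# BMI row of Table 1: healthy 29.5 ± 10.1, GDM 31.9 ± 7.8, n = 34 per group
t_bmi, dof_bmi = welch_from_summary(29.5, 10.1, 34, 31.9, 7.8, 34)
p_bmi = approx_one_sided_p(t_bmi)
print(f"t = {t_bmi:.3f}, dof = {dof_bmi:.1f}, approx. one-sided p = {p_bmi:.3f}")
```

An exact p‐value would use the t distribution with the computed degrees of freedom rather than the normal approximation; the approximation is used here only to keep the sketch free of third‐party dependencies.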

Table 1 summarizes the average maternal demographic data, including maternal age, BMI, number of pregnancies (gravida), number of births (parity), pregnancy losses, gestational age at delivery, blood sugar from a glucose test taken after 24 weeks, and % comorbidities. This information for individual de‐identified patients in the two cohorts is provided in Tables S1 and S2. The average 1‐h blood glucose level for the GDM patients was 185.4 ± 29.2 mg/dL, which was significantly higher than that of the healthy patients.

The Raman spectra of the GDM and healthy patient samples were acquired with a 785 nm laser at 40 mW power at the point of excitation using a 50× objective. The measured spectra were smoothed, background corrected, normalized with the standard normal variate (SNV) method, and then ratiometrically analyzed using the 1448 cm^−1 lipid and protein‐associated peak. This peak was chosen because of its negligible variation among the samples. Complete RS methods are described in the experimental methods section and Figure S1.

The Raman spectra of healthy and GDM patients were first compared to identify changes in their metabolic profiles. Normalized representative Raman spectra of a healthy and a GDM patient are shown in Figure 1a, and the normalized Raman spectra of all healthy and GDM patients are shown in Figure S2. The differences in the RS profiles between the two cohorts (Figure 1b) highlight metabolic changes at the onset of disease. The difference spectrum was obtained by subtracting the healthy spectrum from the GDM spectrum; hence, positive values identify increases in the abundance of associated metabolites, and negative values represent decreases. Tentative Raman peaks were then assigned to the corresponding metabolites based on previous studies from our group,^36,37 as well as highly cited and established Raman references on biological samples such as tissues,