Abstract

Purpose

This study aims to identify key genes that may be involved in the pathogenesis of gestational diabetes mellitus (GDM) and to preliminarily elucidate the underlying mechanisms.

Methods

High-throughput transcriptome sequencing was used to identify differentially expressed genes (DEGs) in placental tissue samples from women with GDM and women with normal pregnancies. Functional and pathway analyses of these DEGs were conducted using bioinformatics databases. Significant DEGs were validated by real-time quantitative PCR (qRT-PCR) in conjunction with the relevant literature.

Results

Compared with the normal pregnancy group, 435 DEGs were identified in the GDM group, comprising 128 upregulated and 307 downregulated genes. GO enrichment analysis showed that the DEGs were primarily associated with biological processes such as cellular processes, biological regulation, regulation of biological processes, and response to stimuli. Cellular component enrichment analysis indicated an association with cellular anatomical entities and protein-containing complexes. Molecular function enrichment analysis highlighted roles in binding and catalytic activity. KEGG pathway enrichment analysis indicated involvement of the DEGs in the PI3K-Akt signaling pathway and ECM-receptor interaction. qRT-PCR validation of five selected DEGs showed expression trends consistent with the RNA-Seq quantification.

Conclusion

YWHAB, LEP, CCL21, PAPPA2, and SFN may be involved in the occurrence and development of GDM and may serve as potential biomarkers with value for the prediction and diagnosis of the disease.
Keywords: gestational diabetes, transcriptome sequencing, qRT-PCR validation, placental tissue, pathogenic genes

Introduction

Gestational diabetes mellitus (GDM) is any degree of glucose intolerance with onset or first recognition during pregnancy, and it is a common metabolic complication of pregnancy.^1 GDM remains a global healthcare concern, and its incidence is rising. Research indicates that GDM not only raises the risk of adverse outcomes such as preterm birth, polyhydramnios, macrosomia, and preeclampsia but also predisposes newborns to hypoglycemia, to overweight or obesity during growth, and to type 2 diabetes in later life. Early diagnosis and intervention are therefore important for averting adverse maternal and neonatal outcomes, and novel molecular biomarkers with potential diagnostic and therapeutic assessment value are needed.

GDM is closely associated with insulin resistance,^2 but it is also linked to inflammatory responses,^3 oxidative stress,^4 and genetic variation.^5,6 Changes in epigenetic factors also play an important role in the pathogenesis of GDM. Epigenetics refers to heritable changes in gene expression that occur without alteration of the nucleotide sequence, including DNA methylation, histone modification, chromatin remodeling, and non-coding RNA regulation. Epigenetic mechanisms affect gene function mainly by regulating transcription or translation. GDM is a polygenic disease, and basic research on its relationship with genetic factors and abnormal gene expression can deepen understanding of its etiology and provide new ideas and molecular targets for its prevention, diagnosis, treatment, and prognosis.
Hence, this study performed transcriptome sequencing and qRT-PCR validation on placental tissue samples from women with GDM and women with normal pregnancies to identify differentially expressed genes. Using bioinformatics analysis, we annotated the functions and pathways of these genes. The objective is to facilitate the discovery of novel molecular markers for the prevention and diagnosis of GDM and to contribute to a deeper understanding of the molecular mechanisms underlying GDM onset.

Materials and Methods

General Materials

Basic Data of Two Groups of Pregnant Women

The sample size was determined on the basis of a prior literature review. Twenty pregnant women with GDM (GDM group) and 20 women with normal pregnancies (control group, CG) who were admitted to the obstetrics departments of Tai'an Central Hospital and Tai'an Maternal and Child Health Hospital between May 1 and May 10, 2023 were enrolled, and 7 cases were randomly selected from each group. Placental tissue samples were collected for subsequent sequencing and qRT-PCR validation. All women in the GDM group met the diagnostic criteria for gestational diabetes mellitus outlined in the 2023 "Diagnosis and Treatment Guidelines for Gestational Diabetes Mellitus" of the American Diabetes Association. All participants had full-term, singleton pregnancies, and women with other pregnancy complications or comorbidities were excluded. The study was approved by the Ethics Committee of Tai'an Central Hospital, all participants signed an informed consent form, and the research complies with the Declaration of Helsinki. The following basic data were recorded for both groups: age, gestational age, pre-delivery BMI, glycosylated hemoglobin, and blood glucose in the glucose tolerance test (fasting, 1 hour, and 2 hours).

Placental Specimen Collection

Specimens were collected within 5 minutes of the placenta being delivered.
The specific procedure was as follows: on the fetal side of the placenta, avoiding calcified areas, several pieces of tissue of approximately 1 cm³ each were excised from different regions. The tissue samples were rinsed twice with PBS, blotted dry with filter paper, placed in cryovials, and snap-frozen in liquid nitrogen. They were then stored in a −80 °C freezer.

Research Methods

RNA Extraction and Quality Assessment

Total RNA was extracted from the samples using the Tianmo#TR205-200 kit, strictly following the manufacturer's standard operating procedure. RNA integrity was assessed using an Agilent Bioanalyzer 2100 (Agilent Technologies, Santa Clara, CA, US). The concentration and purity of the total RNA were determined using the Qubit® 3.0 Fluorometer (Life Technologies, CA, USA) and the Nanodrop One spectrophotometer (Thermo Fisher Scientific Inc, USA).

RNA Library Construction

RNA library construction followed the standard experimental workflow: mRNA isolation, mRNA fragmentation, first-strand cDNA synthesis, second-strand cDNA synthesis, end repair, 3ʹ end adenylation, adapter ligation, size selection, and library enrichment. Paired-end libraries were prepared using the TruSeq™ RNA Sample Preparation Kit (Illumina, USA) following the TruSeq™ RNA Sample Preparation Guide. The concentration of the constructed library was measured using the Qubit® 3.0 Fluorometer, and the library size was confirmed using the Agilent 2100. Libraries that met the standard criteria proceeded to sequencing.

Sequencing

Cluster Generation

Cluster generation and hybridization with the first sequencing primer were performed on the cBot of the Illumina NovaSeq 6000 (Illumina, USA) sequencer, following the standard procedure outlined in the cBot User Guide.
Sequencing Process

The sequencing reagents were prepared in advance according to the Illumina NovaSeq 6000 User Guide, and the flow cell carrying the clusters was loaded onto the sequencer. Paired-end (PE) sequencing was selected. The sequencing run was controlled entirely by Illumina's data collection software, which also performed real-time analysis of the sequencing results.

Pathogenic genes are genes that can cause disease. These genes have normal physiological functions under normal circumstances, but in some cases mutations occur and lead to disease. In this project, the criteria for selecting differentially expressed genes were initially a fold change (FC) of ≥ 2 or ≤ 0.5 with a q-value < 0.05. Because this yielded too few differentially expressed genes, the criteria were subsequently relaxed to FC ≥ 1.5 and p-value < 0.05.

Functional Annotation and KEGG Pathway Enrichment Analysis of Differentially Expressed Genes

To clarify the biological functions of the genes and the signaling pathways involved, we annotated each gene using the Gene Ontology (GO) and KEGG databases. Enrichment was calculated using Fisher's exact test. We then conducted GO and pathway enrichment analysis of the differentially expressed genes. The principle is to map the differentially expressed genes to GO and KEGG database entries, count the genes in each GO term and pathway, and apply a hypergeometric test to select the GO and KEGG entries that are significantly enriched in the differentially expressed genes.
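The hypergeometric test described above can be illustrated with a short sketch. It computes the one-sided enrichment p-value for a single GO or KEGG term (equivalent to a one-sided Fisher's exact test) using only the Python standard library. The function name, the argument names, and the counts in the usage lines are illustrative assumptions and are not taken from the study's pipeline or data.

```python
from math import comb


def enrichment_p_value(k: int, n: int, K: int, N: int) -> float:
    """One-sided hypergeometric p-value P(X >= k) for one term.

    N: annotated background genes
    K: background genes annotated to the term
    n: annotated differentially expressed genes (DEGs)
    k: DEGs annotated to the term
    """
    if not (0 <= K <= N and 0 <= n <= N and 0 <= k <= min(n, K)):
        raise ValueError("counts are inconsistent")
    # Number of ways to draw n genes from the background.
    total = comb(N, n)
    # Sum the probability mass for k, k+1, ..., min(n, K) term genes in the draw.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(n, K) + 1))
    return tail / total


# Hypothetical counts for a single term (not from the study's data).
p = enrichment_p_value(k=12, n=435, K=350, N=20000)
print(f"p = {p:.3g}")
```

Exact integer arithmetic is used here for clarity; in practice, one such p-value is computed per term and the full set is then corrected for multiple testing.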
After the p-values were corrected for multiple hypothesis testing, a corrected p-value of 0.05 was taken as the threshold, and GO and KEGG terms meeting this condition were defined as significantly enriched in the target genes. The Rich Factor was calculated as:

Rich Factor = (diff_gene_in_this_pathway / diff_gene_in_all_pathway) / (all_gene_in_this_pathway / all_gene_in_all_pathway)

Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses were carried out with the clusterProfiler package in R, and the main biological functions and signalling pathways related to the differentially expressed genes were identified.

qRT-PCR Validation

To validate the reliability of the RNA-Seq analysis, differentially expressed genes between the GDM group and the control group were obtained using FC ≥ 1.5 and P < 0.05 as screening criteria. The insulin-related molecules LEP and PAPPA2, the immune-related chemokine CCL21, the metabolism-related molecule YWHAB, and the partner protein SFN were then selected for qRT-PCR validation. qRT-PCR was performed with Power SYBR® Green PCR Master Mix (Applied Biosystems), using GAPDH as the internal reference gene. The relative expression levels of the selected genes were analysed using the 2^−ΔΔCt method. All samples were analysed in triplicate.

(1) Design and Synthesis of qRT-PCR Primers

Quantitative PCR primers were designed using Primer Premier 6.0 and Beacon Designer 7.8 software and synthesized by Shenggong Bioengineering (Shanghai) Co., Ltd. The primer sequences are listed in Table 1.

Table 1.
qRT-PCR Primers Information

Gene | GenBank Accession | Primer Sequences (5ʹ to 3ʹ) | Size (bp) | Annealing (°C)
GAPDH | NM_002046.5 | F: CATGACAACTTTGGTATCGTGGAA; R: GGCCATCACGCCACAGTTTC | 107 | 60
LEP | NM_000230.2 | F: ACCAGACACTGGCAGTCTACCAA; R: CGTGAAGAAGATCCCGGAGGTT | 103 | 60
PAPPA2 | NM_020318.3 | F: CTATCCACTGGCTGCCTCCTAT; R: GGTCCAATAACCTGAGGAGTCAC | 149 | 60
YWHAB | NM_003404.5 | F: CTGAAGTGGCATCTGGAGACAAC; R: GACGAATTGGGTGTGTAGGCTG | 114 | 60
SFN | NM_006142.5 | F: GCTGGACAGCCACCTCATCAA; R: GGCTGAGTCAATGATGCGCTTC | 136 | 60
CCL21 | NM_002989.4 | F: AGGGGCTCAGGACTGTTG; R: CTGGGATGGAGCAGCCTAA | 104 | 60

(2) Real-Time PCR Amplification System and Reaction Conditions (Table 2)

Table 2. Quantitative PCR Reaction System and Conditions

Component (20 μL system) | Volume
SDW | 8.0 μL
Power SYBR® Green Master Mix | 10.0 μL
Forward Primer (10 μM) | 0.5 μL
Reverse Primer (10 μM) | 0.5 μL
cDNA | 1.0 μL

Notes: Reaction conditions: 95 °C for 1 minute; 40 cycles of 95 °C for 15 sec and 63 °C for 25 sec (with fluorescence collection); melting curve from 55 to 95 °C.

Statistical Analysis

IBM SPSS 29.0 and GraphPad Prism 9.5.0 were used for statistical analysis. Measurement data are expressed as mean ± standard deviation, and differences between groups were analyzed using the t-test. P < 0.05 was considered statistically significant. The relative expression levels of the differentially expressed genes (DEGs) detected by qRT-PCR in both groups were calculated using the 2^−ΔΔCt method.

Results

Comparison of Basic Data Between the Two Groups of Pregnant Women (Table 3)

Table 3.
Comparison of Basic Data Between Two Groups of Pregnant Women (mean ± SD)

Basic Information | GDM Group (n=14) | Control Group (n=14) | t-value | P-value
Age (years) | 27.60±3.34 | 27.20±2.39 | 0.31 | 0.49
Gestational Age (weeks) | 38.64±0.90 | 39.13±0.97 | −1.17 | 0.88
Pre-Delivery BMI (kg/m²) | 28.20±2.48 | 26.99±1.73 | 1.26 | 0.13
Glycosylated Hemoglobin (%) | 4.76±0.96 | 4.75±0.88 | 0.24 | 0.59
Fasting Blood Glucose for Glucose Tolerance (mmol/L) | 4.26±0.45 | 5.48±0.23 | −7.67 | 0.01
Blood Glucose 1 hour After Glucose Tolerance (mmol/L) | 7.49±1.23 | 10.72±0.17 | −8.23 | 0.00
Blood Glucose 2 hours After Glucose Tolerance (mmol/L) | 6.56±1.61 | 8.11±0.70 | −2.79 | 0.05
Primipara | 4 | 4 | 0.00 | 1.00
Multipara | 6 | 6 | |
Cesarean section delivery | 6 | 5 | 0.20 | 0.65
Vaginal delivery | 4 | 5 | |

Data Quality Control

Sequencing Results Quality Assessment (Table 4)

Table 4. Quality Control Results of Sequencing Data

Sample Name | Data Output (Gb) | Q30 (%)
GDM_8 | 6.55 | 91.84
GDM_9 | 8.4 | 92.02
GDM_10 | 6.72 | 91.71
GDM_11 | 6.33 | 91.76
GDM_12 | 6.65 | 91.60
GDM_13 | 6.61 | 91.97
GDM_14 | 5.92 | 91.92
CG_8 | 6.71 | 91.91
CG_9 | 6.35 | 91.69
CG_10 | 5.12 | 91.62
CG_11 | 6.52 | 91.98
CG_12 | 6.78 | 91.95
CG_13 | 6.48 | 91.71
CG_14 | 6.75 | 92.42

Notes: Q20 = (bases with Q ≥ 20)/(all sequenced bases); Q30 = (bases with Q ≥ 30)/(all sequenced bases). Q30 is one order of magnitude stricter than Q20.

The quality control criteria are an output of approximately 5–8 Gb per sample and a proportion of bases with quality of at least 20 (Q20) of no less than 90%. The sequencing results meet these standards and are suitable for subsequent data analysis.

As shown in Table 5, the RNA Integrity Number (RIN) was equal to or greater than 6.0 and the 28S/18S ratio was equal to or greater than 0.7. There was no contamination from eukaryotic or prokaryotic organisms, and no DNA or protein contamination was detected. In addition, there was no excess of 5S rRNA, demonstrating the good integrity of the RNA.
This high-quality RNA can be used for transcriptome sequencing and real-time quantitative PCR (qRT-PCR) validation experiments.

Table 5. Electrophoresis Quality Inspection Results

Serial Number | Sample Name | Concentration (ng/µL) | Volume (µL) | Total Amount (µg) | A260/A280 | A260/A230 | RIN (electrophoresis) | 28S/18S
001 | GDM_8 | 192.4 | 45 | 8.66 | 2.03 | 1.98 | N/A | 2.7
002 | GDM_9 | 111.0 | 45 | 5.00 | 2.00 | 1.95 | N/A | 1.6
003 | GDM_10 | 283.2 | 45 | 12.75 | 2.02 | 2.10 | 7.2 | 1.5
004 | GDM_11 | 170.4 | 45 | 7.67 | 2.01 | 2.02 | 7.1 | 1.8
005 | GDM_12 | 237.9 | 45 | 10.71 | 2.01 | 1.85 | 8.3 | 2.6
006 | GDM_13 | 376.3 | 45 | 16.93 | 2.05 | 2.02 | 7.8 | 2.6
007 | GDM_14 | 137.8 | 45 | 6.20 | 2.01 | 1.78 | 6.5 | 1.8
008 | CG_8 | 265.5 | 45 | 11.95 | 2.04 | 2.09 | 6.4 | 1.7
009 | CG_9 | 154.2 | 45 | 6.94 | 2.01 | 1.91 | 6.4 | 1.8
010 | CG_10 | 14.8 | 45 | 0.67 | 1.97 | 1.93 | 6.8 | 1.2
011 | CG_11 | 194.8 | 45 | 8.76 | 2.03 | 1.73 | 6.5 | 3.0
012 | CG_12 | 53.4 | 45 | 2.40 | 1.97 | 1.80 | 7.1 | 1.2
013 | CG_13 | 250.9 | 45 | 11.29 | 2.00 | 1.74 | 7.2 | 3.3
014 | CG_14 | 163.9 | 45 | 7.38 | 2.05 | 1.89 | 8.3 | 2.6

Sequence Filtering and Statistics

The sequences were filtered using the fastp software. The filtering steps were as follows (Table 6):

1. Remove sequencing primer and adapter sequences present in reads.
2. Trim bases at the 3ʹ end of reads with a quality score (Q) lower than 20.
3. Discard reads shorter than 25 bases.
4. Exclude ribosomal RNA reads of the sequenced species.

Table 6.
Sequencing Quality and Reads

Sample | Total Reads | Clean Reads | Clean Ratio | No rRNA | rRNA Ratio | No rRNA pair
CG_10 | 34102556 | 33628576 | 0.986101335 | 33448810 | 0.005345632 | 33448810
CG_11 | 43473916 | 42903082 | 0.986869506 | 41935970 | 0.022541784 | 41935970
CG_12 | 45233032 | 44636974 | 0.986822506 | 43857798 | 0.017455843 | 43857798
CG_13 | 43198972 | 42650488 | 0.987303309 | 41725764 | 0.02168144 | 41725764
CG_14 | 44974454 | 44471890 | 0.988825568 | 44001316 | 0.010581381 | 44001316
CG_8 | 44749600 | 44168876 | 0.987022811 | 43604974 | 0.012766954 | 43604974
CG_9 | 42364114 | 41753164 | 0.985578596 | 41235768 | 0.012391779 | 41235768
GDM_10 | 44771086 | 44119146 | 0.98543837 | 43105894 | 0.022966265 | 43105894
GDM_11 | 42189878 | 41637940 | 0.986917763 | 41344190 | 0.007054864 | 41344190
GDM_12 | 44307952 | 43671392 | 0.985633279 | 42577230 | 0.025054434 | 42577230
GDM_13 | 44087330 | 43553538 | 0.987892394 | 42771136 | 0.017964143 | 42771136
GDM_14 | 39480014 | 38932322 | 0.986127361 | 37972786 | 0.024646257 | 37972786
GDM_8 | 43634388 | 43060950 | 0.986858118 | 42310460 | 0.017428552 | 42310460
GDM_9 | 56020426 | 55306228 | 0.987251114 | 54713688 | 0.010713802 | 54713688

Notes: "Clean Reads" is the number of sequences retained after processing. "Clean Ratio" is calculated as (clean reads)/(raw reads), that is, the fraction of sequences retained after processing. The last three columns describe the rRNA removal.

Genome Alignment

Sequence alignment in this project was performed with HISAT2 against the GRCh38.102 reference genome (Table 7).

Table 7.
Alignment of the GDM and CG Groups to the Reference Genome

Sample | Total_Reads | Mapped_Reads | Pair_Mapped_Reads | Single_Mapped_Reads | Mapped_Ratio
CG_10 | 33448810 | 31571591 | 30460518 | 1111073 | 0.943877854
CG_11 | 41935970 | 39749780 | 38485912 | 1263868 | 0.947868381
CG_12 | 43857798 | 41424195 | 40103308 | 1320887 | 0.94451151
CG_13 | 41725764 | 39570933 | 38243076 | 1327857 | 0.948357303
CG_14 | 44001316 | 41789439 | 40493680 | 1295759 | 0.949731572
CG_8 | 43604974 | 41472579 | 40152882 | 1319697 | 0.951097437
CG_9 | 41235768 | 39161213 | 37826266 | 1334947 | 0.9496904
GDM_10 | 43105894 | 40803665 | 39375046 | 1428619 | 0.946591318
GDM_11 | 41344190 | 39250888 | 37874624 | 1376264 | 0.949368896
GDM_12 | 42577230 | 40150704 | 38834442 | 1316262 | 0.943008834
GDM_13 | 42771136 | 40684415 | 39246770 | 1437645 | 0.951211934
GDM_14 | 37972786 | 36042481 | 34880220 | 1162261 | 0.949166095
GDM_8 | 42310460 | 40187587 | 38891976 | 1295611 | 0.949826284
GDM_9 | 54713688 | 51897069 | 50170794 | 1726275 | 0.948520761

Notes: Total reads are the clean reads. Mapped reads is the number of reads successfully aligned to the reference genome. Mapped ratio is the proportion of total reads that were successfully aligned. Pair/single mapped reads are the numbers of reads that aligned to the genome as pairs or as single reads.

Results of Differential Gene Selection

Sample Relationships and Principal Component Analysis

In Figure 1A, the depth of color represents the magnitude of the correlation coefficient. The closer the correlation coefficient is to 1, the higher the similarity between samples. Figure 1B shows the relationships and variation among all samples as computed by principal component analysis. (The raw data for Figure 1B can be found in Supplementary file Excel 1 (A).)

Figure 1. Relationship between Samples and Principal Component Analysis. (A) Relationship diagram between samples. (B) Principal Component Analysis Chart.
(The raw data for Figure 1B can be found in Excel 1 (B).)

Visualization of Differential Gene Expression

In the volcano plot (Figure 2A), upregulated genes are shown in red and downregulated genes in blue. Differentially expressed genes are distributed on both sides of the plot, supporting the reliability of the sequencing results. In the heatmap (Figure 2B), each row represents a gene and each column represents a sample. A total of 435 differentially expressed genes were identified between the GDM group and the control group, of which 128 were upregulated and 307 were downregulated.

Figure 2. Comparative transcriptome analysis of placental tissue between GDM and normal pregnant women. (A) Volcano plot of differential genes. (B) Heatmap of differential genes. (C) GO enrichment map of differential genes in the GDM and CG groups (top 30). (D) KEGG enrichment map of differential genes between the GDM and CG groups (top 30).

Enrichment Analysis of Differentially Expressed Genes in the GDM and Control Groups

As shown in Figure 2C, the genes differentially expressed between the GDM group and the control group are primarily enriched in biological processes such as cellular processes, biological regulation, regulation of biological processes, and response to stimuli. The main enriched cellular components are cellular anatomical entities and protein-containing complexes, and the main enriched molecular functions are binding and catalytic activity. Figure 2D shows that the differentially expressed genes are primarily enriched in the KEGG pathways PI3K-Akt signaling pathway, ECM-receptor interaction, Staphylococcus aureus infection, and protein digestion and absorption.
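Enrichment maps of this kind are commonly ordered by the Rich Factor defined in the Methods, that is, the fraction of differentially expressed genes falling in a pathway divided by the fraction of all annotated genes falling in that pathway. The following minimal sketch shows the calculation and a ranking step. The function name, the pathway labels, and every count are hypothetical and are not taken from the study's data.

```python
def rich_factor(deg_in_term: int, deg_total: int,
                bg_in_term: int, bg_total: int) -> float:
    """Rich Factor = (DEG fraction in the term) / (background fraction in the term)."""
    if min(deg_total, bg_in_term, bg_total) <= 0:
        raise ValueError("denominators must be positive")
    return (deg_in_term / deg_total) / (bg_in_term / bg_total)


# Hypothetical inputs: term -> (DEGs in term, background genes in term).
terms = {"pathway_A": (12, 350), "pathway_B": (5, 400)}
deg_total, bg_total = 435, 20000  # illustrative totals of annotated genes

# Rank terms from most to least enriched.
ranked = sorted(
    ((name, rich_factor(k, deg_total, K, bg_total)) for name, (k, K) in terms.items()),
    key=lambda item: item[1],
    reverse=True,
)
for name, rf in ranked:
    print(f"{name}: {rf:.2f}")
```

A Rich Factor above 1 means the term is over-represented among the differentially expressed genes relative to the background. It describes effect size only, so it is read together with the corrected p-value.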
qRT-PCR Validation Results

YWHAB, LEP, CCL21, PAPPA2, and SFN, which were differentially expressed in the transcriptome sequencing, were selected for qPCR validation with GAPDH as the reference gene. The expression trends of these five genes in placental tissue from women with gestational diabetes and women with normal pregnancies were consistent with the transcriptome sequencing results (Figure 3). This supports the accuracy of the transcriptome sequencing results in this study and the reliability of the core candidate genes selected. These genes can be used for subsequent functional validation and the development of molecular markers.

Figure 3. qRT-PCR verification of differentially expressed genes. *P < 0.05, **P < 0.01, and ***P < 0.001 versus the Control group.

Discussion

In this study, high-throughput transcriptome sequencing and real-time quantitative PCR (qRT-PCR) validation were performed on placental tissue samples from women with GDM and women with normal pregnancies. The results suggest that YWHAB, LEP, CCL21, PAPPA2, and SFN may be potential biomarkers for the diagnosis of GDM, which provides new insights into the search for diagnostic and therapeutic biomarkers for the disease.

Pregnancy-associated plasma protein A2 (PAPPA2) is a highly specific metalloproteinase that proteolytically cleaves insulin-like growth factor binding proteins 3 (IGFBP-3) and 5 (IGFBP-5), releasing them from the ternary complex and thereby modulating the release of insulin-like growth factor I (IGF-I) from binary and ternary complexes.^7 PAPPA2 is a key regulator of free IGF-I, and a decrease in serum free IGF-I is associated with insulin resistance, which suggests that a lack of PAPPA2 affects glucose and insulin metabolism. Research on the effects of PAPPA2 deficiency on glucose metabolism is ongoing and has yielded conflicting results.
Martín-Rivada et al^7 treated two Spanish patients with PAPPA2 deficiency with recombinant human IGF-1 (rhIGF-1) and found that both patients exhibited glucose intolerance during the sixth year of treatment. Insulin tolerance tests showed increased insulin resistance in these patients.^7 In contrast, Christians et al^8 reported that PAPPA2 deficiency disrupts the balance between circulating IGF-I, IGFBP-3, and IGFBP-5 but does not affect glucose metabolism on either a standard or a high-fat diet.^8 The relationship between PAPPA2 and GDM has not yet been confirmed. In our experiment, PAPPA2 was highly expressed in the placental tissue of women with GDM, a finding that offers a new direction for the diagnosis and treatment of the disease.

Tyrosine 3-monooxygenase/tryptophan 5-monooxygenase activation protein beta (YWHAB) is a novel interaction factor for the glucagon receptor (GCGR) and significantly inhibits glucagon-stimulated glucose production in vitro. Hwang et al^9 found that YWHAB is involved in hepatic gluconeogenesis through glucocorticoid receptor-mediated signaling, leading to increased hepatic glucose production.^9 Feng et al,^10 using Western blot, qRT-PCR, and CCK-8 experiments, observed that YWHAB expression was upregulated with increasing glucose concentrations (p < 0.05).^10 Ji et al^11 found that loss of YWHAB in primary mouse hepatocytes increased glucose production by enhancing the action of glucagon.^11 However, there are few studies on YWHAB and gestational diabetes. In this study, we found for the first time that YWHAB is highly expressed in placental tissue from GDM patients, which gives it potential diagnostic value for the disease.
Leptin (LEP), a neuroendocrine hormone involved in food intake, energy expenditure, reproduction, and metabolic disorders, directly affects insulin sensitivity and insulin secretion. Excessive leptin production can enhance the synthesis of pro-inflammatory cytokines and thereby affect insulin resistance. In a retrospective single-center cohort study, Kapustin et al^12 observed increased leptin expression in the placental tissue of GDM patients who did not receive preconception care and of those who received insulin treatment.^12 In another retrospective cohort study, West et al^13 found that leptin levels were significantly higher in offspring aged 6–13 years born to 99 mothers with GDM than in offspring of 422 mothers without GDM.^13 The present study identified a significant upregulation of LEP in the placental tissue of women with GDM, consistent with these reports.

Stratifin (SFN) is a highly conserved member of the soluble acidic 14-3-3 protein family and belongs to the sigma subtype. The 14-3-3 proteins are essential regulatory proteins, and their abnormal expression has been shown to lead to adverse immune reactions and chronic inflammation.^14 As a tissue kallikrein-binding protein, SFN has been shown to regulate inflammatory responses in various diseases, including autoimmune and respiratory diseases.^15 Wang et al^15 observed a significant increase in SFN expression in patients with acute kidney injury (AKI) and in mice with cisplatin-induced or ischemia-reperfusion-induced AKI.
In proximal tubular epithelial cells treated with cisplatin or subjected to hypoxia/reoxygenation (H/R), the researchers found that silencing SFN effectively alleviated renal injury and inflammation.^15 Arakawa et al^16 found a significant increase in serum SFN levels in patients with severe COVID-19 compared with those with mild or moderate symptoms.^16 The present study found that SFN was downregulated in the placental tissue of women with GDM compared with women with normal pregnancies. However, the relationship between SFN and GDM is still unclear and requires further clinical and molecular experiments to verify. In addition, because SFN is a partner protein, future research can identify the molecules it regulates through protein-protein interaction studies.

Recent studies have confirmed an association between C-C motif chemokine ligand 21 (CCL21) and chronic inflammation. By binding to CCR7, CCL21 activates immune cells such as dendritic cells and T cells and directs their migration to peripheral tissues, thereby triggering an inflammatory response. Van et al^17 observed elevated expression of the CCR7 ligand CCL21 in samples from patients with rheumatoid arthritis (RA). In a meta-analysis, Li et al^18 found a significant correlation between the CCL21 rs2812378G>A polymorphism and rheumatoid arthritis risk in dominant, recessive, and heterozygous models. Further analysis showed that CCL21 could increase the risk of rheumatoid arthritis in the overall population and in the Caucasian population.^18 The role of CCL21 in gestational diabetes has not previously been reported. In this study, we found for the first time that CCL21 is highly expressed in the placental tissue of women with GDM, which may be relevant to the diagnosis of the disease.

This study has certain limitations.
First, the sample size is relatively small, and repeated multicenter experiments with larger samples are needed to characterize the gene expression profile of placental tissue in women with GDM more accurately. Second, potential confounding factors were not considered or corrected for when the placental specimens were collected, so their impact on differential gene expression and on the occurrence of GDM is unknown. Genetic factors are important confounders of the association between gene expression and GDM: differences in genetic background may lead to varying susceptibility to GDM and thereby affect gene expression. Environmental factors such as diet, lifestyle, and medication use can also affect gene expression and thereby influence the occurrence of GDM. Pathophysiologically, GDM is characterized by chronic low-grade inflammation. Inflammatory factors may exacerbate insulin resistance through multiple mechanisms and participate in the pathogenesis of GDM, so the inflammatory response is also an important confounder. Other chronic diseases such as hypertension may indirectly affect the occurrence of GDM by altering gene expression, and the complex interactions between these diseases and GDM may introduce further confounding. Understanding these confounding factors and their mechanisms would allow a more comprehensive evaluation of the relationship between gene expression and GDM and, in turn, more accurate diagnosis and treatment. In addition, the key pathogenic genes identified in this study have not yet been functionally validated at the cellular level, and their molecular mechanisms remain to be explored.
Therefore, the use of YWHAB, LEP, CCL21, PAPPA2, and SFN as clinical diagnostic markers for GDM still has certain limitations.

The main innovation of this study is the use of high-throughput transcriptome sequencing to screen for differentially expressed genes in the placental tissue of women with GDM, to discover previously unknown genes highly correlated with GDM, and to elucidate the main biological functions and possible signaling pathways of these genes. In the future, our research team will screen for pathogenic genes that may be related to GDM and explore the relationship between changes in their expression levels and the onset of GDM through RT-PCR, Western blot, and related cell function experiments. Placental specimens will be collected from patients undergoing cesarean section in our hospital, and HTR-8/SVneo cells will be obtained by enzyme digestion combined with the tissue adhesion method and then subcultured. The morphology and immunophenotype of the passaged cells will be analyzed and identified. By upregulating and inhibiting the expression of the differential genes in human chorionic trophoblast cells (HTR-8/SVneo), we aim to investigate their effects on the proliferation, apoptosis, and migration of placental trophoblast cells and to explore their possible mechanisms of action in GDM.

Conclusions

In summary, we found that the mechanism of GDM may be related to the abnormal expression of YWHAB, LEP, CCL21, PAPPA2, and SFN. This study did not explore in depth the specific mechanisms by which the differential genes are involved in disease progression. Further animal and cell experiments will be conducted to extend these findings and provide a richer theoretical basis for the diagnosis and clinical treatment of GDM.

Funding Statement

This study was supported by the Science and Technology Development Plan Project of Tai'an City of China, No. 2022NS289 (to SY).
The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Disclosure

The author(s) report no conflicts of interest in this work.

References