Abstract Preeclampsia (PE) is the serious obstetric-related disease characterized by newly onset hypertension and causes damage to the kidneys, brain, liver, and more. To investigate genes with key roles in PE’s pathogenesis and their contributions, we used a microarray dataset of normotensive and PE patients and conducted a weighted gene co-expression network analysis (WGCNA). Cyan and magenta modules that are highly enriched with differentially expressed genes (DEGs) were revealed. By using the molecular complex detection (MCODE) algorithm, we identified five significant clusters in the cyan module protein–protein interaction (PPI) network and nine significant clusters in the magenta module PPI network. Our analyses indicated that (i) human accelerated region (HAR) genes are enriched in the magenta-associated C6 cluster, and (ii) positive selection (PS) genes are enriched in the cyan-associated C3 and C5 clusters. We propose these enriched HAR and PS genes, i.e., EIF4E, EIF5, EIF3M, DDX17, SRSF11, PSPC1, SUMO1, CAPZA1, PSMD14, and MNAT1, including highly connected hub genes, HNRNPA1, RBMX, PRKDC, and RANBP2, as candidate key genes for PE’s pathogenesis. A further clarification of the functions of these PPI clusters and key enriched genes will contribute to the discovery of diagnostic biomarkers for PE and therapeutic intervention targets. Keywords: preeclampsia, evolutionary analysis, human accelerated region, positive selection, pathway analysis 1. Introduction Preeclampsia (PE) is a serious complication of pregnancy that affects an estimated 2–8% of pregnancies worldwide [[30]1,[31]2,[32]3]. It is characterized by newly onset hypertension, often proteinuria and causes damage to multiple organs, especially the kidneys, brain, and liver [[33]4]. Pathologic changes in women who experience PE and in their fetuses have been reported to lead to higher risks of metabolic, cardiovascular, and renal diseases later in life [[34]5,[35]6,[36]7,[37]8,[38]9,[39]10,[40]11,[41]12]. The precise pathogenesis of PE is not yet known, and the only treatment that is currently available for PE is simply the delivery of the fetus [[42]4,[43]13,[44]14,[45]15]. Microarray analyses can identify differentially expressed genes (DEGs) in investigations of the pathogenesis of PE. For example, a microarray analysis revealed that the genes FLT1 (fms-like tyrosine kinase 1) and ENG (endoglin) are involved in the pathogenesis of PE [[46]16,[47]17]. However, there is little consensus among several microarray studies, and the DEGs for PE have not been fully defined [[48]18,[49]19]. Microarray analyses have provided a foundation for understanding PE’s pathogenesis, but a system-level analysis that focuses on gene sets could be more informative. A weighted gene co-expression network analysis (WGCNA) focuses on sets of genes that are not among those identified in observed gene expression data [[50]20]. With a WGCNA (which is network-based), critical gene modules and co-expression networks can be screened in datasets (such as microarray and RNAseq datasets). Microarray analyses using a large number of samples of decidual tissues have been used to investigate the pathogeneses of PE [[51]21], and several researches using WGCNA analysis could be found [[52]22,[53]23,[54]24], however, the exact pathogenesis of PE remains unknown. Comparative evolutionary analyses of human genes have also contributed to the discovery of key disease genes [[55]25]. For example, although Alzheimer’s disease (AD) is common in humans, it is extremely rare in other mammals, which suggests the possibility that genetic changes that have taken place over the course of evolution are related to humans’ vulnerability to this disease. Bufill et al. [[56]26] proposed that an evolutionary approach can be used to combine data from different disciplines that appear to be unrelated in a manner that could clarify complex diseases. The insights obtained from evolutionary analysis of human genes have already proven their value for the discovery of key disease genes [[57]25,[58]27]. PE does not seem to occur in species other than humans [[59]4], and we thus speculated that approaching PE from an evolutionary standpoint could help identify the cause(s) of PE. To gain in-depth insights into the pathogenesis of PE, we conducted a WGCNA in the present study to determine the status of PE-related gene modules, and we performed a system-level analysis by integrating a protein–protein interaction (PPI) network analysis, a sub-network cluster analysis, and a gene ontology (GO) and reactome pathway enrichment analysis. We applied an evolutionary approach and evolutionary data for both human accelerated region (HAR) and positive selection (PS) genes. 2. Materials and Methods 2.1. Microarray Data We downloaded the microarray data of Yong et al. ([60]GSE60438) from the Gene Expression Omnibus (GEO) database ([61]https://www.ncbi.nlm.nih.gov/gds (accessed on 4 April 2018)). The [62]GSE60438 data, based on the platform of the Illumina Human WG-6 v3.0 Expression BeadChip, contains 23 decidual basalis samples from normotensive samples and 25 from PE samples in 34 PE/eclampsia pedigrees. The mean gestational age (weeks) of normotensive and PE patients are 39.0 ± 0.859 and 32.0 ± 3.48, respectively, and the mean maternal age (years) of normotensive and PE patients are 31.4 ± 4.34 and 30.2 ± 4.71, respectively. 2.2. Differentially Expressed Gene Screening To screen for DEGs, we first normalized the raw signal intensities by using Chipster software [[63]28]. We conducted a significance analysis of microarrays in order to identify DEGs; we calculated the fold changes (FCs) in the individual genes’ expressions, and those with a p-value < 0.05 and |log[2]FC| > 1.5 were considered significant. We observed significant differences in gene expressions between the normotensive and PE patients. For each individual gene, as a measure of the gene’s differential expression between the normotensive and the PE patients, we calculated the gene significance (GS) as the −log[10] of the p-value obtained by Student’s t-test. We also determined the module significance (MS) as the mean GS value for all of the module genes. 2.3. WGCNA In order to identify co-expression modules associated with PE and their key genes, we performed a WGCNA [[64]20]. The WGCNA package implemented in the R program was used to build an unsigned weighted gene co-expression network based on the expression value of 48 microarray datasets, and the WGCNA package’s blockwiseModules function was applied to build the network. The similarity matrix between each pair of genes across all samples was calculated based on its Pearson’s correlation value. Then, the similarity matrix was transformed into an adjacency matrix. Afterward, the ‘topological overlap matrix (TOM)’ and the ‘corresponding dissimilarity (1-TOM)’ value were calculated. Finally, a dynamic tree cut algorithm was employed to detect gene co-expression modules. We set the minimum module size of 30 genes to obtain appropriate modules, and for merging modules, we used 0.25 as the minimum height (default settings). The WGCNA results revealed 21 modules, each of which can be summarized by a single representative expression profile (referred to hereafter as the ‘module eigengene (ME)’). The ME of a module is defined as the first principal component of the module and is a comprehensive representation of the relationships among the gene co-expressions in each network. For our assessment of the relationship between each of the modules and PE, we determined the correlations of the eigengenes in various PE samples. The module eigengene-based intramodular connectivity measure kME roughly approximates the standard intramodular connectivity (kIN). The kME value is the Pearson’s correlation between a given ME and a specific gene’s expression level. The kME (also known as module membership) was calculated to identify the genes that are significantly connected in the module. 2.4. Gene Annotation and Enrichment Analysis We used the Cytoscape plugin ‘ClueGO’ to identify the gene ontology (GO) terms that are significantly associated with the modules revealed by the WGCNA [[65]29]. A GO term analysis was used to identify molecular functions and biological processes, with the 100 top-ranked genes vis-à-vis the kME value of the modules revealed by the WGCNA. The ClueGO parameters were as follows: (i) a minimum GO-level interval of 3 and a maximum interval of 8, (ii) a minimum of three genes per category, (iii) 4% gene association, and (iv) a kappa score threshold ≥ 0.4. The statistical analyses included Bonferroni step-down false discovery rate (FDR) correction for enrichment against the ClueGO human reference genome [[66]30]. Probability values < 0.05 were considered significant. To investigate gene clusters from a human PPI network analysis, we conducted a reactome pathway analysis and an analysis of GO term enrichment, using clusterProfiler ([67]http://bioconductor.org/packages/release/bioc/html/clusterProfiler .html (accessed on 12 January 2022)) with a strict FDR cut-off, i.e., <0.05 [[68]31]. The clusterProfiler package includes a compareCluster function, which we used to determine and compare the enriched functional categories of each gene cluster. By using the gene ratios (i.e., the proportions of genes enriched in each category) and adjusted p-values, we constructed a dot plot that depicts the between-cluster differences in enriched functional categories. 2.5. Construction of a Human Protein–Protein Interaction (PPI) Network For the initial dataset, we downloaded protein interaction data from the iRefIndex database ([69]http://irefindex.org (accessed on 24 December 2018)). This database is a union of several primary PPI databases, including the MPPI, MPIDB, MPact, MINT, MatrixDB, IntAct, InnateDB, HPRD, DIP, CORUM, BioGRID, and BIND databases [[70]32]. We integrated PPIs from the large-scale BioPlex 2.0 interaction dataset in order to increase the confidence and the completeness of the PPI network [[71]33]; this dataset was obtained by high-throughput affinity purification mass spectrometry. We also incorporated interaction data other than direct physical bindings (e.g., protein phosphorylation) from an earlier study [[72]34] into our dataset. We filtered the final dataset to remove interactions from non-human sources; we excluded redundant and self-interacting pairs. The final result was a human protein–protein interaction (PPI) network containing a total of 22,616 nodes and 515,015 edges. 2.6. PPI Subnetworks among Proteins Encoded by PE-Associated Genes We mapped the member genes in the cyan and magenta modules as seed genes to the above-described human PPI network and then used Cytoscape ver. 3.8.2 to extract the maximal connected component as the protein-interaction subnetwork [[73]35]. We used a plugin implemented in Cytoscape, i.e., the molecular complex detection (MCODE) clustering tool [[74]36], to analyze the resultant network and identify highly interconnected PPI clusters. We applied 0.5 as the fluff density cutoff, 2 as the K-core, 0.2 as the node score cutoff, and maximum depth up to 100. The Cytoscape-plugin CytoHubba was used to evaluate hub genes in the PPI networks [[75]37]. We calculated the following to identify potential hub genes: the node degree, the betweenness centrality, the maximal clique centrality (MCC), and the maximum neighborhood component (MNC) [[76]37,[77]38]. 2.7. Enrichment Analysis of HAR and PS Genes We retrieved the HAR genes from Doan et al.’s study [[78]27], which describes their comparative genome analysis identifying conserved genomic loci with elevated divergence in humans; 2737 HARs were identified, representing 2163 HAR-associated genes. We next downloaded the lists of genomic regions showing PS signatures based on population genetic statistics obtained from the PopHumanScan database [[79]39] in order to identify recent PS signatures inferred from human polymorphism data. Although the PopHumanScan database includes the integration of eight different population genetic statistics for 22 non-admixed Phase III human populations of the 1000 Genomes Project for detecting selective sweeps at different historical ages, in the present study, we used only positively selected regions that overlapped with protein-coding sequences as PS genes. We performed an enrichment analysis using the sets of both HAR genes and PS genes using the one-sided Fisher’s exact test in the R program’s fisher.test function. 3. Results 3.1. WGCNA Results The expression matrices of 19,305 genes were obtained from the 48 samples after the data preprocessing. We grouped the genes with similar expression patterns into modules by using the R program’s WGCNA package. Using the protocol described in the Materials and Methods section, we identified 21 modules. [80]Figure 1 provides the numbers of genes in each module. We used two more network metrics to identify PE-associated modules: the GS and MS values. The GS was calculated as the −log[10] transformation of the p-value for each gene; we used Student’s t-test to measure the strength of the differential gene expressions. The MS was calculated as the average GS value for all genes within each module. Cyan and magenta modules had the two top-ranked MS values ([81]Figure 2), and each of these modules was enriched for genes that were differentially expressed between normotensive and PE patients as shown by an increased MS value. We thus focused on the cyan and magenta modules and evaluated each of the modules’ member genes. Figure 1. [82]Figure 1 [83]Open in a new tab The results of the weighted gene co-expression network analysis (WGCNA) of the transcriptome in the normotensive and PE groups. In the clustering dendrogram with genes, the module membership is represented by colors. The numbers of genes in each module are also shown. Twenty-one modules were identified. Figure 2. [84]Figure 2 [85]Open in a new tab Each module’s MS (module significance) value. Both the cyan and magenta modules were highly enriched with differentially expressed genes (DEGs). Columns: mean values. Bars: standard error of the mean (SEM). The ME expression profiles for the cyan and magenta modules in the normotensive and PE patients are depicted in [86]Figure 3. In both modules, the eigengene expression was increased in the PE patients. For a further clarification of the relationship between the modules and PE, we determined the mean ME value and used the Student’s t-test to compare the normotensive group to the PE group. As shown in [87]Figure 4, the ME value of both the cyan module and the magenta module was significantly greater in the PE group compared to the normotensive group. Together, these results indicate that the genes in the cyan and magenta modules were mostly upregulated in the PE group. Figure 3. [88]Figure 3 [89]Open in a new tab The eigengene expression patterns for the cyan (A) and magenta (B) modules in the normotensive and PE groups. Figure 4. [90]Figure 4 [91]Open in a new tab The module eigengene (ME) expressions of the cyan (A) and magenta (B) modules. Columns: mean values. Bars: SEM. 3.2. GO Analysis We used the ‘biological process’ and ‘molecular function’ categories for genes in the cyan and magenta modules for the GO analysis by using the Cytoscape plugin ClueGO, which provides a functional annotation map for module member genes in which genes are enriched corresponding to their respective GO terms and pathway. The results of this analysis demonstrated that the cyan module is associated mostly with protein stabilization, ribosomal small subunit biogenesis, the positive regulation of ATPase activity, endopeptidase complex, Ran GTPase binding, and aerobic respiration ([92]Figure 5A), and the magenta module is associated mostly with the cytosolic proteasome complex, protein localization to the chromosomes, the telomeric region, the endoplasmic reticulum-Golgi intermediate compartment membrane, and the regulation of muscle adaptation ([93]Figure 5B). [94]Table 1 provides the complete lists of the significantly enriched GO terms of the cyan and magenta modules and their associated genes. Figure 5. [95]Figure 5 [96]Open in a new tab The gene annotation enrichment analysis results for the ‘biological process and molecular function’ GO categories for the cyan (A) and magenta (B) modules. A functionally grouped network is shown. The terms are represented as nodes that are linked based on the terms’ kappa scores levels (≥0.4). Table 1. List of significant GO terms for the cyan and magenta modules identified by the WGCNA. Module Term p−Value Count Genes Cyan GO:0050821-protein stabilization 2.66 × 10^−5 7 [ATP1B3, NLK, PPIB, PTGES3, RPS7, SUMO1, TCP1] GO:0061077-chaperone-mediated protein folding 3.93 × 10^−4 4 [FKBP3, PPIB, PTGES3, TCP1] GO:0042274-ribosomal small subunit biogenesis 4.39 × 10^−4 4 [DCAF13, RPS7, RPSA, UTP3] GO:0008536-Ran GTPase binding 6.45 × 10^−4 3 [RANBP1, RANBP2, RANGRF] GO:0030684-preribosome 6.58 × 10^−4 4 [DCAF13, RPS7, RPSA, UTP3] GO:0000462-maturation of SSU-rRNA from tricistronic rRNA transcript (SSU-rRNA, 5.8S rRNA, LSU-rRNA) 7.64 × 10^−4 3 [DCAF13, RPSA, UTP3] GO:0032040-small-subunit processome 8.28 × 10^−4 3 [DCAF13, RPS7, UTP3] GO:0003755-peptidyl-prolyl cis-trans isomerase activity 1.20 × 10^−3 3 [FKBP3, PPIB, RANBP2] GO:0000413-protein peptidyl-prolyl isomerization 1.20 × 10^−3 3 [FKBP3, PPIB, RANBP2] GO:0016859-cis-trans isomerase activity 1.38 × 10^−3 3 [FKBP3, PPIB, RANBP2] GO:0022627-cytosolic small ribosomal subunit 2.11 × 10^−3 3 [MCTS1, RPS7, RPSA] GO:0030490-maturation of SSU-rRNA 2.23 × 10^−3 3 [DCAF13, RPSA, UTP3] GO:0032781-positive regulation of ATPase activity 2.49 × 10^−3 3 [ATP1B3, DNAJB4, SUMO1] GO:0018208-peptidyl-proline modification 2.90 × 10^−3 3 [FKBP3, PPIB, RANBP2] GO:0000502-proteasome complex 4.98 × 10^−3 3 [PSMA3, PSMD14, TXNL1] GO:1905369-endopeptidase complex 5.18 × 10^−3 3 [PSMA3, PSMD14, TXNL1] GO:0009060-aerobic respiration 5.61 × 10^−3 3 [FH, SDHD, UQCRC2] GO:0043462-regulation of ATPase activity 6.28 × 10^−3 3 [ATP1B3, DNAJB4, SUMO1] Magenta GO:0031597-cytosolic proteasome complex 3.33 × 10^−5 3 [IDE, PSMC4, PSMC6] GO:0070198-protein localization to chromosome, telomeric region 4.90 × 10^−4 3 [ATRX, GNL3, POT1] GO:0032508-DNA duplex unwinding 6.27 × 10^−4 4 [ATRX, DHX9, GTF2H3, POT1] GO:0032392-DNA geometric change 1.03 × 10^−3 4 [ATRX, DHX9, GTF2H3, POT1] GO:1905368-peptidase complex 1.26 × 10^−3 4 [IDE, PSMC4, PSMC6, USP22] GO:0032206-positive regulation of telomere maintenance 2.49 × 10^−3 3 [ATRX, GNL3, POT1] GO:0012507-ER to Golgi transport vesicle membrane 4.03 × 10^−3 3 [LMAN1, TMED10, TMED7] GO:0000502-proteasome complex 4.98 × 10^−3 3 [IDE, PSMC4, PSMC6] GO:1905369-endopeptidase complex 5.18 × 10^−3 3 [IDE, PSMC4, PSMC6] GO:0033116-endoplasmic reticulum-Golgi intermediate compartment membrane 5.39 × 10^−3 3 [LMAN1, TMED10, TMED7] GO:0051436-negative regulation of ubiquitin-protein ligase activity involved in mitotic cell cycle 5.83 × 10^−3 3 [ANAPC7, PSMC4, PSMC6] GO:0051439-regulation of ubiquitin-protein ligase activity involved in mitotic cell cycle 6.05 × 10^−3 3 [ANAPC7, PSMC4, PSMC6] GO:0043502-regulation of muscle adaptation 6.28 × 10^−3 3 [GLRX3, GTF2I, TWF1] [97]Open in a new tab 3.3. PPI Analysis and the Identification of Densely Connected Clusters We mapped the 319 genes in the cyan module and the 524 genes in the magenta module to the human PPI network. The genes of the cyan module’s retrieved PPI contained 158 nodes and 376 edges, and the genes of the magenta module’s retrieved PPI contained 298 nodes and 692 edges ([98]Figure 6A,B). It is known that disease-associated genes tend to interact and work together in the same biological cluster in a molecular interaction network [[99]40], and we suspected that the detection of such clusters of interactions could help identify the key pathways and genes in PE and elucidate the molecular mechanisms underlying its etiology. To determine whether the cyan and magenta modules’ genes form any highly connected molecular clusters, we used the MCODE plugin with tuned settings—instead of default settings—to identify pathway-like clusters in the cyan and magenta PPI network (see the Materials and Methods). The MCODE analysis revealed five densely connected clusters (C1 to C5) with 4–47 genes in the cyan module and nine densely connected clusters (C1 to C9) with 4–35 genes in the magenta module ([100]Supplementary Table S1). Figure 6. [101]Figure 6 [102]Open in a new tab The preeclampsia (PE) protein–protein interaction (PPI) network constructed with the cyan (A) and magenta (B) module genes in the WGCNA. The genes of the cyan and magenta modules’ retrieved PPIs contained 158 and 298 nodes and 376 and 692 edges, respectively. The size of each node is proportional to its degree. The colors of the nodes (from lighter to darker) are proportional to their kME values. To determine the specific biological relevance of the C1 to C5 clusters among the cyan module’s genes and the C1 to C9 clusters among the magenta module’s genes, we performed a GO analysis on the ‘biological process’ enrichment. As shown in [103]Figure 7, in the cyan-associated C1–C5 clusters, the respective GO biological processes of cytoplasmic translation, rRNA metabolic process, RNA splicing, clathrin coat assembly, and nucleotide-excision repair were identified. In the magenta-associated C1–C9 clusters, the following respective GO biological processes were identified: RNA splicing, transesterification reactions with bulged adenosine as the nucleophile, cytoplasmic translation, nucleotide-excision repair, proteasome-mediated ubiquitin-dependent protein catabolic protein, aerobic electron transport chain, translational initiation, protein localization to the centrosome, macroautophagy, and Rac protein signal transduction ([104]Figure 7). Overall, GO terms for RNA processing and protein processing were prominent in these clusters. [105]Supplementary Table S2 provides the complete lists of the significantly enriched GO biological processes and their associated proteins. Figure 7. [106]Figure 7 [107]Open in a new tab The results of the GO biological process enrichment analysis of the genes identified in the cyan-associated and magenta-associated PPI clusters. The most over-represented GO terms are depicted as dot plots; the gene ratio is denoted by size, and the significance is indicated by color. 3.4. Enrichment of HAR and PS Genes in the PPI Clusters Human accelerated regions (HARs) were human genome regions that had diverged over the past five to six million years after the divergence of humans and chimpanzees from their last common ancestor [[108]41]. Several HAR-associated genes identified in comparative genomic studies are linked to various human traits and to human-specific diseases, such as autism spectrum disorder [[109]27,[110]42]. Disease-related gene groups have also shown a strong propensity to contain PS genes, as inferred from human polymorphism data [[111]43]. We thus used the HAR genes and PS genes we retrieved from earlier investigations [[112]27,[113]44] to conduct a search of each PPI cluster, and we then used these HAR and PS gene sets to evaluate the enrichment of the clusters. As shown by the −log[10] (p-value) in [114]Figure 8A, the cyan-associated C3 and C5 clusters showed a strong propensity to contain PS genes. Of the nine magenta-module PPI clusters, the C6 cluster was the most significantly enriched for the HAR gene dataset ([115]Figure 8B). The PS genes identified in the cyan module were CAPZA1, DDX17, PSPC1, SRSF11, and SUMO1 in the C3 cluster, and MNAT1 and PSMD14 in the C5 cluster. The HAR genes identified in the magenta C6 cluster were EIF4E, EIF3M, and EIF5. In [116]Figure 9, these genes are represented by the node color (pale red for the PS genes and light green for the HAR genes). The full name of the enriched genes in the clusters are provided in [117]Table 2. All of the data of the cyan and magenta cluster genes overlapping with the HAR and PS genes are given in [118]Supplementary Table S3. Figure 8. [119]Figure 8 [120]Open in a new tab Densely connected clusters in the cyan (A) and magenta (B) module PPI network. With the use of the MCODE plugin, five significantly connected PPI clusters were extracted from the cyan module PPI network, and nine significantly connected PPI clusters were extracted from the magenta module PPI network. Cluster-level enrichment was assessed for two evolutionary gene sets: HAR genes and PS genes. In each cell, the shades of blue represent the −log[10] (p-value) for the significance of the overlap, obtained by Fisher’s exact test. Figure 9. [121]Figure 9 [122]Open in a new tab PPI clusters enriched with HAR genes and PS genes: (A) The cyan-associated C3 cluster is enriched with PS genes. The functional enrichment analysis (FEA) results of the member genes are shown. (B) The cyan-associated C5 cluster is enriched with PS genes. The results of the FEA of the member genes are presented. (C) The magenta-associated C6 cluster is enriched with HAR genes. The FEA results are for the member genes. The PS genes are shown in light red and the HAR genes are shown in light green in the network. We used clusterProfiler to determine the reactome pathways for each gene set. The top five reactome pathways are represented by bar plots. The gene count is denoted by the bar’s length, and significance is indicated by color. The Benjamini–Hochberg method was used to adjust the p-values. Table 2. HAR genes and PS genes detected in the cyan module C3 and C5 clusters and the magenta module C6 cluster. Scheme Gene Full Name Module Cluster Degree Betweenness HAR Gene PS Gene DDX17 DEAD-Box Helicase 17 cyan C3 13 533.6621 O SRSF11 Serine and Arginine Rich Splicing Factor 11 cyan C3 13 785.37203 O PSPC1 Paraspeckle Component 1 cyan C3 5 364.12826 O SUMO1 Small Ubiquitin-like Modifier 1 cyan C3 4 620 O CAPZA1 Capping Actin Protein of Muscle Z-Line cyan C3 2 206.9429 O Subunit α 1 PSMD14 Proteasome 26S Subunit, Non-ATPase 14 cyan C5 8 2130.39568 O O MNAT1 Menage A Trois 1 cyan C5 5 226.87372 O EIF4E Eukaryotic Translation Initiation Factor 4E magenta C6 7 354.64491 O EIF5 Eukaryotic Translation Initiation Factor 5 magenta C6 4 48.22838 O EIF3M Eukaryotic Translation Initiation Factor 3 Subunit M magenta C6 4 617.89841 O [123]Open in a new tab Degree: numbers of connection in PPI, Betweenness: the value of betweenness centrality in PPI. For the determination of the specific biological relevance of the cyan-associated C3 and C5 clusters and the magenta-associated C6 cluster, we subjected 38 genes in the C3 cluster, eight genes in the C5 cluster, and four genes in the C6 cluster to reactome pathway enrichment analyses. The results demonstrated that the cyan-associated C3 cluster is associated mainly with aspects of RNA metabolism, such as the processing of capped intron-containing pre-mRNA, an mRNA splicing major pathway, mRNA splicing, and mRNA 3′-end processing. This cluster is also associated with a gene expression pathway, i.e., RNA polymerase II transcription termination ([124]Figure 9A). Regarding the C5 cluster, the significant reactome pathways were associated mainly with aspects of cell-cycle regulation, such as cyclin E-associated events during G1/S transition, cyclin A:Cdk2-associated events at S phase entry, and the G1/S transition ([125]Figure 9B). In contrast, the magenta-associated C6 cluster was significantly enriched for genes involved in protein translation, such as ribosomal scanning and start codon recognition, GTP hydrolysis and joining of the 60S ribosomal subunit, eukaryotic translation initiation, and cap-dependent translation initiation ([126]Figure 9C). [127]Supplementary Table S4 provides the complete lists of the significantly enriched reactome pathways and the associated proteins. As noted above, we used the Cytoscape-plugin Cytohubba to perform a hub gene analysis in order to identify high-significance genes that have key roles in the PPI networks [[128]37]. As shown in [129]Table 2, DDX17, SRSF11, and PSMD14 in the cyan-associated C3 and C5 clusters and EIF4E in the magenta-associated C6 cluster showed higher node-degree values. In addition, among all the nodes in other cyan- and magenta-associated clusters, heterogeneous nuclear ribonucleoprotein A1 (HNRNPA1) in the cyan C1 cluster was identified as the most connected node overlapping with the PS gene set. Genes including RNA binding motif protein X-linked (RBMX) overlapping with the HAR gene set and protein kinase, DNA-activated, catalytic subunit (PRKDC) and Ran-binding protein 2 (RANBP2) overlapping with the PS gene set also showed higher node-degree values ([130]Supplementary Table S5). In these high node-degree genes, the hub gene parameter of betweenness centrality values were also extremely high ([131]Table 2, [132]Supplementary Table S5). Together, the above-described findings indicate that the pathways associated with the cyan and magenta clusters may be critically affected by the defects of these highly connected hub genes, which are possible key genes in the pathogenesis of PE. 4. Discussion Inferring functional relationships by analyzing gene co-expression networks has been fruitful in studies of various disease-related gene functions [[133]45,[134]46]. In the present study, we conducted a WGCNA to identify PE-associated co-expression modules from publicly available gene expression data, and the WGCNA revealed two co-expression modules, cyan and magenta, which are highly enriched for genes differentially expressed between normotensive patients and PE patients. We conducted a GO term enrichment analysis for both modules and observed that the cyan module was enriched mainly for genes involved in protein stabilization and chaperone-mediated protein folding, whereas the magenta module was enriched for genes involved in the cytosolic proteasome complex, protein localization to the chromosome, and the telomeric region. Each of these biological processes is known to be involved in intracellular protein processing. Interestingly, preeclamptic placentas were described as having accumulated a number of misfolded proteins, and it was speculated that this accumulation may contribute to the pathophysiology of PE [[135]47]. It was also reported that hypoxia, ischemia, and the production of proinflammatory cytokines associated with PE lead to protein misfolding and initiate endoplasmic reticulum (ER) stress [[136]48,[137]49,[138]50]. Furthermore, Zhou et al. recently showed that the genes with upregulated expression in syncytiotrophoblast (SCT) were involved in protein folding by using single-cell RNA sequencing [[139]51]. These observations suggest that cyan and magenta module genes are related to some extent to the functions that regulate protein processing and folding, and thus these genes may contribute to the PE risk during placenta development and maintenance. Our findings indicate that the identified cyan and magenta modules would be useful for investigating the molecular mechanisms of the development of PE. Many PE-associated genes and pathways have been reported [[140]52,[141]53], but it is unlikely that the etiology of PE would be explained by any one of these factors; it is highly probable that multiple associated genes act together. Expanding our knowledge of the inter-connectivity of PE-associated genes in functional pathways is thus crucial to fully elucidate the mechanisms underlying PE. In the present study, a PPI network with genes contained in the cyan and magenta modules was constructed, and because the identification of small functional clusters that might contain proteins that participate in similar biological processes would help clarify PE’s functional pathways, we used the MCODE clustering tool in Cytoscape to extract densely connected protein clusters in the PPI network. Five significant functional clusters were identified in the cyan module, and nine significant functional clusters were identified in the magenta module. The results of the GO enrichment analysis showed that (i) several biological functions related to protein processing are highly enriched in clusters associated with the cyan and magenta modules, but (ii) GO terms for RNA processing were linked mainly to clusters in the cyan module. These results indicate an efficient extraction of the protein clusters by the MCODE plugin and suggest that these clusters may form biological entities or pathways that are involved in PE. To evaluate signatures of evolutionary forces acting on the MCODE-detected clusters, we performed an enrichment analysis for the sets of HAR genes and PS genes. The divergence of HAR genes between humans and other primate species was proposed to reflect the genes’ possible roles in the evolution of human traits (e.g., cognitive ability) [[142]25,[143]27]. It was also suggested that evolutionary pressure on HAR genes may have preferentially elicited human-specific functions [[144]44]. In contrast, PS genes that were inferred by using human polymorphism data were described as reflecting humans’ recent adaptations to a wide variety of new environments [[145]54]. Such an evolutionary scenario might indicate roles for PS genes in achieving human adaptability to local environments. We observed that HAR genes were mainly enriched in the magenta-associated C6 cluster. The HAR genes identified in this cluster are directly related to several pathways involved in protein translation. For example, EIF4E in the C6 cluster is involved in protein synthesis and a key checkpoint in the control of the rate of mRNA translation [[146]55], and EIF4E has a critical role in the pathology of various types of cancer [[147]56,[148]57]. Consistent with these findings, it has been recently shown that the genes with upregulated expression in extravillous trophoblast (EVT) were involved in translation by using single-cell RNA sequencing [[149]51]. In contrast, we observed that PS genes were mainly enriched in the cyan-associated C3 and C5 clusters. DDX17 and SRSF11 are PS genes detected in the C3 cluster, and they showed higher node degrees in the PPI network. DDX17 plays an important role in various RNA-related functions, such as the processing of primary microRNA transcripts and pre-mRNA alternative splicing [[150]58,[151]59]. SRSF11 gene is required for the recruitment of telomerase to telomeres and telomere elongation [[152]60]. We identified PSMD14 (a component of the 19S regulatory cap in 26S proteasome) in the C5 cluster, and it also showed higher node degree. PSMD14 is known to be associated with several types of cancer [[153]61,[154]62,[155]63] and was suggested to be a hub gene in PE [[156]64]. These results are consistent with the possibility that HAR genes and PS genes function as key players in the PPI clusters obtained from the PE-associated cyan and magenta modules. While HAR and PS genes seem to have conferred fitness advantages during some stages of humans’ evolutionary history [[157]27,[158]65], it has also been speculated that these genes increase the risk of human-specific disorders via antagonistic pleiotropy [[159]66,[160]67]. PE appears to be a human-specific disease [[161]4], at least in part because the highest ratio of brain/body weight of the human fetus among the primate species requires an extremely high level of nutritional exchange from the mother [[162]68]. It is thus possible that the HAR and PS genes we identified in the cyan- and magenta-associated clusters have critical roles in the pathology of PE. Collectively, the misregulation of the HAR genes (EIF4E, EIF5, and EIF3M) and the PS genes (DDX17, SRSF11, PSPC1, SUMO1, CAPZA1, PSMD14, and MNAT1) in the enriched clusters is more likely to underlie the molecular pathways that are involved in the development of PE ([163]Table 2). New insights into the molecular mechanisms underlying the development of PE may be obtained by further analyses of the HAR and PS genes in the clusters derived from WGCNA modules, and such insights may help identify targets for PE treatment. In summary, we primarily used WGCNA analysis and an evolutionary approach to investigate the pathway and key genes associated with PE. We propose that these enriched HAR and PS genes (EIF4E, EIF5, EIF3M, DDX17, SRSF11, PSPC1, SUMO1, CAPZA1, PSMD14, and MNAT1) and the highly connected hub genes HNRNPA1, RBMX, PRKDC, and RANBP2 are key genes for both diagnostic biomarkers in preeclampsia and therapeutic intervention targets. Further experimental studies are necessary to confirm the roles of these genes and molecular pathways. Supplementary Materials The following supporting information can be downloaded at: [164]https://www.mdpi.com/article/10.3390/genes13112134/s1, Table S1: The complete results of the MCODE analysis. Table S2: The significantly enriched GO biological processes in the cyan and magenta PPI clusters and the processes’ associated proteins. Table S3: Cyan and magenta cluster genes overlapping with the HAR gene set and PS gene set. Table S4: The significantly enriched reactome pathways in the cyan and magenta PPI clusters and the pathways’ associated proteins. Table S5: The results of the Cytohubba analysis of the cyan and magenta PPI clusters. [165]Click here for additional data file.^ (168.6KB, zip) Author Contributions Conceptualization, K.K. and T.T.; methodology, K.K. and Y.M.; software, K.K. and H.A.; validation, Y.M. and T.T.; formal analysis, K.K., H.A. and Y.M.; investigation, K.K., H.A., Y.M. and T.T.; resources, K.K.; writing—original draft preparation, K.K. and T.T.; writing—review and editing, K.K., H.A., Y.M. and T.T.; visualization, K.K.; supervision, T.T.; project administration, T.T.; funding acquisition, T.T. All authors have read and agreed to the published version of the manuscript. Institutional Review Board Statement Not applicable. Informed Consent Statement Not applicable. Data Availability Statement The data reported in this study are available in the [166]supplementary information provided in the supplementary data. Conflicts of Interest The authors declare that they have no potential conflict of interest. Funding Statement This work was supported in part by a JSPS KAKENHI Grant Number JP21K07747. Footnotes Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations. References