Abstract Atherosclerosis (AS) is a lipid-induced, chronic inflammatory, autoimmune disease affecting multiple arteries. Although much effort has been put into AS research in the past decades, it is still the leading cause of death worldwide. The complex genetic network regulation underlying the pathogenesis of AS still needs further investigation to provide effective targeted therapy for AS. We performed a bioinformatic microarray data analysis at different atherosclerotic plaque stages from the Gene Expression Omnibus database with accession numbers [38]GSE43292 and [39]GSE28829. Using gene set enrichment analysis, we further confirmed the immune-related pathways that play an important role in the development of AS. We are reporting, for the first time, that the metabolism of the three branched-chain amino acids (BCAAs; leucine, isoleucine, and valine) and short-chain fatty acids (SCFA; propanoate, and butanoate) are involved in the progression of AS using microarray data of atherosclerotic plaque tissue. Immune and muscle system-related pathways were further confirmed as highly regulated pathways during the development of AS using gene expression pattern analysis. Furthermore, we also identified four modules mainly involved in histone modification, immune-related processes, macroautophagy, and B cell activation with modular differential connectivity in the dataset of [40]GSE43292, and three modules related to immune-related processes, B cell activation, and nuclear division in the dataset of [41]GSE28829 also display modular differential connectivity based on the weighted gene co-expression network analysis. Finally, we identified eight key genes related to the pathways of immune and muscle system function as potential therapeutic biomarkers to distinguish patients with early or advanced stages in AS, and two of the eight genes were validated using the gene expression dataset from gene-deficient mice. The results of the current study will improve our understanding of the molecular mechanisms in the progression of AS. The key genes and pathways identified could be potential biomarkers or new drug targets for AS management. Keywords: atherosclerosis, microarray data, bioinformatics, gene network, gene connectivity 1. Introduction Atherosclerosis (AS) is a chronic inflammatory disease of the arterial wall [[42]1]. The primary lesion of AS is characterized by lipid deposition and accompanied by the proliferation of smooth muscle cells and fibrous matrix, which gradually form into atherosclerotic plaque [[43]2]. Surface rupture of the plaque leads to cardiovascular and cerebrovascular diseases such as ischemic attack and stroke [[44]3,[45]4]. The pathogenesis of AS is mainly involved in endothelial dysfunction, abnormal smooth muscle cell (SMC) proliferation and migration, oxidized lipid deposition, vascular matrix changes, inflammatory cell infiltration, and oxidative stress [[46]5,[47]6]. These processes mainly involve immune cells, foam cells, vascular endothelial cells, and SMCs, and contribute to the formation of AS. Innate and adaptive immune responses that trigger inflammation have been identified in AS, and thus may be a target for developing a new therapy [[48]7]. Vascular smooth muscle cells (VSMC) have also been confirmed to play an essential role in the development of AS. SMC transdifferentiation into macrophage-like and fibrochondrocyte-like cells has been demonstrated in AS [[49]8,[50]9]. Additionally, inhibition of VSMC phenotypic switching may be beneficial in the advanced stages of AS [[51]10]. Although several studies have vastly improved our understanding of the pathogenesis of AS, AS is still the leading cause of death worldwide and poses heavy economic and social burdens in society [[52]11]. The pathogenesis of AS still needs to be further explored to develop targeted treatment and early gene therapy. Understanding the molecular and cellular processes that convert asymptomatic plaques into symptomatic ones may facilitate the development of preventive pharmacotherapy with unprecedented impact on cardiovascular mortality and morbidity. In recent years, a series of genome-wide associations have been performed to identify genetic loci for human cardiovascular diseases [[53]12,[54]13]. Microarray data have been widely used to measure the expression of genome-wide genes in relevant tissues and to identify genes and pathways associated with diseases such as heart disease [[55]14], type II diabetes [[56]15], and nonalcoholic fatty liver disease [[57]16]. Microarray data studies in AS with relevant tissues have demonstrated that certain genes and pathways play a critical role in the progression of carotid atherosclerotic plaque [[58]17,[59]18]. Studies confirmed a central role for lipid accumulation, inflammation, and proteases in plaque instability, and highlighted hemoglobin metabolism and bone resorption as critical enriched pathways in plaques [[60]19]. SMC-related functional categories were most significantly affected in plaques [[61]20]. However, the differential gene co-expression network modules connectivity in different atherosclerotic plaque stages of AS and the key genes for immune-related pathways especially downregulated SMC-related pathways were studied poorly in vascular disease. In addition, the molecular mechanisms involved in the formation of atherosclerotic plaque have not been fully elucidated. In this study, we performed bioinformatic analysis aiming to uncover the gene expression patterns of the development of AS in humans based on microarray data ([62]GSE43292 and [63]GSE28829). The key genes were identified by integrating PPI, GO and WGCNA analysis, and were validated based on the microarray data ([64]GSE9083 and [65]GSE168610) obtained from Gene Expression Omnibus (GEO) database. Firstly, we compared the gene expression of the non-plaque stage to the plaque stage and early stage to the advanced stage based on gene set enrichment analysis (GSEA) (Materials and methods 2.2) and gene expression pattern analysis (GEPA) (Materials and methods 2.3). Secondly, we constructed a weighted gene co-expression network to investigate differential co-expression network module connectivity in different atherosclerotic plaque stages. Then, we integrated the results of weighted gene co-expression network analysis (WGCNA), protein and protein interaction network (PPI), and go ontology (GO) analysis (Materials and methods 2.4–2.6) to reveal key genes that are related to immune and muscle system pathways during the development of AS. Finally, we performed the receiver operating characteristic (ROC) curves (Materials and methods 2.7) for key genes to classify early stage and advanced stage of AS and validated key genes using the gene expression dataset from murine samples with specific gene deficiencies. The analyses provide novel insight into the pathogenesis of atherosclerotic plaque and provide valuable information for developing new targets and drugs in AS. 2. Materials and Methods 2.1. Microarray Data Collection Gene expression data from human carotid macroscopically intact tissue (non-plaque stage) and atheroma plaque (plaque stage) collected in 32 patients undergoing carotid endarterectomy were obtained from [66]GSE43292, which contained 32 normal carotid artery samples and 32 corresponding atherosclerotic plaque samples [[67]21]. The data were analyzed with Expression Console software (version 1.1, Stephen Fodor, San Francisco, CA, USA) using the default RMA summarization method. Furthermore, we also downloaded gene expression data of in early stage (n = 13) and advanced-stage (n = 16) atherosclerotic plaque from human wastid which were obtained from [68]GSE28829 [[69]22]. To validate key genes, we also downloaded the gene expression dataset of TYROBP gene (DAP12) deficient mouse brain from [70]GSE9083, and RNAseq data on mouse hearts for PLN gene-deficient mice from [71]GSE168610 [[72]23]. 2.2. Gene Set Enrichment and Differentially Expressed Genes Analysis Gene set enrichment analysis (GSEA) was performed using GSEA software (version 4.1.0, Vamsi K Mootha, Cambridge, MA, USA) based on c5.bp.v7.0.symbols.gmt (GOBP) and c2.cp.kegg.v7.0 symbols.gmt (KEGG) reference gene sets that were downloaded from the official GSEA [[73]24,[74]25]. Briefly, GSEA is a computational method that determines whether a priori defined set of genes show statistically significant, concordant differences between two biological states. The number of permutations was set to 1000, and the permutation type was set as “gene set”. The nominal p value < 0.05 is the significance threshold of the gene set in GSEA analysis. Moreover, differentially expressed genes (DEGs) between two groups were identified using the DESeq2 package in R. DEGs were defined as those with adjusted p values < 0.05 (adjusted by the Benjamini Hochberg method) and fold change > 2. 2.3. Gene Expression Patterns Analysis Gene expression patterns analysis (GEPA) was used to calculate the major regulated pathways in the development of AS using R software (version 4.2.0, Ross Ihaka and Robert Gentleman, Vienna, Austria ). First, we selected the top 500 most significantly upregulated and downregulated genes (adjusted p value < 0.05) from the comparison of non-plaque stage with plaque stage and early with advanced stage based on the datasets of [75]GSE43292 and [76]GSE28829. Next, we obtained common significantly upregulated and downregulated genes from selected top 500 differentially expressed genes. Finally, these common upregulated and downregulated genes from non-plaque to plaque stage and early stage to advanced stage were retained for subsequent analysis. The enrichment of GO and KEGG pathways was performed using R package “clusterProfile” in R. 2.4. Constructing Weighted Co-expression Gene Networks Weighted gene co-expression networks were constructed and analyzed using the WGCNA [[77]26] (Weighted Gene Co-expression Network Analysis) package in R, which calculated topological overlap measures among genes and assigned the genes into different modules through hierarchical clustering. A dynamic tree cutoff of 0.25 was set to merge similar trees. Module eigengene was also calculated using WGCNA, which is the first principal component of gene expression values in each module. Genes with more than 25% variance were used to construct the weighted gene co-expression network. Finally, a weighted gene co-expression network was constructed using 14,256 and 16,220 genes in [78]GSE43292 and [79]GSE28829 datasets, respectively. The enrichment of genes in each module was analyzed using the R package “clusterProfile” [[80]27]. Visualization of the network was performed using Cytoscape (version 3.8.2, Paul Shannon, Washington, DC, USA) [[81]28]. 2.5. GO and KEGG Enrichment Analyses The R package “clusterProfile” was used to perform Gene Ontology (GO) and Kyoto Encyclopedia of Genes and genomes (KEGG) pathway enrichment analysis [[82]27]. The Benjamin–Hochberg approach was used to correct multiple tests. The adjusted p value ≤ 0.05 was used as a threshold of significance for the enriched GO and KEGG terms for target genes. 2.6. Protein-Protein Interaction Network Analysis The search tool for retrieving interacting genes (STRING; [83]https://www.string-db.org, accessed on 10 June 2022) is a database of known and predicted protein-protein interactions that can be used to predict and track the protein–protein interactions network. Analyzing the interaction between different proteins can provide new insight into the mechanism of AS. This study used the STRING database to construct the PPI network of common upregulated and downregulated genes in different atherosclerotic plaque stages. 2.7. Logistic Regression Models with the ROC Curve The logistic regression model was constructed using glm in R. The key genes were identified as predictive variables, and the sample type with early stage or advanced stage was considered a binary responsive variable. The 3-fold cross-validation was performed to validate the accuracy of the logistic regression models by caret package in R. The receiver operating characteristic (ROC) curves were generated to evaluate the sensitivity and specificity of the logistic regression models. The average area under the curve (AUC) was calculated to assess the models’ accuracy. 3. Results 3.1. Identification of Key Gene Sets in Different Stages of Atherosclerotic Plaque To explore the key gene sets in different stages of atherosclerotic plaque, we performed the gene set enrichment analysis (GSEA) based on GOBP and KEGG gene sets, respectively. Using a p value threshold of 0.05, a total of 339 and 1124 gene sets were significantly enriched in non-plaque and plaque stages, respectively, when comparing non-plaque and plaque stages. Similarly, a total of 292 and 1141 gene sets were significantly enriched in early and advanced stages, respectively, when comparing early and advanced stages ([84]Figure 1). Among these, 120 common gene sets were identified between gene sets enriched in non-plaque stage and early stage, and 870 common gene sets were identified between gene sets enriched in plaque stage and advanced stage ([85]Figure 1). Notably, an average of 52% significantly enriched gene sets was common from non-plaque stage to plaque stage and early stage to the advanced stage of AS ([86]Figure 1). This indicates the potential roles of these gene sets driving the progression of AS. To better comprehend which regulated gene sets play a more critical role in AS development, the top 10 GOBP and 10 KEGG gene sets enriched in different stages of atherosclerotic plaque were retained for further analysis. We found eight common enriched gene sets in non-plaque stage and early stage and 11 common enriched gene sets in plaque stage and advanced stage based on intersection of the top 10 GOBP and 10 KEGG enriched gene sets in corresponding stages. ([87]Figure S1). For GOBP gene sets, myofibril assembly and cell communication by electrical coupling gene sets were significantly enriched in the non-plaque stage and early stage when compared to the non-plaque stage with plaque stage and early stage with advanced stage. Compared with the corresponding non-plaque stage and early stage, the gene sets, including adaptive immune response, leukocyte migration, positive regulation of leukocyte cell–cell adhesion and positive regulation of cell activation gene sets were significantly enriched in the plaque stage and advanced stage ([88]Table 1). For KEGG gene sets, propanoate/butanoate metabolism, cardiomyopathy and valine, leucine and isoleucine degradation gene sets were significantly enriched in non-plaque stage and early stage when compared non-plaque stage with plaque stage and early stage with advanced stage. Compared with the corresponding non-plaque stage and early stage, cytokine–cytokine receptor interaction, intestinal immune network for IGA production, B cell receptor signaling pathway and Toll-like receptor signaling pathway gene sets were significantly enriched in the plaque and advanced stages of AS ([89]Table 1). These results indicated that apart from the immune-related gene sets, the metabolism of short-chain fatty acids and branched-chain amino acids also plays a key role in different stages of atherosclerotic plaque. Figure 1. [90]Figure 1 [91]Open in a new tab The summary of GSEA analysis based on GOBP and KEGG gene sets. (A) Venn diagram of gene sets enriched in non-plaque and early stage. (B) Venn diagram of gene sets enriched in plaque and advanced stage. The red color represents gene sets enriched in the non-plaque and plaque stages based on the dataset of [92]GSE43292. The blue color represents gene sets enriched in the early stage and advanced stage based on the dataset of [93]GSE28829. Table 1. The gene set enrichment analysis of gene expression omnibus database with accession numbers [94]GSE43292 and [95]GSE28829. GS Types p Value Myofibril assembly GOBP <0.001 ^a and <0.001 ^b Cell communication by electrical coupling GOBP <0.001 ^a and 0.005 ^b Propanoate metabolism KEGG 0.018 ^a and <0.001 ^b Butanoate metabolism KEGG 0.011 ^a and <0.001 ^b Hypertrophic cardiomyopathy hcm KEGG <0.001 ^a and <0.001 ^b Dilated cardiomyopathy KEGG <0.001 ^a and <0.001 ^b Arrhythmogenic right ventricular cardiomyopathy arvc KEGG <0.001 ^a and 0.006 ^b Valine leucine and isoleucine degradation KEGG 0.02 ^a and 0.006 ^b Adaptive immune response GOBP <0.001 ^c and <0.001 ^d Leukocyte migration GOBP <0.001 ^c and <0.001 ^d Positive regulation of leukocyte cell cell adhesion GOBP <0.001 ^c and <0.001 ^d Positive regulation of cell activation GOBP <0.001 ^c and <0.001 ^d Leukocyte cell cell adhesion GOBP <0.001 ^c and <0.001 ^d Cytokine cytokine receptor interaction KEGG <0.001 ^c and <0.001 ^d Leishmania infection KEGG <0.001 ^c and <0.001 ^d Intestinal immune network for iga production KEGG <0.001 ^c and <0.001 ^d B cell receptor signaling pathway KEGG <0.001 ^c and <0.001 ^d Hematopoietic cell lineage KEGG <0.001 ^c and <0.001 ^d Toll-like receptor signaling pathway KEGG <0.001 ^c and <0.001 ^d [96]Open in a new tab ^a represents the p value of gene set significantly enriched at the stage of non-plaque stage; ^b represents the p value of gene set significantly enriched at the stage of early stage; ^c represents the p value of gene set significantly enriched at the stage of plaque stage; ^d represents the p value of gene set significantly enriched at the stage of advanced stage. 3.2. The Mainly Regulated Pathways in Different Stages of Atherosclerotic Plaque To investigate the mainly regulated pathways in different stages of atherosclerotic plaque, we selected the top 500 most significantly upregulated and downregulated genes from non-plaque to plaque in the dataset of [97]GSE43292 and early to advanced stage in the dataset of [98]GSE28829 ([99]Table S1). Of them, 259 genes were significantly upregulated and 218 genes were significantly downregulated from non-plaque to plaque and early to advanced stage ([100]Figure S2). Then, we performed GO and KEGG analysis for these commonly regulated genes in different stages of atherosclerotic plaque. The 259 common upregulated genes significantly enriched in 586 GOBP and 31KEGG signaling pathways ([101]Table S2). In addition, the 218 common downregulated genes significantly enriched in 96 GOBP and 13 KEGG signaling pathways ([102]Table S3). Interestingly, we found that these 259 common upregulated genes were mainly involved in B cell receptor and NF-kappa B signaling pathways to regulate immune-related biological processes, such as neutrophil activation in immune response, T cell activation, leukocyte cell–cell adhesion and mononuclear cell proliferation, etc ([103]Figure 2A,B; [104]Table S2). We also found that these 218 common downregulated genes were mainly enriched in biological processes, including muscle system process, muscle contraction and regulation of ion transmembrane transport, etc ([105]Figure 2C; [106]Table S3). KEGG analysis of the 218 common downregulated genes identified some pathways mainly associated with cAMP/cGMP-PKG signaling pathway and VSMC contraction ([107]Figure 2D; [108]Table S3). The results identified these immune and muscle related pathways as major regulated pathways in different stages of atherosclerotic plaque. Figure 2. [109]Figure 2 [110]Open in a new tab The functional enrichment analysis of common upregulated and downregulated genes using GO and KEGG analysis. (A,B) The top 10 enrichment pathways of 259 common upregulated genes using GO and KEGG analysis, respectively. (C,D) The top 10 enrichment pathways of 218 common downregulated genes using GO and KEGG analysis, respectively. The sizes of the dots represent the counts of enriched genes, and the dot color represents the adjusted p-value. 3.3. Remodeling of the Molecular Interaction Structure in Atherosclerosis The number of neighboring genes associated with a gene in a network was defined as the connectivity of a gene which is an important index to identify the functional importance of a gene [[111]29]. To characterize and compare the connectivity of genes in non-plaque stage and plaque stage, we constructed a multiple stage-weighted gene co-expression network encompassing 14,256 genes with the first 75% variance in 64 individuals and identified 14 network modules containing 34 to 3815 gene members ([112]Figure 3A,B). Next, we used a metric known as modular differential connectivity (MDC) [[113]30], which is defined as MDC = [MATH: rA¯ :MATH] / [MATH: rB¯ :MATH] , where [MATH: rA¯ :MATH] and [MATH: rB ¯ :MATH] indicate the average absolute correlation coefficients of all pairwise gene members of a module in non-plaque stage and plaque stage, respectively. Using an empirical p value of 0.05 determined by a 1000 permutation test that shuffles the sample labels. Interestingly, we found that four modules were significantly differences in the co-regulation of genes in non-plaque stage and plaque stage ([114]Figure 3C). Four modules are mainly related to histone modification (p = 1.0 × 10^−10, module 3), T cell activation (p = 2.7 × 10^−24, module 5), macroautophagy (p = 7.0 × 10^−3, module 7) and B cell activation (p = 8.5 × 10^−6, module 12) ([115]Figure 3D; [116]Tables S4 and S5). In addition, we also constructed a multiple stage-weighted gene co-expression network encompassing 16,220 genes with the first 75% variance in 29 individuals and identified 15 network modules containing 33 to 2148 gene members ([117]Figure 4A,B). We also found that three modules related to immune-related processes (p = 9.8 × 10^−41, module 4), B cell activation (p = 4.9 × 10^−10, module 11) and nuclear division (p = 2.5 × 10^−15, module 15) showed significantly diverse differential connectivity in the co-regulation of genes in early and advanced stages using MDC analysis ([118]Figure 4C,D; [119]Tables S6 and S7). Notably, genes with the function of immune-related processes, including macrophage, T cell and B cell activation clustered in modules, displayed significantly modular differential connectivity from non-plaque stage to plaque stage and early stage to advanced stage, respectively ([120]Figure S3). In addition, the genes with the function of nuclear division only showed differential module connectivity in early stage and advanced stage. These results suggest that the connectivity of genes involved in these pathways exhibits large differences in the co-regulation of genes during the development of atherosclerotic plaque. Figure 3. [121]Figure 3 [122]Open in a new tab Differential modular connectivity analyses for individual genes in non-plaque and plaque stages. (A) Gene co-expression network module in different stages of atherosclerotic plaque. The rows and columns represent the same set of 14,256 genes with the first 75% variance in the dataset of [123]GSE43292. (B) The numbers of modules and genes in weighted gene co-expression network modules based on the dataset of [124]GSE43292. (C) Differential connectivity modules are highlighted. The y−axis represents the log2 transformed modular differential connectivity values. The x−axis denotes the number of genes in the corresponding modules. (D) Heatmap of gene-gene correlation in the differential connectivity of modules in the non-plaque stage (the upper right triangle of each module) compared with that in the plaque stage (the lower left triangle of each module). Figure 4. [125]Figure 4 [126]Open in a new tab Differential modular connectivity analyses for early and advanced stage individual genes. (A) Gene co-expression network module in different stages of atherosclerotic plaque. The rows and columns represent the same set of 16,220 genes with the first 75% variance in the dataset of [127]GSE28829. (B) The numbers of modules and genes in weighted gene co-expression network modules based on the dataset of [128]GSE28829. (C) Differential connectivity modules are highlighted. The y-axis represents the log2 transformed modular differential connectivity values. The x-axis denotes the number of genes in the corresponding modules. (D) Heatmap of gene-gene correlation in the differential connectivity of modules in the early stage (the upper right triangle of each module) compared with that in the advanced stage (the lower left triangle of each module). 3.4. Identification of Key Genes As Potential Therapeutic targets for Atherosclerosis To reveal key genes related to the immune and muscle system in the development of atherosclerotic plaque, we integrated the results of GO, PPI, and WGCNA analysis. First, we examined the 259 and 218 common upregulated and downregulated genes in the context of network modules. The 259 common upregulated genes related to immune-related pathways are significantly enriched in module 5 ([129]GSE43292) and module 4 ([130]GSE28829) with modular differential connectivity in different stages of atherosclerotic plaque ([131]Figure S4). The 218 common downregulated genes related to muscle system related pathways are significantly enriched in module 1 ([132]GSE43292) and module 5 and 6 ([133]GSE28829) ([134]Figure S5). The enriched genes in module 5 and 6 ([135]GSE28829) ed modular differential connectivity ([136]Figure S5). These results further supported that these modules regulate the immune or muscle system related pathways in the development of atherosclerotic plaque. Next, we also performed the PPI analysis and calculated the number of immune-related pathways involved in a single gene based on the results of GO analysis for the 259 genes. The same analysis was performed in 218 genes to calculate the number of muscle system related pathways. Finally, we integrated the results of these three analyses and defined the genes with top 20 degree values in PPI network analysis, top 20 number of involved in the most immune or muscle-related pathways and exhibit an absolute correlation coefficient >0.85 within the eigengene of the enriched module as the key genes in the development of atherosclerotic plaque. Five genes (PTPRC, FCGR2B, FCER1G, ITGB2 and TYROBP) were identified as key genes mainly involved in immune-related pathways, and LMOD1, CFL2 and PLN genes were identified as key genes for muscle system related pathways in AS ([137]Figure 5). These eight key genes were annotated using MGI and GWAS catalog databases ([138]Table 2). We found that most of these key genes are involved in cardiovascular system phenotype from MGI database and lipid-related phenotype from GWAS catalog. Interestingly, we found that LMOD1 is a target gene for coronary artery disease from GWAS catalog. Furthermore, we investigated the expression of key genes in different tissues following GTEx expression database ([139]Figure S6). We found that these key genes were highly expressed in arteries, with LMOD1 and PLN gene expression particularly higher when compared to other tissues ([140]Figure S6). These results indicated that these key genes might play essential roles in developing atherosclerotic plaque. Finally, we used the 3-fold cross-validation method to evaluate the reliability of the model. The results showed that the AUC of the three verification sets in the logistic model constructed by the 3-fold cross-validation method was 0.883, 0.925 and 0.808 with an average AUC of 0.872 which much higher than the average of AUC of 0.674 with randomly selected eight genes ([141]Figure 6). Taken together, these results suggest that these eight key genes could be used as therapeutic target genes in different stages of atherosclerotic plaque. Figure 5. [142]Figure 5 [143]Open in a new tab The eight essential genes were identified by integrating the results of PPI, GO, and Gene co-expression module analysis. (A) The top 20 genes with the highest degree value of protein-protein network interactions for 259 highly upregulated genes. (B) GO analysis demonstrates that the top 20 genes primarily involve immune-related pathways. (C) Five genes were identified as key genes for immune responses based on PPI, GO, and Gene co-expression module analysis results. (D) The top 20 genes with the highest degree value of protein-–protein network interactions for 218 highly downregulated genes. (E) The top 20 genes are involved in muscle-related pathways, as shown by GO analysis. (F) The results of three genes were identified as key for muscle system-related pathways based on PPI, GO, and Gene co-expression module analysis. Hub genes in modules represent genes exhibiting an absolute correlation coefficient > 0.85 within the eigengene of the modules related to immune or muscle-related pathways based on WGCNA analysis of [144]GSE43292 and [145]GSE28829 datasets. The dot represents different genes. The red dot represents the top 20 genes in the corresponding analysis. Table 2. The detailed annotation of key genes based on MGI and GWAS Catalog databases. Genes Full Name Related Pathway (Gene Cards) MGI Phenotype Phenotypes from GWAS Catalog PTPRC Receptor-type tyrosine-protein phosphatase C B Cell Receptor Signaling Pathway cardiovascular system phenotype low density lipoprotein cholesterol measurement FCGR2B Low affinity immunoglobulin gamma Fc region receptor II-b B Cell Receptor Signaling Pathway cardiovascular system phenotype lipid measurement FCER1G High affinity immunoglobulin epsilon receptor subunit gamma Innate Immune System cardiovascular system phenotype lipoprotein measurement ITGB2 Integrin Subunit Beta 2 ERK Signaling cardiovascular system phenotype high density lipoprotein cholesterol measurement TYROBP TYRO protein tyrosine kinase-binding protein Innate Immune System immune system phenotype NA LMOD1 Leiomodin-1 Smooth Muscle Contraction NA coronary artery disease CFL2 Cofilin-2 NA muscle phenotype low density lipoprotein cholesterol measurement PLN Phospholamban Activation of cAMP-Dependent PKA cardiovascular system phenotype resting heart rate [146]Open in a new tab Figure 6. [147]Figure 6 [148]Open in a new tab Receiver operating characteristic curves of eight essential genes and randomly selected eight genes. (A) The ROC curve of eight essential gene combinations is based on the logistic regression model. (B) The ROC curve of eight randomly selected gene combinations is based on the logistic regression model. The x-axis represented the 1-specificity of the negative–positive rate (false positive rate FPR), and the y-axis represented the true positive rate (TRR) sensitivity. Different colors represent the results of 3-fold cross-validation. 3.5. Validation of Potential Therapeutic Target Genes Using Gene Deficiency Mouse Expression Pro-Files To validate the potential therapeutic target genes, we obtained the gene expression of TYROBP (DAP12) deficient mouse brain from the dataset of [149]GSE9083, and RNAseq on mouse hearts from PLN deficient mice from the dataset of [150]GSE168610. First, we performed gene set enrichment and differential gene expression analysis in these two datasets ([151]Figure S7). For TYROBP gene dataset, most of the significantly enriched immune-related pathways in plaque and advanced stages are also significantly enriched in TYROBP-deficient mice (TYROBP^−/−) in the comparison of TYROBP^−/− samples with TYROBP^+/− samples ([152]Figure 7A). We found 873 significantly differentially expressed genes in comparing TYROBP^−/− samples with TYROBP^+/− samples ([153]Figure 7B). Furthermore, among the 259 common upregulated genes, 20 genes with the functions related to immune pathways were also found in 873 significantly differentially expressed genes ([154]Figure 7C,D). For PLN gene dataset, we also performed gene set enrichment and differential gene expression analysis between PLN^−/− samples and PLN^+/− samples ([155]Figure S8). We found that the gene sets of butanoate/propanoate metabolism and valine/leucine/isoleucine degradation significantly enriched in non-plaque stage and early stage when compared non-plaque stage with the plaque stage early stage with advanced stage are also significantly enriched in PLN^+/− samples. Among the 218 gene-enriched pathways, cardiac muscle cell action potential and contraction pathways were also enriched in PLN^+/− samples ([156]Figure 8A). In addition, we found that 3542 genes exhibited a significantly differential expression between PLN^−/− samples and PLN^+/− samples ([157]Figure 8B). Among the 218 common downregulated genes, 25% (55/218) of them with the function related to muscle system pathways also display significantly differential expression in the comparison of PLN^−/− samples and PLN^+/− samples ([158]Figure 8C,D). These results further confirmed that TYROBP and PLN genes could regulate immune and muscle-related pathways to affect the development of atherosclerotic plaque. Figure 7. [159]Figure 7 [160]Open in a new tab Validation of TYROBP key gene using gene expression profile of TYROBP^+/− and TYROBP^−/− mouse brain tissue. (A) GSEA analysis showed gene sets in TYROBP^−/− samples were significantly enriched. (B) Volcano diagram of the differentially expressed genes comparing samples from TYROBP^+/− and TYROBP^−/− mouse brain tissue. (C) Venn diagrams showing common genes between DEGs and commonly upregulated genes in [161]GSE43292 and [162]GSE28829 datasets. (D) The bubble diagram of GO-BP enrichment analyses of 20 common genes. Dot sizes represent counts of enriched common genes, and dot colors represent adjusted p-value. Figure 8. [163]Figure 8 [164]Open in a new tab Validation of PLN key gene using gene expression profile of PLN^+/− and PLN^−/− mouse heart tissue. (A) Gene sets were significantly enriched in PLN^+/− samples based on GSEA analysis. (B) Representation of the differentially expressed genes comparing PLN^+/− and PLN^−/− mouse heart tissue. (C) Venn diagrams showing the common genes between DEGs and commonly downregulated genes in [165]GSE43292 and [166]GSE28829 datasets. (D) The bubble diagram of GO-BP enrichment analyses of 55 common genes. Dot sizes represent counts of enriched common genes, and dot colors represent adjusted p-value. 4. Discussion This study investigated the gene expression profiles of different atherosclerotic plaque stages with an accession number of [167]GSE43292 and [168]GSE28829. The key genes and pathways were identified based on systematic bioinformatic analysis. Two of eight key genes were validated using the gene expression dataset of gene-deficient murine samples, which are valuable for understanding the molecular mechanism in the development of atherosclerotic plaque. Previous studies have used the microarray datasets of [169]GSE43292 and [170]GSE28829 to screen DEGs, key genes, and pathways in a single atherosclerotic plaque stage [[171]17,[172]18,[173]31,[174]32,[175]33]. In comparison, the current study comprehensively investigated the gene expression profiles of atherosclerotic plaque in the stages of non-plaque to plaque as well as early to advanced. We report for the first time the differential modular connectivity in different stages of atherosclerotic plaque based on differential modular connectivity (MDC) analysis. We also performed gene set enrichment analysis (GSEA) in two databases, which can provide complementary information for differential gene expression analysis. Compared to the previous studies that identified key genes using PPI analysis [[176]34,[177]35], the current study established key genes by integrating the results of protein and protein interaction network (PPI), go ontology (GO), and modular differential connectivity (MDC) analysis in two datasets. The common regulated pathways and key genes were identified using the bioinformatic analysis of two datasets, which can effectively reduce the impact of confounding factors on the results of our study. We also validated two of eight key genes using the gene expression dataset of deficiency genes in mice. Using gene set enrichment analysis with GOBP gene sets, we found that adaptive immune response, leukocyte migration, positive regulation of leukocyte cell-cell adhesion and positive regulation of cell activation gene sets were significantly enriched in the stages of plaque and advanced when compared to the corresponding stages of non-plaque and early in AS. These results indicated that the adaptive immune response and immune cell activation were turned on during the development of atherosclerotic plaque. Previous studies have shown that immune cells, such as B and T cells, contribute to AS [[178]36,[179]37] by playing an important role in the immune response and inflammation [[180]17,[181]38]. For KEGG analysis, we found that cytokine-cytokine receptor interaction, intestinal immune network for IGA production, B cell receptor signaling pathway, and Toll-like receptor signaling pathway gene sets were significantly enriched in the early stage of plaque and advanced lesions when compared to the corresponding stage of non-plaque and early plaque lesions, respectively. These results further supported that immune cells and immune-related pathways were activated in the development of atherosclerotic plaque. Interestingly, we also found that propanoate/butanoate metabolism and valine, leucine and isoleucine degradation gene sets were inhibited in the development of atherosclerotic plaque. It was previously reported that short-chain fatty acid (propanoate and butanoate) and branched-chain amino acid (valine, leucine, and isoleucine) are related to energy metabolism. For example, propionate and butyrate, the major metabolites of dietary fiber, are the main products of bacterial metabolism and constitute an essential source of energy [[182]39]. The function of three branched-chain amino acids (BCAAs; leucine, isoleucine, and valine) can work together to modulate the insulin signal and glucose use by skeletal muscle [[183]40]. Recent studies have suggested that the dysfunction of BCAA catabolism is associated with the risk of cardiovascular disease [[184]41,[185]42]. Therefore, the regulation of metabolism of short-chain fatty acids and branched-chain amino acids in the host may be a novel intervention strategy to hinder atherosclerotic plaque development. Using gene pattern expression analysis, we investigated the highly regulated pathways in different stages of atherosclerotic plaque based on two datasets. The 259 commonly upregulated genes are mainly enriched in immune responses and immune cell activation during the development of atherosclerotic plaque. Immune-related signaling pathways, including B cell receptor and NF-kappa B, are classic signaling pathways widely reported in AS [[186]37,[187]43]. We also found 218 commonly downregulated genes that are typically involved in muscle system processes, muscle contraction, and regulation of ion transmembrane transport. Furthermore, cAMP/cGMP-PKG signaling pathway and VSMC contraction signaling pathways were enriched in the 218 commonly downregulated genes. VSMCs are the major cell types present in all stages of atherosclerotic plaque and play an essential role in the intervention of AS [[188]44]. VSMC phenotypic switching affects plaque stability, and inhibiting phenotypic switching may benefit advanced AS [[189]45], as seen by a reduction in atherosclerotic burden and improved fibrous cap stability when blocking [[190]46]. Moreover, VSMC contraction mediated by ion transmembrane transport is rarely mentioned based on microarray data of different atherosclerotic stages. The intervention of ion transmembrane transport may be an effective method for treating and preventing hypertension and atherosclerosis. Previous studies also support that ion transmembrane transport [[191]47]. Using modular differential connectivity analysis, we found that four modules displayed significant differential connectivity in non-plaque and plaque stages, and three modules displayed significant differential connectivity in early stage and advanced stage. The genes in modules 5 and 12 of multiple stages weighted gene co-expression network based on [192]GSE43292 dataset and modules 4 and 11 of multiple stages weighted gene co-expression network based on [193]GSE28829 dataset are mainly enriched in immune-related pathways including macrophage, T cell, and B cell activation, and these modules displayed modular differential connectivity. The genes associated with immune-related pathways identified by the GSEA and GEPA analyses also showed modular differential connectivity in corresponding modules by MDC analysis. These results further emphasized the important role of immune-related pathways in different stages of atherosclerotic plaque. This is consistent with the finding that innate as well as adaptive immune responses have been identified in AS, with components of T cell activation and antibody production during disease [[194]7]. Our results demonstrated that modifying the co-regulation of immune-related genes in corresponding modules may interfere with atherosclerotic plaque development. We also found that the genes in module 7 of multiple stages weighted gene co-expression network based on [195]GSE43292 dataset related to macroautophagy showed differential connectivity in the development of atherosclerotic plaque. The co-regulation of macroautophagy was stronger in non-plaque stage when compared with plaque stage. A previous study reported that the defective autophagy is one of the causes for atherosclerotic plaque [[196]48]. The genes in module 15 of multiple stages weighted gene co-expression network based on [197]GSE28829 dataset with the function of nuclear division showed higher co-regulation connectivity in advanced stage when compared to early stage, which indicates that nuclear division could be stronger in the advanced stage. Since uncontrolled nuclear division is a common feature of several human tumor cell lines [[198]49], we speculated that the stronger co-regulation of genes associated with nuclear division in the advanced stage is one of the causes of the aneurysm. The regulation of co-expression of macroautophagy and nuclear division modules may be a new strategy for treating AS. Interestingly, we also found the biological pathway of B cell activation was enriched in modules 5 and 12 with differential connectivity based on multiple stages weighted gene co-expression network in [199]GSE43292 dataset ([200]Figure S9). Similarly, the pathway of B cell activation enriched in modules 4 and 11 with differential connectivity based on multiple stages weighted gene co-expression network in [201]GSE28829 dataset ([202]Figure S10). These results suggest that B cells have different functions in the development of atherosclerotic plaque. Our results further support that the B1 and B2 cells display unique functions in the development of AS [[203]50]. The pathways of macroautophagy and nuclear division were not previously revealed by microarray data of different atherosclerotic stages, suggesting that MDC analysis provides new insights in the development of atherosclerotic plaque. Immune and muscle system pathways play an important role in the development of atherosclerotic plaque. To uncover key genes related to immune and muscle system pathways in the development of atherosclerotic plaque, we integrated the results of GO, PPI and WGCNA analysis. PTPRC, FCGR2B, FCER1G, ITGB2, and TYROBP were identified as key genes regulating immune related pathways. PTPRC gene is an essential regulator of T and B cell antigen receptor-mediated activation [[204]51], and its dysfunction can result in immunodeficiency, autoimmunity, or malignancy [[205]52]. FCGR2B gene is mainly involved in a variety of effector and regulatory functions, such as phagocytosis of immune complexes and modulation of antibody production by B-cells [[206]53]. The function of FCER1G gene is essential in chronic inflammation by correlating immune reactions [[207]54]. ITGB2 gene can mediate leukocyte migration through adhesive interactions between leukocytes and inflamed endothelial cells, which are critical for defense against bacteria and wound healing [[208]55]. TYROBP gene is an adaptor in TREM2 signaling, and its activation can modulate cell proliferation, survival, and inflammation pathways [[209]56]. Previous studies have reported TREM2 as a key player in Alzheimer’s disease [[210]57] and TYROBP as key gene for AS [[211]17,[212]58]. Therefore, we speculate that AS and Alzheimer’s disease may have similar TREM2 signaling regulatory mechanisms. Likewise, LMOD1, CFL2, and PLN were identified as key genes in functions related to muscle processing systems. LMOD1 gene has been identified as a causal gene for coronary artery disease by maintaining the differentiated SMC phenotype [[213]59]. Mutations in CFL2 gene mutations have been associated with congenital myopathies, including nemaline and myofibrillar myopathy [[214]60]. PLN gene is an important regulator for sarcoendoplasmic reticulum (SR) calcium transport ATPase (SERCA), which uptakes Ca^2+ to SR during the diastolic phase of cardiomyocytes to maintain intracellular calcium homeostasis [[215]23]. Notably, TYROBP and PLN gene deficiency was validated in this study by analyzing gene expression in experimental mice. These genes will provide valuable information to understand the mechanisms underlying the progression of AS. The expression of key genes in multiple tissues based on GTEx expression database suggests that these key genes are highly expressed in arteries, with the expression of LMOD1 and PLN genes being especially higher in arteries than in other tissues ([216]Figure S6). The results further indicate that LMOD1 and PLN genes may regulate the functional changes of VSMCs, thereby participating in the development of atherosclerotic plaque. Furthermore, we identified that the AUC of logistic regression model based on eight key gene combinations was 0.883, 0.925, and 0.808 with an average AUC of 0.872, much higher than the average AUC of 0.674 with randomly selected eight gene combinations. These results indicate that our logistic regression model for these key genes can reliably predict the stages of patients with AS. The present study has several limitations. Firstly, compared to RNAseq, the microarray data might lead to some bias, which might affect the interpretation of the results. Secondly, there may be some confounding factors, including age and sex, which might affect gene expression, and could not be considered in this study. Thirdly, the key genes identified in the current study were not validated in animals in vivo or humans. Future studies using genetically modified animal and atherosclerotic animal models are warranted. 5. Conclusion In summary, this study provides a comprehensive view of the molecular mechanisms at different stages of atherosclerotic plaque. It identifies several molecular mechanisms that potentially link the progression of atherosclerotic plaque. The differential modular connectivity in different stages of atherosclerotic plaque was first reported in this study. In addition, the eight key genes related to immune or muscle system pathways were considered to play a critical role in the development of atherosclerotic plaque. The eight key gene combinations can reliably predict the stages of patients with AS. Among these eight key genes, TYROBP and PLN genes were validated using the gene expression dataset of deficiency genes in mice. These results may help us better understand the functional mechanisms of AS in different atherosclerotic plaque stages. The essential genes and pathways found in this study might be potential biomarkers or drug targets for diagnosing or treating AS. Supplementary Materials The following supporting information can be downloaded at: [217]https://www.mdpi.com/article/10.3390/cells11243976/s1, Figure S1: Venn diagrams showing the common gene sets for the top 10 GOBP and 10 KEGG enriched gene sets in different atherosclerotic plaque stages based on the dataset of [218]GSE43292 and [219]GSE28829; Figure S2: The common genes for the top 500 most significantly differential expression genes in different atherosclerotic plaque stages based on the dataset of [220]GSE43292 and [221]GSE28829; Figure S3: The connectivity of genes related to macrophage cell activation in modules; Figure S4: The 259 common upregulated related to immune responses pathways in the co-expression modules; Figure S5: The 218 common downregulated related to muscle system pathways in the co-expression modules; Figure S6: The expression profiles of eight key genes in human tissues (from GTEx Portal: [222]https://gtexportal.org/home/); Figure S7: The summary of GSEA analysis based on the dataset of [223]GSE9083; Figure S8: The summary of GSEA analysis based on the dataset of [224]GSE168610; Figure S9: The connectivity of genes related to B cell activation in module 5 and 12 based of multiple stages weighted gene co-expression network based on the dataset of [225]GSE43292; Figure S10: The connectivity of genes related to B cell activation in module 4 and 11 based of multiple stages weighted gene co-expression network based on the dataset of [226]GSE28829; Table S1: The differential genes with FDR < 0.01 in different stages of atherosclerotic plaques based on the dataset of [227]GSE43282 and [228]GSE28829; Table S2: The GO and KEGG analysis of the 259 common upregulated genes; Table S3: The GO and KEGG analysis of the 218 common downregulated genes; Table S4: The results of modular differential connectivity analysis based on dataset of non-plaque and plaque stage; Table S5: The enrichment pathways of module 3, 5, 7 and 12 using GO analysis; Table S6: The results of modular differential connectivity analysis based on dataset of early and advanced stage; Table S7: The enrichment pathways of module 4, 11 and 15 using GO analysis. [229]Click here for additional data file.^ (3.8MB, zip) Author Contributions Conceptualization, Y.Z. and J.Y.; Data curation, L.H., L.Q., X.Y. and Y.Z.; Funding acquisition, Y.Z.; Investigation, L.H., A.P.-J., Y.Y., M.Q. and Y.Z.; Methodology, L.H.; Validation, Y.Z.; Writing―original draft, Y.Z.; Writing―review & editing, A.P.-J., Y.Z. and J.Y.. All authors will be informed about each step of manuscript processing, including submission, revision, revision reminder, etc., via emails from our system or assigned Assistant Editor. All authors have read and agreed to the published version of the manuscript. Institutional Review Board Statement The current study was approved by the Institutional Animal Care and Use Committee of Jiangxi University of Chinese medicine, China. Informed Consent Statement Not applicable. Data Availability Statement The datasets created and/or analyzed during the current study will be available from the corresponding author on reasonable request. There are no security, licensing, or ethical issues related to these data. Conflicts of Interest The authors have no conflict of interest to disclose. Funding Statement This work was supported by grants from the Science and Technology Planning Project of Jiangxi Traditional Chinese Medicine Science (No. 2021B668), Start-up Fund for Scientific Research, Jiangxi University of Traditional Chinese Medicine (No. 2020BSZR014), the Jiangxi Key Laboratory grant in Science and Technology Department of Jiangxi Province (No. 20202BCD42014), Foundation of Jiangxi Educational Committee (No. GJJ211235) and a grant from American Heart Association (No. 18EIA33900065). Footnotes Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations. References