Abstract
The rupture of atherosclerotic plaques is essential for cardiovascular
and cerebrovascular events. Identification of the key genes related to
plaque rupture is an important approach to predict the status of plaque
and to prevent the clinical events. In the present study, we downloaded
two expression profiles related to the rupture of atherosclerotic
plaques ([38]GSE41571 and [39]GSE120521) from GEO database. 11 samples
in [40]GSE41571 were used to identify the differentially expressed
genes (DEGs) and to construct the weighted gene correlation network
analysis (WGCNA) by R software. The gene oncology (GO) and Kyoto
Encyclopedia of Genes and Genomes (KEGG) enrichment tool in DAVID
website, and the Protein-protein interactions in STRING website were
used to predict the functions and mechanisms of genes. Furthermore, we
mapped the hub genes extracted from WGCNA to DEGs, and constructed a
sub-network using Cytoscape 3.7.2. The key genes were identified by the
molecular complex detection (MCODE) in Cytoscape. Further validation
was conducted using dataset [41]GSE120521 and human carotid
endarterectomy (CEA) plaques. Results: In our study, 868 DEGs were
identified in [42]GSE41571. Six modules with 236 hub genes were
identified through WGCNA analysis. Among these six modules, blue and
brown modules were of the highest correlations with ruptured plaques
(with a correlation of 0.82 and −0.9 respectively). 72 hub genes were
identified from blue and brown modules. These 72 genes were the most
likely ones being related to cell adhesion, extracellular matrix
organization, cell growth, cell migration, leukocyte migration,
PI[3]K-Akt signaling, focal adhesion, and ECM-receptor interaction.
Among the 72 hub genes, 45 were mapped to the DEGs (logFC > 1.0,
p-value < 0.05). The sub-network of these 45 hub genes and MCODE
analysis indicated 3 clusters (13 genes) as key genes. They were LOXL1,
FBLN5, FMOD, ELN, EFEMP1 in cluster 1, RILP, HLA-DRA, HLA-DMB, HLA-DMA
in cluster 2, and SFRP4, FZD6, DKK3 in cluster 3. Further expression
detection indicated EFEMP1, BGN, ELN, FMOD, DKK3, FBLN5, FZD6, HLA-DRA,
HLA-DMB, HLA-DMA, and RILP might have potential diagnostic value.
Subject terms: Computational biology and bioinformatics, Genetics,
Molecular biology, Biomarkers, Diseases
Introduction
Atherosclerosis is a chronic inflammatory disease responsible for many
clinical manifestations, such as coronary artery disease and ischemic
stroke^[43]1. The deposition of lipid under a blood vessel wall leads
to the formation of atherosclerotic plaques. The rupture of the plaque
blocks the blood vessel and is the main reason for cardiovascular or
cerebral vascular events^[44]2. Identification of the key genes related
to plaque rupture is an important approach to predict the status of
plaque and to prevent the clinical events.
With the development of new technologies, microarray analysis has been
a vital method for the research of genetic alteration in various
diseases^[45]3. To deal with the big data from microarray profile,
bioinformatics analysis has become a useful tool. For instance, linear
models for microarray data (LIMMA) was used to identify the
differential expressed genes (DEGs) from big data, Gene Oncology (GO)
and Kyoto Encyclopedia of Genes and Genomes (KEGG) were used to predict
the functions and mechanisms of genes^[46]4. Weighted gene correlation
network analysis (WGCNA) is an advanced data mining method. It is used
to describe the correlation patterns among genes, to find the clusters
(modules) of highly correlated genes, to identify the hub genes in each
module, and to associate the modules to external sample traits^[47]5.
WGCNA provides another powerful method to group the genes into modules
and coordinate the modules to biological functions and regulatory
mechanisms^[48]6. WGCNA has been widely used in cancers, cardiovascular
diseases and many other diseases^[49]7,[50]8.
Recently, some microarray studies have been conducted to compare the
gene expression profiles between carotid atheroma and the
microscopically intact tissue ([51]GSE43292); human normal arterial
intimae and advanced atherosclerotic plaques ([52]GSE97210); and
ruptured and stable atherosclerotic human plaques ([53]GSE41571,
[54]GSE120521)^[55]9,[56]10. WGCNA was used to identify the key lncRNAs
and mRNAs associated with atherosclerosis or coronary artery disease
progression^[57]8,[58]11. The bioinformatics method LIMMA was used to
identify key genes in [59]GSE41571. COL3A1, COL1A2, ASPN, PPBP were
thought to play critical roles in atherosclerotic plaque
rupture^[60]10. However, they only identified individual DEGs in
microarray profiles. No WGCNA analysis was conducted on stable and
ruptured atherosclerotic plaques.
In the present study, we downloaded the datasets [61]GSE41571 and
[62]GSE120521 from GEO database
([63]https://www.ncbi.nlm.nih.gov/geo/), identified key genes by using
bioinformatics tools, such as LIMMA, WGCNA, GO, KEGG, and Cytoscape.
The key genes were further validated in human carotid endarterectomy
(CEA) plaques.
Results
Datasets and workflow
The datasets [64]GSE41571 and [65]GSE120521 were downloaded from the
GEO database ([66]http://www.ncbi.nlm.nih.gov/geo/). [67]GSE41571
contains 5 ruptured and 6 stable atherosclerotic plaques, while
[68]GSE120521 contains 4 ruptured and 4 stable plaques.
The workflow of the whole study was shown in Fig. [69]1. DEGs in
[70]GSE41571 were firstly screened. The WGCNA was conducted on the
ruptured and stable plaques in [71]GSE41571. The functional modules and
the hub genes were identified. Then the hub genes were mapped to the
DEGs and formed a sub-network by Cytoscape. Key genes were identified
based on this network. The key genes were validated using [72]GSE120521
and human CEA plaques.
Figure 1.
[73]Figure 1
[74]Open in a new tab
Workflow of the whole study.
Identification of the DEGs
The matrix file from [75]GSE41571 was pre-processed and annotated with
the official gene symbol. The DEGs were identified using “LIMMA”
package in the R software^[76]12. The genes with fold change over 2.0
(logFC > 1.0) and p-value < 0.05 were considered DEGs. Eventually, 868
genes were identified (Fig. [77]2A). Compared to the stable plaques,
342 genes were upregulated and 526 genes were downregulated in the
ruptured plaques. The heatmap of the top 40 upregulated and 40
downregulated DEGs were shown in Fig. [78]2B, and Table [79]S1.
Figure 2.
[80]Figure 2
[81]Open in a new tab
Identification of DEGs in [82]GSE41571. A: volcano plot of
[83]GSE41571; B: Heatmap of the top 40 upregulated and top 40
downregulated DEGs.
Construction of gene co-expression networks by WGCNA
The gene expression profile [84]GSE41571 (11 samples, 8413 genes) was
downloaded. The top 50% variant genes (4207 genes) in [85]GSE41571 were
screened out by LIMMA and were used for the construction of gene
co-expression network by WGNCA in the R software.
The plaque traits of each sample were shown in Fig. [86]3A. To ensure a
scale-free network for the construction of the co-expression network, a
power of β = 9 was selected as a soft-thresholding parameter
(Fig. [87]3B). The hierarchical clustering based on the topological
overlap matrix (TOM) dissimilarity measure revealed six modules,
namely, the blue, brown, green, red, turquoise, and yellow modules
(Fig. [88]3C). Each module contained a group of coordinately expressed
genes with high TOM value and was potentially related to the same
clinical traits. Genes with the absolute values of Pearson’s
correlation over 0.9 were considered hub genes. The number of genes
involved in each module was 319 for the blue module (48 hub genes), 124
for the brown module (24 hub genes), 51 for the green module (15 hub
genes), 22 for the red module (14 hub genes), 391 for the turquoise
module (103 hub genes) and 88 for the yellow module (32 hub genes)
(Table [89]S2). Among these modules, blue and brown modules were found
to have the highest association with atherosclerotic plaque rupture
(correlations of 0.82, p-value of 0.002 for the blue module, and
correlations of −0.9, p-value of 0.0001 for the brown module)
(Fig. [90]3 D). Therefore, the hub genes in blue and brown modules were
used for further analysis.
Figure 3.
[91]Figure 3
[92]Open in a new tab
WGCNA identification of plaque rupture related hub genes. (A)
Clustering dendrogram of 11 samples; (B) Determination of
soft-threshoding power in the WGCNA; (C) Dendrogram of all
differentially expressed genes clustered based on a dissimilarity
measure (1-TOM); (D) Association between each individual module and the
rupture of atherosclerotic plaque;.
Gene Oncology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG)
enrichment of hub genes in blue and brown modules
To analyze the functions and mechaisms of the identified hub genes in
blue and brown modules, GO and KEGG enrichment in DAVID (website:
[93]https://david.ncifcrf.gov/) were used. The q-value < 0.05 was
considered to be statistically significant for the correlations. The
top 10 GO enrichment terms were visualized in the bubble charts by the
Omicshare online tool ([94]http://www.omicshare.com/tools).
As shown in Fig. [95]4, these hub genes were mostly enriched in
Biological Process (BP) of cell adhesion and extracellular matrix
organization (Fig. [96]4A); Cellular Components (CC) of extracellular
exosome, plasma membrane (Fig. [97]4B); and Molecular Functions (MF) of
protein binding (Fig. [98]4D). KEGG enrichment indicated the genes most
likely affect the signal pathways of PI3K-Akt signaling and focal
adhesion (Fig. [99]4C).
Figure 4.
[100]Figure 4
[101]Open in a new tab
GO and KEGG enrichment of genes in blue and brown modules. (A)
Biological Process of GO enrichment; (B) Cellular components of GO
enrichment; (C) KEGG pathway enrichment; (D) Molecular function of GO
enrichment; RichFactor = “count”/ “pop hits”. The “counts” is the
number of hub genes enriched in a certain term. The “pop hits” is the
number of all genes enriched in a certain term.
Protein-protein interaction (PPI) network of the hub genes in the blue and
brown modules
The PPI was analyzed by the STRING online tool (website:
[102]https://string-db.org/). 72 hub genes in the blue and brown
modules were analyzed. As a result, the 72 hub genes formed a network
with 82 nodes and 97 edges (Fig. [103]5). BGN, CTSA, FBLN5, and HLA-DRA
were at the core of the network. These genes are mainly involved in the
process of collagen formation and extracellular matrix degradation, and
the functions of the immune system. All of the above pathological
processes play critical roles in atherosclerosis.
Figure 5.
[104]Figure 5
[105]Open in a new tab
PPI network for the 72 hub genes in blue and brown modules. The circles
represent the proteins encoded by the corresponding genes; lines
represent the interactions between the proteins.
Identification of key genes
To figure out the key genes related to the ruptured plaques, we mapped
the 72 hub genes to the DEGs. 45 hub genes with logFC > 1.0 and p-value
< 0.05 were screened out. Among them, 12 were upregulated and 33 were
downregulated (Table [106]S3). These 45 hub genes were subsequently
used to construct the sub-network by Cytoscape 3.7.2 software. The
plugin molecular complex detection (MCODE) tool was used to calculate
the most significant clusters in the sub-network. As a result shown in
Fig. [107]6 and Table [108]1, 3 clusters with 13 key genes were figured
out. They were LOXL1, FBLN5, FMOD, ELN, EFEMP1 in cluster1, RILP,
HLA-DRA, HLA-DMB, HLA-DMA in cluster 2, and SFRP4, FZD6, DKK3 in
cluster 3. These genes were considered to be key genes.
Figure 6.
[109]Figure 6
[110]Open in a new tab
The 45 hub genes sub-network constructed in Cytoscape 3.7.2 software.
Yellow boxes were key genes in cluster 1; green boxes were key genes in
cluster 2; red boxes were key genes in cluster 3; blue boxes were genes
not clustered.
Table 1.
The list of key genes.
Gene symbol MCODE score Official full name
Cluster 1 FMOD 3 Fibromodulin
EFEMP1 3 EGF containing fibulin extracellular matrix protein 1
LOXL1 2.7 Lysyl oxidase like 1
BGN 2.7 Biglycan
FBLN5 2.4 Fibulin 5
ELN 2.4 Elastin
Cluster 2 RILP 3.0 Rab interacting lysosomal protein
HLA-DRA 3.0 Major histocompatibility complex, class II, DR alpha
HLA-DMB 3.0 Major histocompatibility complex, class II, DM beta
HLA-DMA 3.0 Major histocompatibility complex, class II, DM alpha
Cluster 3 SFRP4 2.0 Secreted frizzled related protein 4
FZD6 2.0 Frizzled class receptor 6
DKK3 2.0 Dickkopf WNT signaling pathway inhibitor 3
[111]Open in a new tab
Validation of key genes in [112]GSE120521 and human CEA plaques
To further validate the key genes, we detected the expression of the 13
key genes in the microarray profile [113]GSE120521, and in the human
CEA plaques (10 stable and 10 ruptured). In [114]GSE120521, EFEMP1,
BGN, ELN, FMOD, DKK3, FBLN5, and FZD6 were decreased in ruptured
plaques, while HLA-DRA, HLA-DMB, HLA-DMA, and RILP were increased.
(Fig. [115]7A). Similar results were found in human CEA plaques
(Fig. [116]7B).
Figure 7.
[117]Figure 7
[118]Open in a new tab
The expression of 13 key genes in [119]GSE120521 and human CEA plaques.
(A) the expression of 13 key genes in [120]GSE120521. Black round:
stable plaques; black square: ruptured plaques; n = 4 for both stable
and ruptured plaques. (B) the expression of 13 key genes in human CEA
plaques. n = 10 for both stable and ruptured plaques. *P < 0.05,
**P < 0.01 vs Stable group.
Discussion
In the present study, six modules were detected by WGCNA using the
dataset [121]GSE41571. Among the six modules, blue and brown modules
were associated with the rupture of atherosclerotic plaques
significantly. 72 hub genes were screened out from the blue and brown
modules. After mapping the 72 hub genes to the DEGs, 45 genes were
found and used to construct a sub-network. Eventually, 13 key genes
were identified. The expression levels of EFEMP1, BGN, ELN, FMOD, DKK3,
FBLN5, FZD6, HLA-DRA, HLA-DMB, HLA-DMA, and RILP were significantly
changed in ruptured plaques. These genes may play critical roles in the
stability of atherosclerotic plaques.
The stability of atherosclerotic plaques is now considered to be
essential in the clinical events of atherosclerosis. Studies have shown
the abundant presence of inflammatory cells, such as monocyte-derived
macrophages and T-lymphocytes at the site of ruptured plaques. The
mediators secreted by the inflammatory cells lead to the activation and
proliferation of smooth muscle cells, lesion progression and matrix
degradation of plaque fibrous cap^[122]13. Therefore, the active
inflammation and matrix degradation might be the two most crucial
processes that contribute to the rupture of plaques. However, the
molecular mechanisms are still unclear.
Previous studies compared the gene expression profile between unstable
plaques and normal tissues^[123]14. They screened out the DEGs from
[124]GSE41571, [125]GSE118481, and E-MTAB-2055, and analyzed the
functions and mechanisms of these DEGs by GO and KEGG enrichment. The
genes COL1A2, ADCY3, CXCR4, TYROBP, and IGFBP6 were thought to play
roles in atherosclerotic plaques^[126]14. Another research analyzed the
DEGs in [127]GSE41571^[128]10. COL3A1, COL1A2, ASPN, PPBP were thought
to probably play important roles in the rupture of atherosclerotic
plaques^[129]10. Our research used the WGCNA to construct the
co-expression network. The modules with high association with the
rupture of plaques were identified first. And then we mapped the hub
genes to DEGs. The key genes screened out by this method were based on
not only the expression level, but also the ruptured traits of the
plaques.
In the present study, we identified six modules through WGCNA analysis.
Among these six modules, blue and brown modules were with the highest
correlations with plaque traits (with a correlation of 0.82 and −0.9
respectively); 72 hub genes were identified in these two modules.
Further GO enrichment of the 72 hub genes from blue and brown modules
indicates these genes be mostly related to cell adhesion, extracellular
matrix organization, cell growth, cell migration, and leukocyte
migration. These five processes are related to the key pathogenic steps
of atherosclerosis progression and plaque rupture. In KEGG enrichment,
terms of PI[3]K-Akt signaling, focal adhesion, ECM-receptor interaction
were enriched. Focal adhesion, also named cell-matrix adhesion, is a
structure which assembles the extracellular matrix (ECM) and
interacting cells, through which regulatory signals are
transmitted^[130]15. The interaction between ECM and cells is also
mediated by some transmembrane receptors, such as integrins and CD36.
The cell adhesion, migration, differentiation, proliferation and
apoptosis, and the component of ECM are controlled by the interactions
between ECM and cells. PI[3]K-Akt signaling is an important signaling
pathway, which is included in variable processes such as protein
synthesis, cell survival, apoptosis, proliferation, metabolism, etc.
Taking all these together, the hub genes may affect the signal
transduction and subsequent cell adhesion, migration, proliferation,
and ECM degradation, which eventually lead to the progress and rupture
of plaques.
Further analysis discovered 13 key genes. Among them, EFEMP1, BGN, ELN,
FMOD, DKK3, FBLN5, and FZD6 were decreased in ruptured plaques, while
HLA-DRA, HLA-DMB, HLA-DMA, and RILP were downregulated.
HLA-DRA, HLA-DMB, and HLA-DMA belong to the major histocompatibility
complex, class II. They participate in the adaptive immune response and
T cell receptor signaling pathway. They are binding peptides derived
from antigens that access the endocytic route of antigen presenting
cells (APC) and presents them on the cell surface for recognition by
the CD4 T-cells^[131]16. The site of intimal rupture or erosion of
thrombosed coronary atherosclerotic plaques were always characterized
by abundant expression of HLA antigens on both inflammatory cells and
adjacent smooth muscle cells, suggesting an active inflammatory
reaction^[132]17. Our present research identified HLA-DRA, HLA-DMB, and
HLA-DMA are key genes for the rupture of the plaques. The mechanisms
may relate to the inflammation in atherosclerotic plaques. RILP encodes
Rab interacting lysosomal protein. It is involved in the regulation of
lysosomal morphology and distribution^[133]18. It promotes the
centripetal migration of phagosomes and the fusion of phagosomes with
the late endosomes and lysosomes^[134]18. However the mechanisms of
RILP on atherosclerosis are still unknown. Based on the results of our
present study, RILP might be a new target for the pathogenesis of
atherosclerosis. BGN encodes biglycan. It is involved in collagen fiber
assembly^[135]19. FMOD encodes fibromodulin. It affects the rate of
fibrils formation and has a primary role in collagen
fibrillogenesis^[136]20. FBLN5 encodes fibulin 5, while ELN encodes
elastin. Elastin is a major structural protein of aorta^[137]21. It
controls the stabilization of arterial structure by regulating
proliferation and organization of vascular smooth muscle^[138]22. FBLN5
is essential for elastic fiber formation. It is involved in the
assembly of continuous elastin polymer and promotes the interaction of
microfibrils and ELN^[139]21. These genes might contribute to the
pathogenesis of atherosclerosis via affecting the collagen and the
fibrous cap stability. FZD6 encodes the protein frizzled-6, a 7
transmembrane protein receptor for Wnt4^[140]23,[141]24. FZD6 mediates
the functions of Wnt signaling by both canonical and non-canonical
pathways, but mainly through non-canonical pathway. The activation of
Wnt4/FZD6 involves in the different diseases such as cancer, embryonic
development, bone marrow mesenchymal stem cell
dysfunction^[142]23,[143]25,[144]26. EFEMP1 encodes EGF containing
fibulin extracellular matrix protein 1. EFEMPL binds to EGFR and play
roles in cell adhesion and migration^[145]27. However, the roles of
FZD6 and EFEMP1 in atherosclerosis are still unclear. Our present study
indicates FZD6 might be a novel mechanism of atherogenesis. DKK3
encodes protein dickkopf homolog 3, a secreted glycoprotein belongs to
the dickkopf family. In the cardiovascular system, DKK3 is involved in
the regulation of cardiac remodeling and vascular smooth muscle
differentiation^[146]28,[147]29. In atherosclerotic mice, DKK3
deficiency caused the acceleration of atherosclerosis and led to the
vulnerable atherosclerotic plaques due to the reduction of the number
of SMCs and matrix protein deposition^[148]23,[149]30. Further
mechanism study on DKK3 is needed.
Our study still has some limitations. According to FAQ’s of WGCNA
package, no less than 15 samples are recommended. However, in our case,
only 11 samples were analyzed. Therefore, we need to be aware of
several problems with small sample size, such as poorer fit to
scale-free topology, and the resulted network may be too noisy to be
biologically meaningful. And another limitation is the lack of clinical
trait data in the involved GEO datasets. These may affect the precision
of present results on certain subgroup of patients.
Overall, in our study, we used the WGCNA to identify the key genes for
the ruptured traits of atherosclerotic plaques. 13 key genes were
screened out. These genes may play crucial roles in inflammation,
fibrous cap formation and degradation, and lead to the rupture of
atherosclerotic plaques.
Materials and Methods
Data collection and preprocessing
The gene expression profiles of [150]GSE41571 and [151]GSE120521 were
downloaded from the Gene Expression Omnibus (GEO,
[152]https://www.ncbi.nlm.nih.gov/geo/). The [153]GSE41571 was a
genome-wide expression profile of macrophage-rich regions of 5 ruptured
and 6 stable human atherosclerotic plaques. They were obtained from
carotid endarterectomy operation. The clinical, radiological and
histopathological criteria were used to evaluate the plaques as stale
or ruptured. Laser micro-dissection was used to obtain the
macrophage-rich regions. After amplification of total RNA of these
samples, the transcriptional profiling was performed using [154]GPL570
platform. The [155]GSE120521 was RNA-seq profile of 4 stable and 4
ruptured human atherosclerotic plaques. The platform of [156]GPL16791
Illumina HiSeq. 2500 was used.
The preprocessing of the two series matrix profiles was conducted on R
statistical software version 3.4.2 with related packages from
Bioconductor ([157]www.bioconductor.org). After background correction,
the original data were quantile normalized using Robust Multi-array
Average (RMA) algorithm in the affy package in Bioconductor.
DEGs identification
The two series matrix files were annotated with official gene symbols
using the platform file and annotation package in the R software. The
“LIMMA” package in the R was used to identify the DEGs. Genes with a
logFC > 1.0 and p-value < 0.05 were considered to be DEGs. The heatmap
of these DEGs were conducted using software “MeV”. “ggplot2” package in
R software was used to produce the volcano plot. All the packages used
in R software were downloaded from Bioconductor
([158]http://www.bioconductor.org).
Co-expressed module identification by WGCNA
The WGCNA method was used to construct the co-expression network and to
identify the co-expressed modules. The WGCNA was conducted in the R
software using WGCNA package. The [159]GSE41571 dataset includes 8413
genes. Among them, the top 50% most variantly expressed genes (4207
genes) were used for further WGNCA analysis. The WGCNA was conducted as
previously reported^[160]31. Briefly, the similarity matrix S[ij] was
constructed by calculating the Pearson’s correlation coefficient
between each pair of genes i and j. The adjacency matrix A[ij] was
obtained by calculating the connection strength between gene i and j.
The formulas were as follows:
[MATH:
(Sij)si
gned=|(1+cor(Xi,Y
j))/2| :MATH]
[MATH:
Aij=power(
mo>Sij
,β) :MATH]
Where Xi and Yj were vectors of expression values and Sij was the
Pearson’s correlation coefficient of gene i and j; A[ij] represented
the network strength between gene i and j. A power of β = 9 was
selected in the present study as the soft-thresholding parameter to
ensure a scale-free network.
To identify the co-expressed gene modules, the adjacency metrix was
changed to topological matrix by the method of TOM. The formula for TOM
was as follows:
[MATH:
TOMij<
/mi>=∑K
=1N<
mi>Aik⋅Akj+
mo>Aijmin(Ki,K<
mrow>j)+1−Aij :MATH]
With the minimum gene group size set as 20 for the gene dendrogram, the
TOM-based dissimilarity measure was adopted to perform the average
linkage hierarchical clustering. Using the Dynamic Tree Cut algorithm,
genes were categorized into the same gene modules if they had similar
expression profiles.
Relations between identified modules and plaque straits
To identify the gene modules which were highly related to the ruptured
traits of atherosclerotic plaques, module eigengenes (MEs) were firstly
identified. MEs were principal components of a gene module. The
expression of MEs represented all genes in the given module. The
clinically significant modules were identified by calculating the
correlation between MEs and ruptured traits of plaques. The gene
significance (GS) and module significance (MS) were also calculated to
identify the relations between gene expression and ruptured traits of
plaques. Once the linear regression model of ruptured traits of plaques
vs. gene expression was obtained, the GS was then defined as the
mediated P-value of each gene (GS = lgP) in the model. The MS was
defined as the average GS of all the genes involved in the module. MS
was measured to incorporate clinical information into the coexpression
network.
Hub genes identification
There is a tendency that, in a given module, the hub genes have strong
correlations with certain clinical traits. To identify the hub genes of
a module, the correlation between ruptured traits of plaques and gene
expression were measured by the absolute value of Pearson’s correlation
(cor.gene TraitSignificance>0.2). The inter-module connectivity of
genes was also measured by the absolute value of the Pearson’s
correlation. Genes with cor.geneModuleMembership>0.9 were considered to
be hub genes.
Gene ontology and pathway enrichment analysis
The DAVID Functional Annotation Bioinformatics Microarray Analysis
(website:[161]https://david.ncifcrf.gov/) was used for the GO
enrichment and KEGG pathway analysis. The GO enriched the genes into 3
terms: Molecular Function (MF), Biological Process (BP), and Cellular
Component (CC). A q-value<0.05 was considered to be statistically
significant for the correlations. Through GO and KEGG enrichment, the
possible functions or biological processes of the genes in a given
module were predicted. The Bubble Charts were conducted using the
OmicShare tools, a free online platform for data analysis
([162]http://www.omicshare.com/tools). The RichFactor was calculated by
“counts”/“pop hits”. The “counts” is the number of hub genes enriched
in a certain term. The “pop hits” is the number of all genes enriched
in a certain term.
Protein-protein interaction network integration
The STRING online tool (website: [163]https://string-db.org/) was used
for the identification of protein-protein interactions. The STRING
database covers the number of 9 643 763 proteins from 2 031 organisms.
It provides direct (physical) interactions and indirect (functional)
associations; they stem from computational prediction, from knowledge
transfer between organisms, and from interactions aggregated from other
(primary) databases. The main databases used for STRING are the Genomic
Context Predictions, the High-throughput Lab Experiments, the
(Conserved) Co-Expression, the Automated Textming, and the Previous
Knowledge in Databases.
Key gene identification and validation
To figure out the key genes, we firstly mapped the hub genes to the
DEGs and obtained the logFC and p-value of each hub gene. Then we
screened out the hub genes with logFC>1.0 and p-value <0.05. These
genes were used to construct the sub-network using Cytoscape 3.7.2
software. The plugin “MCODE” tool in the “Apps” of the Cytoscarpe was
used to calculate the most significant clusters of genes. The genes in
the clusters were considered to be key genes.
To validate the key genes, the expression profile of dataset
[164]GSE120521 was downloaded from GEO, and the key genes expression
were analyzed.
For further validation, human CEA plaques were used. A total of 10
patients with stable plaques and 10 patients with ruptured plaques were
involved. All patients were performed CEA from the First Hospital of
Jilin University (Changchun, Jilin, China) from July, 2019 to November,
2019. The procedures were approved by the Ethics Committee of the First
Hospital of Jilin University (No.2019-272, Changchun, Jilin). Written
informed consent was obtained from every participant.
The surgical specimens were obtained and fast-frozened in liquid
nitrogen, then they were stored at −80 °C until use. The evaluation of
stable and unstable plaques was conducted according to the
classification defined by the American Heart Association (AHA) and
previous report^[165]32. The type I/II plaques are with near-normal
wall thickness but no calcification; the type III plaques are small
eccentric plaque, or plaque with diffuse intimal thickening, but no
calcification; the type IV/V plaques has lipid or necrotic core, which
is surrounded by fibrous tissue, and with possible calcification; the
type VI are complex plaque with possible surface defect, haemorrhage or
thrombus; the type VII are plaques with obvious calcification; the type
VIII are fibrotic plaque with possible small calcification, but without
lipid core^[166]32. In the present study, the type I-II, III, VII, VIII
were considered stable plaques, while type IV-V, VI to be ruptured
plaques. The classification of carotid artery was performed by two
independent investigators.
The expressions of key genes in these plaques were analyzed using qPCR
as previously reported by our lab^[167]33. Briefly, the total RNA was
extracted by Trizol. After reverse transcripted to cDNA, the qPCR
reaction was conducted using the SYBR Green Premix DimerEraser Kit
(TaKaRa Bio Inc., Dalian, China) on ABI QuantaStudio5 system. The qPCR
program was: 95 °C for 30 s, followed by 40 cycles of 95 °C 5 s, 60 °C
30 s. The primers used were listed in Table [168]S3. All samples were
run in triplicate. The results were calculated by the 2^(−ΔΔCq)
Method^[169]34.
Acknowledgements