Abstract
Cardiac-cerebral vascular disease (CCVD) is primarily induced by atherosclerosis and is a leading cause of mortality. Numerous studies have attempted to clarify the molecular mechanisms of atherosclerosis; however, its pathogenesis has yet to be completely elucidated. Two expression profiling datasets, GSE43292 and GSE57691, were obtained from the Gene Expression Omnibus (GEO) database. The present study identified the differentially expressed genes (DEGs) and performed functional annotation of the DEGs. Finally, an atherosclerosis animal model and a neural network prediction model were constructed to verify the relationship between the hub gene and atherosclerosis. A total of 234 DEGs were identified between the normal and atherosclerosis samples. The DEGs were mainly enriched in actin filament, actin binding, smooth muscle cells, and cytokine-cytokine receptor interactions. A total of 13 genes were identified as hub genes. Following verification in the animal model, a common DEG, tropomyosin 2 (TPM2), was identified, which was expressed at lower levels in the atherosclerosis models and samples. In summary, the DEGs identified in the present study may assist clinicians in understanding the pathogenesis governing the occurrence and development of atherosclerosis, and TPM2 exhibits potential as a promising diagnostic and therapeutic biomarker for atherosclerosis.
Keywords: cardiac-cerebral vascular diseases, atherosclerosis, differentially expressed genes, protein-protein interaction, bioinformatics analysis
INTRODUCTION
Cardiac-cerebral vascular diseases (CCVDs) have high prevalence, disability and mortality rates, and are a serious threat to human health, especially for people aged over 50 years [1]. Each year, 15 million people die from CCVDs, which are primarily induced by atherosclerosis and are the leading cause of mortality in the world [2].
Numerous studies [3–7] have demonstrated that the pathophysiological processes underlying the development of atherosclerosis are closely associated with the mutation and abnormal expression of genes, including fms-like tyrosine kinase-1 (Flt-1), tumor necrosis factor (TNF)-α, apolipoprotein A-I (apoA-I), vascular endothelial growth factor (VEGF) and angiogenin (ANG). A previous study demonstrated that low expression of Flt-1 may predict the occurrence of endothelial injury, which subsequently results in atherosclerosis [3]. The stronger the proliferative capacity of endothelial progenitor cells (EPCs), the milder the endothelial injury may be, and the overexpression of TNF-α may impair EPC development [4]. In addition, mutation of apoA-I, an anti-atherogenic gene, has been hypothesized to accelerate the apoptosis of vascular endothelial cells and downregulate the levels of eNOS and heme oxygenase-1, which culminates in atherosclerotic plaque formation [5]. It is well established that the expression of VEGF and ANG stimulates the renewal of vascular endothelial cells. Therefore, abnormal inactivation of VEGF and ANG expression may play a pivotal role in the occurrence and development of atherosclerosis [6, 7]. However, owing to the lack of timely detection, dynamic monitoring and effective control of the occurrence and progression of atherosclerotic stenosis and vulnerable plaques, the occurrence and recurrence of ischemic CCVDs still cannot be effectively controlled, which is why ischemic CCVDs are one of the principal areas of research both nationally and internationally [8]. Therefore, it is imperative to identify accurate molecular targets involved in the development and recurrence of atherosclerosis, in order to contribute to the diagnosis and treatment of atherosclerotic diseases.
Since the beginning of the 21st century, bioinformatics technology has been increasingly used to identify potential genetic targets of diseases, which has assisted researchers in identifying the differentially expressed genes (DEGs) and underlying pathways that are associated with the occurrence and recurrence of atherosclerosis [9–12]. However, it is difficult to obtain credible results from a single microarray analysis owing to its high false-positive rate [13]. Therefore, the present study downloaded and analyzed two human expression profiling datasets from the Gene Expression Omnibus (GEO) database, and identified the DEGs between non-atherosclerotic and atherosclerotic tissues. The molecular mechanisms of the occurrence and development of atherosclerosis were subsequently explored via enrichment analysis of functions and pathways, protein-protein interaction (PPI) network analyses and co-expression network analyses, and a total of 234 DEGs and 13 hub genes were identified. In addition, to verify the results of the bioinformatics analysis, an animal experiment with two groups (a control group and an atherosclerosis model group) was performed. The animal experiment data were analyzed using a proteomics assay, and differentially expressed proteins were identified between the control and atherosclerosis groups. After comparing the bioinformatics results with the proteomic data, tropomyosin 2 (TPM2) was identified as a common differentially expressed gene, which exhibits potential as a molecular target or biomarker for atherosclerosis. A series of low-throughput experiments were subsequently performed to verify the role and function of TPM2 in atherosclerosis.
RESULTS
Validation of the datasets
To further validate the intra-group data repeatability, we employed Pearson’s correlation test and principal component analysis (PCA).
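As an illustration of the intra-group repeatability check, the following minimal sketch computes pairwise Pearson correlation coefficients between sample expression profiles. It is not the pipeline used in the study: the function names (`pearson_r`, `correlation_matrix`) and the toy expression values are hypothetical, and PCA is not covered.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (>= 2)")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def correlation_matrix(samples):
    """Map every ordered pair of sample names to its Pearson r."""
    names = list(samples)
    return {(a, b): pearson_r(samples[a], samples[b])
            for a in names for b in names}


# Hypothetical toy expression values (4 genes per sample), not study data.
toy = {
    "control_1": [5.1, 7.2, 3.3, 9.0],
    "control_2": [5.0, 7.5, 3.1, 8.8],
    "athero_1": [8.9, 3.0, 7.4, 4.2],
}
matrix = correlation_matrix(toy)
```

In this toy example the two control profiles correlate strongly with each other and negatively with the atherosclerosis profile, which is the pattern a heat map of such a matrix would make visible.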
Based on Pearson’s correlation test, there were strong correlations among the samples in the control group and among the samples in the atherosclerosis group in the GSE43292 dataset (Figure 1A). Based on the PCA, the intra-group data repeatability for GSE43292 was acceptable: the distances between samples in the control group were close, and the distances between samples in the atherosclerosis group were also close, in the dimension of principal component 1 (PC1) (Figure 1B). For GSE57691, Pearson’s correlation test likewise showed a strong correlation among the samples in the control group and a strong correlation among the samples in the atherosclerosis group (Figure 1C). The PCA showed the intra-group data repeatability to be acceptable in the GSE57691 dataset: the distances between samples in the control group were close, and the distances between samples in the atherosclerosis group were also close, in the dimension of PC1 (Figure 1D).
Figure 1. (A) Pearson’s correlation analysis of samples from the GSE43292 dataset. The color reflects the intensity of the correlation. When 00.4 [45].
A score > 0.4 is an important constraint on the network; if the score threshold were changed, the PPI network might differ. Cytoscape (version 3.6.1), a free visualization software, was used to visualize the PPI networks [54, 55]. Molecular Complex Detection (MCODE) (version 1.5.1), a Cytoscape plug-in, identifies densely connected regions based on network topology [56]. First, Cytoscape was used to construct the PPI network map. Second, MCODE was used to identify the most important module of the network map. The criteria for the MCODE analysis were: degree cut-off = 2, MCODE score > 5, max depth = 100, k-score = 2, and node score cut-off = 0.2.
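The score threshold and the degree-based selection of hub genes can be sketched as follows. This is a simplified illustration, not the Cytoscape or MCODE implementation: the helpers `build_network` and `hub_genes`, the gene labels and the interaction scores are all hypothetical. The study selects hub genes at a degree of at least 13; the toy example uses a cutoff of 3 so that the small network yields a result.

```python
from collections import defaultdict


def build_network(edges, min_score=0.4):
    """Keep interactions with score > min_score; return adjacency sets."""
    adjacency = defaultdict(set)
    for gene_a, gene_b, score in edges:
        if score > min_score and gene_a != gene_b:
            adjacency[gene_a].add(gene_b)
            adjacency[gene_b].add(gene_a)
    return dict(adjacency)


def hub_genes(adjacency, min_degree):
    """Return genes whose number of neighbours is at least min_degree."""
    return sorted(gene for gene, neighbours in adjacency.items()
                  if len(neighbours) >= min_degree)


# Hypothetical interactions (gene_a, gene_b, score), not study data.
toy_edges = [
    ("G1", "G2", 0.90),
    ("G1", "G3", 0.75),
    ("G1", "G4", 0.41),
    ("G2", "G3", 0.35),  # dropped: below the threshold
    ("G3", "G4", 0.40),  # dropped: not strictly greater than 0.4
]
network = build_network(toy_edges)
hubs = hub_genes(network, min_degree=3)
```

Raising or lowering `min_score` changes which edges survive and therefore each node's degree, which is why the choice of threshold affects the resulting network and the hub gene list.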
Genes with a degree ≥ 13 [45] were identified as hub genes. The clustering analysis of hub genes was performed using OmicShare.
The animal model construction
A total of 20 three-month-old New Zealand white rabbits were acquired from the Institute of Laboratory Animal Sciences, Chinese Academy of Medical Sciences (CAMS) & Peking Union Medical College (PUMC). The rabbits were randomly divided into two groups. In the control group (CON group, n=10), rabbits were fed standard rabbit chow (0% cholesterol) and did not undergo abdominal aortic balloon injury. In the atherosclerosis group (n=10), the rabbits were fed a high-fat diet (6% bean oil and 1% cholesterol) for eight weeks and underwent abdominal aortic balloon injury. The Animal Care and Use Committee of the Institute of Laboratory Animal Sciences, CAMS & PUMC approved the experimental protocol. After fasting for 12 h before surgery with free access to water, the rabbits in the atherosclerosis group underwent surgery under anesthesia with 3% sodium pentobarbital solution (3 mg/kg of animal body weight; provided by the Institute of Laboratory Animal Sciences, CAMS & PUMC, Beijing, China). The surgeon opened the skin along the right femoral artery, separated the subcutaneous tissue layer by layer and freed 2-3 cm of the femoral artery. The femoral artery was then punctured, a 4F vascular sheath was inserted, the balloon was inflated and pulled back three times, the right femoral artery was ligated, and the skin was sutured. Finally, the animals received 40,000 U penicillin for 5 days after surgery. The rabbits were euthanized with 3% pentobarbital solution (300 mg/kg of animal body weight; provided by the Institute of Laboratory Animal Sciences, CAMS & PUMC, Beijing, China) and air embolization; death was confirmed by respiratory and cardiac arrest, pupil dilation, and loss of nerve reflexes. The abdominal aortic tissues were then dissected.
The protocol for establishing the animal model was developed in-house. To evaluate the atherosclerosis model, the abdominal aorta was examined using hematoxylin and eosin (HE) staining.
Proteomic analysis
Rabbit abdominal aortic tissues were ground in liquid nitrogen, lysed in protein extraction buffer (8 M urea, 0.1% SDS) containing protease inhibitor cocktail (Roche, Switzerland) and 1 mM phenylmethylsulfonyl fluoride (Beyotime Biotechnology, China) on ice for 30 min, and then centrifuged at 14,000 rpm for 15 min at 4°C. The supernatant was collected and the protein concentration was measured with a BCA assay (Pierce, USA). The lysates were stored at -80°C before further processing. Sample quality was assessed using a BCA quantification kit. iTRAQ reagents (AB Sciex, USA) with different reporter ions (113-121 Da) were used as isobaric tags for relative quantification. iTRAQ labeling was conducted according to the manufacturer’s instructions. The samples were fractionated on a C18 chromatographic column. A total of 40 fractions of labeled peptides were concatenated into 20 fractions, vacuum dried and stored at -80°C until LC-MS analysis. The LC-MS/MS analysis was performed on a Q Exactive mass spectrometer (Thermo Scientific, USA). The NCBI RefSeq rabbit protein sequence database was searched using the Sequest algorithm in Proteome Discoverer software (version 1.4) (Thermo Scientific, USA). The DEGs were then identified in our own data. The threshold for statistical significance was adj. P-value ≤ 0.01 and fold change (FC) ≥ 1.5.
The level of smooth muscle cell (SMC)
Paraffin sections were prepared from the tissue samples of the abdominal artery. The paraffin sections were deparaffinized to water, blocked with hydrogen peroxide, and then washed with double-distilled water. SMCs were detected by immunohistochemistry after antigen retrieval. Anti-SMC monoclonal antibody was used to detect macrophages.
The specific detection steps were performed according to the instructions of the VECTASTAIN Elite ABC Kit (Vector Laboratories, Burlingame, CA, USA), as follows. First, antigen-retrieved paraffin sections were washed with phosphate-buffered saline (PBS) two to three times (5 min each) and blocked with 10% goat serum (TransGen Biotech, Beijing, China) at 37°C for 20 min. Second, the serum was removed with filter paper, and YAP or TAZ rabbit polyclonal antibody (Abcam, Cambridge, UK) was added dropwise, followed by incubation overnight at 4°C. Third, the sections were washed with PBS three times (5 min each) and incubated with goat anti-rabbit monoclonal antibody at 37°C for 1 hour. Fourth, color development was performed with diaminobenzidine. Six fields of each paraffin section were photographed and counted. Immunofluorescence was also used to detect the SMC level.
RT-qPCR assay
Total RNA was extracted from atherosclerosis samples and control artery samples using the RNAiso Plus (Trizol) kit (Thermo Fisher, Massachusetts, USA) and reverse transcribed to cDNA. RT-qPCR was performed using a Light Cycler® 4800 System with specific primers for TPM2. Table 6 presents the primer sequences used in the experiments. The RQ values (2^−ΔΔCt, where Ct is the threshold cycle) of each sample were calculated and are presented as the fold change in gene expression relative to the control group. GAPDH was used as an endogenous control.
Table 6. Primers and their sequences for PCR analysis.
Primer        Sequence (5′–3′)
β-actin-hF    CGCAGAAACGAGACGAGATTG
β-actin-hR    GATGCTCGCTCCAACGACTG
TPM2-hF       TCCACCAAGGAGGACAAATACG
TPM2-hR       GTTGTTGAGTTCCAGCAGGGTC
Western blot analysis
Abdominal aortic tissues were frozen in liquid nitrogen. Total protein was isolated in a lysis buffer, resolved by 10% sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE) and transferred onto polyvinylidene fluoride (PVDF) membranes by electroblotting.
TPM2 protein was detected using an anti-TPM2 polyclonal antibody (1:500 dilution, Bioss, Beijing, China). The bands were visualized with an enhanced chemiluminescence kit (Millipore, Billerica, MA) and analyzed with Image-Pro Plus 6.0 (Media Cybernetics, USA).
The construction of neural network model
The training group was randomly divided into calibration data and training data at a ratio of 1:2.46, giving 13 samples in the calibration data and 32 samples in the training data. MATLAB (version 8.3) was used for normalization of the variable values, network simulation, network training, and network initialization. The number of neurons in the input layer equals the number of input variables, which is two. The network has one hidden layer and one output layer. The single output variable is intima-media thickness. After repeated training, at 2000 steps the gradient had fallen to 0 and the training speed was uniform. At that point, the training error was 0.0064465 and the correlation coefficient (R) reached 0.99133.
Statistical analysis
The results are presented as the mean ± standard error of the mean. When two groups were compared, an unpaired Student’s t-test was performed to determine statistical significance. Pearson’s correlation test was used to assess the correlation between the expression of hub genes and the status of atherosclerosis. When any analytic result reached a liberal statistical threshold of p < 0.2 for a comparison, the risk factor was entered into the univariable linear regression model to confirm independent risk factors for the status of atherosclerosis. Univariate and multivariate logistic regression analyses were used to calculate the odds ratios (ORs) of each hub gene expression for the status of atherosclerosis. Receiver operating characteristic (ROC) curve analysis was performed to determine the usefulness of TPM2 for predicting atherosclerosis (AS).
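To make the ROC analysis concrete, the sketch below computes the area under the ROC curve (AUC) as the fraction of case/control pairs that the marker ranks correctly. It is an illustration only, not the SPSS procedure used in the study: the function name `roc_auc` and the expression values are hypothetical. Because TPM2 is reported to be lower in atherosclerosis, the expression values are negated so that a higher score indicates disease.

```python
def roc_auc(scores, labels):
    """AUC: probability that a random positive outscores a random negative.

    Ties between a positive and a negative count as half a win.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes are required")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical relative expression values and disease status (1 = AS).
expression = [0.4, 0.5, 0.7, 0.9, 1.0, 1.1]
status = [1, 1, 1, 0, 1, 0]

# Negate so that lower expression maps to a higher risk score.
auc = roc_auc([-e for e in expression], status)
```

An AUC of 0.5 corresponds to a marker with no discriminative value, and values closer to 1 indicate better separation of the two groups.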
Pearson’s correlation test was also used to assess the correlations among intima-media thickness, the relative expression of TPM2 and the SMC level. A cubic spline interpolation algorithm was used to analyze the high-risk warning range of atherosclerosis. The statistical analyses were conducted using SPSS software, version 21.0 (IBM Corp., Armonk, NY, USA). P<0.05 was considered to indicate a statistically significant difference.
Ethics approval
All experiments were approved by the Animal Care and Use Committee of the Institute of Laboratory Animal Sciences, Chinese Academy of Medical Sciences (CAMS) & Peking Union Medical College. All institutional and national guidelines for the care and use of laboratory animals were followed.
ACKNOWLEDGMENTS