Abstract
Purpose The aim of this study was to identify novel gene-targeted therapeutic loci for abdominal aortic aneurysm (AAA), a disease characterised by inflammation and progressive vasodilatation.
Methods We applied weighted gene co-expression network analysis (WGCNA) and differential gene expression analysis to samples from the GEO database. Enrichment analysis identified the blue module as the module of interest. We then analysed immune infiltration and identified genes linked to immune evasion and mitochondrial fission. To screen for feature genes, we used two PPI network gene selection methods and five machine learning methods, which allowed us to identify the most feature genes (MFGs). The expression of the MFGs in different cell subgroups was then evaluated in single-cell samples from AAA. We also measured the expression of the MFGs and of inflammatory immune-related markers in cellular and animal models of AAA. Finally, we predicted potential drugs for the treatment of AAA.
Results We identified 1249 up-regulated and 3653 down-regulated differential genes. Through WGCNA, we found 44 genes in the blue module. Intersecting the results of the gene selection strategies produced the MFGs (ITGAL and SELL). Single-cell analysis showed that the MFGs were specifically expressed in T regulatory cells, NK cells, B lineage cells and lymphocytes. The mRNA levels of the MFGs rose in both animal and cellular models of AAA.
Conclusion We identified ITGAL and SELL as novel candidate target genes for AAA; they most likely act through T regulatory cells, NK cells, B lineage cells and lymphocytes. This analysis provides a new target for treating or preventing AAA.
Keywords: Abdominal aortic aneurysm, Immune microenvironment, Machine learning, Mitochondrial fission, Single cell analysis, Bioinformatics
1. Introduction
Abdominal aortic aneurysm (AAA) is a serious vascular disease with a prevalence of 2%–12%, and more than one third of patients die before they reach the operating room [1–3]. Men over 65 are more likely to develop AAA, which is typically asymptomatic in the early stages [4]. AAA is defined as a dilatation of the abdominal aorta with a maximum diameter of 30 mm or 1.5 times the normal diameter; it can lead to rupture and bleeding of the aorta, and the patient may die of blood loss within minutes [5,6]. The main risk factors for AAA include advanced age, low levels of HDL cholesterol, poorly controlled hypertension (HTN), coronary artery disease (CAD), smoking and genetic factors [7,8]. Currently, the treatment of choice for AAA is still open surgery or endovascular aortic repair [9]. Although these treatments offer a better prognosis, surgical and post-operative complications such as acute kidney injury and intraoperative hypertension can still harm the patient's health [10–12]. Early elective surgical repair of some small AAAs is not advantageous, so there is an unmet clinical need for pharmacologic therapies that slow or stop the progressive enlargement and rupture of small AAAs. A sizable amount of research is presently being conducted to improve the understanding of AAA etiology and ultimately to develop such therapies [13]. A growing number of studies now target biomarkers of AAA with the aim of preventing the development and re-rupture of AAA. Elucidating the underlying molecular mechanisms of AAA and finding key therapeutic targets is therefore essential for its treatment.
The most important pathological features in the development of AAA include elastin destruction, infiltration of inflammatory cells and membrane degradation [14]. Vascular smooth muscle, the primary structural element of the vascular wall, performs a variety of physiological tasks. Reactive oxygen species are produced in large quantities throughout the AAA disease process and damage the vascular smooth muscle cells. When elastin and collagen formation are reduced as a result of oxidative stress, artery walls can deteriorate and even rupture [15,16]. In addition to macrophages, inflammatory cells including lymphocytes and neutrophils have a role in the development of AAA [17]. As in atherosclerosis, neutrophils, monocytes, macrophages and dendritic cells (DCs) accumulate in aneurysms during the development of AAA [18–21]. They play an important role in the inflammation of the aorta and the associated vascular destruction. It has been demonstrated that splenic-derived monocytes promote the development of active AAA [22–24]. Amin HZ et al. found that depletion of CD11c led to a reduction in macrophages and CD4 T cells, resulting in reduced aortic inflammation and a reduced chance of AAA and aortic dissection [25]. In addition, degradation of elastin and collagen, among others, has been shown to occur along with infiltration of CD4+ T cells during AAA development [26]. However, although specific immune cell changes are known to influence AAA, there is still no precise explanation or specific mechanism that describes the process of AAA.
As the "powerhouse" of the cell, mitochondria provide the energy currency needed for cellular processes [27,28]. The morphology and function of mitochondria change dynamically, responding to the external environment through fusion, fission, and migration [29].
Among these, mitochondrial fusion and fission are crucial for the state and function of the cell [30,31]. Mitochondrial fission is a cellular stress response. It relies on dynamin-related protein 1 (Drp1) to induce the division of mitochondria, peroxisomes, and endoplasmic reticulum, allowing the cell to respond to external stimuli [32]. Studies have shown that in models of both atherosclerosis and non-atherosclerotic AAA, the Drp1 inhibitor mdivi-1 decreased the diameter of the aorta [32]. This result suggests that inhibiting Drp1 slows the progression of AAA, indicating that mitochondrial fission plays an important role in AAA [32,33]. We therefore analyzed existing data samples using bioinformatics and machine learning, and established AAA animal models for experimental validation, in order to identify new genetic markers of AAA that can provide new research ideas for the treatment and intervention of the disease.
2. Materials and methods
2.1. Patients and datasets from GEO and analysis of variance
Clinical information on abdominal aortic aneurysms and an expression profile matrix of samples were downloaded from the Gene Expression Omnibus (GEO) database (https://www.ncbi.nlm.nih.gov/geo/). Of these, GSE57691 was the experimental cohort. After normalization of the data, differential expression analysis was performed using the R package "limma". Genes were considered significantly differentially expressed (DEGs) when they satisfied |log2FC| > 1.0 and p-value < 0.05.
2.2. Weighted gene co-expression network analysis of up-regulated DEGs
The up-regulated gene matrix was extracted and analyzed by weighted gene co-expression network analysis (WGCNA) according to the method in the literature [34,35]. In brief, the analysis mainly relies on the R packages "WGCNA", "reshape2", and "stringr".
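
As an illustration of the DEG criterion stated in Section 2.1 (|log2FC| > 1.0 and p-value < 0.05), the following minimal Python sketch labels genes as up-regulated, down-regulated or not significant. It is not the authors' limma pipeline: the function name `classify_deg`, the gene names and the numeric values are all hypothetical and serve only to show how the two thresholds combine.

```python
def classify_deg(log2fc, p_value, fc_cutoff=1.0, p_cutoff=0.05):
    """Label a gene 'up', 'down' or 'ns' using |log2FC| > fc_cutoff and p < p_cutoff."""
    if p_value < p_cutoff and abs(log2fc) > fc_cutoff:
        return "up" if log2fc > 0 else "down"
    return "ns"


# Hypothetical limma-style results: gene -> (log2FC, p-value)
results = {
    "GENE_A": (1.8, 0.001),   # large positive change, significant
    "GENE_B": (-2.3, 0.020),  # large negative change, significant
    "GENE_C": (1.5, 0.300),   # large change, but not significant
    "GENE_D": (0.4, 0.001),   # significant, but change too small
}

labels = {gene: classify_deg(fc, p) for gene, (fc, p) in results.items()}
up_genes = sorted(gene for gene, label in labels.items() if label == "up")
print(labels)
print(up_genes)
```

Both conditions must hold at once, so a gene with a large fold change but a non-significant p-value (or the reverse) is not counted as a DEG.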
The median absolute deviation (MAD) of each gene was calculated, and the 50% of genes with the smallest MAD were filtered out. After determining the soft threshold (β = 24) and R2 = 0.87, the adjacency relation was transformed into a topological overlap matrix (TOM), and the corresponding dissimilarity (1-TOM) was calculated at the same time. Setting the minimum module size of the gene tree to 30 and the sensitivity to 3, and merging modules with a distance of less than 0.25, we clustered genes into different modules. The module eigengene (ME) represents the gene expression profile of the whole module and is used to describe the expression pattern of the module in each sample. Module membership (MM) refers to the correlation coefficient between a given gene and a given ME, and describes the reliability of a gene's assignment to a module.
2.3. Functional and pathway enrichment analysis
To further explore the potential biological functions of the selected module, functional enrichment analysis (Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways) of genes in key modules was performed using the R package "clusterProfiler" [36]. GO was used to describe the molecular function (MF), cellular component (CC), and biological process (BP) of genes, and KEGG pathways were used to analyze metabolic pathways in organisms and to understand the interactions between genes in biological systems. We calculated the enrichment score for each sample in the expression matrix using Gene Set Variation Analysis (GSVA) and ranked the genes according to the enrichment score. Relevant pathways and molecular mechanisms were then predicted from the gene expression profiles.
2.4. Immune infiltration analysis
To investigate the changes in infiltrating immune cells in AAA, we used the single-sample gene set enrichment analysis (ssGSEA) method from the R package GSVA, the Microenvironment Cell Populations-counter (MCP-counter) method and CIBERSORT to comprehensively investigate the abundances of infiltrating immune cells in the AAA and Control groups [37–39]. ssGSEA is an extension of GSEA. The enrichment score (ES) of each immune cell gene set was calculated in each sample to show the differences in 28 immune cell types between samples. MCP-counter uses transcriptome data to quantify, for each sample of a gene expression matrix, the absolute abundance of eight immune cell populations and two stromal cell populations: T cells, CD8 T cells, cytotoxic lymphocytes, B lineage, NK cells, monocytic lineage, myeloid dendritic cells, neutrophils, endothelial cells and fibroblasts. These scores were used to directly compare the abundance of the corresponding cell type across samples in the cohort.
2.5. Machine learning screening of feature gene
After WGCNA and gene enrichment analysis, we selected the blue module as the best module and its module genes as the key genes. The least absolute shrinkage and selection operator (LASSO) is a regression technique that performs variable selection and regularization to improve the predictive accuracy and interpretability of a statistical model. LASSO analysis was performed using the R package "glmnet" (version 4.1.7) [40]. Support vector machines (SVM) are machine learning algorithms that establish a boundary between two categories and make predictions based on one or several feature vectors [41–43].
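
The variable-selection behaviour of LASSO described above can be illustrated with a small coordinate-descent sketch in plain Python. This is not glmnet and not the authors' analysis: the function names, the toy design matrix and the penalty value are assumptions chosen for illustration. The sketch minimises the usual LASSO objective, (1/2n)·||y − Xβ||² + λ·||β||₁, and shows that a feature carrying no information about the response receives a coefficient of exactly zero.

```python
def soft_threshold(value, lam):
    """Shrink value towards zero by lam; values within [-lam, lam] become exactly 0."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=100):
    """Minimise (1/2n)*||y - X b||^2 + lam*||b||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0
            for i in range(n):
                # Prediction from all features except j (partial residual)
                pred_without_j = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
            rho /= n
            z = sum(X[i][j] ** 2 for i in range(n)) / n
            beta[j] = soft_threshold(rho, lam) / z
    return beta


# Hypothetical toy data: two centred features; only the first one drives y.
x1 = [-2.0, -1.0, 0.0, 1.0, 2.0]
x2 = [1.0, -1.0, 0.0, -1.0, 1.0]
X = [[a, b] for a, b in zip(x1, x2)]
y = [3.0 * a for a in x1]

beta = lasso_coordinate_descent(X, y, lam=0.5)
print(beta)
```

In this toy example the informative coefficient is shrunk below its unpenalised value of 3, while the uninformative one is dropped, which is the property that makes LASSO usable as a feature-gene filter.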
We also used the SVM-RFE algorithm from the R package "e1071" (version 1.26.1) with 10-fold cross-validation repeated 5 times, choosing the "random" repeated cross-validation method and SIZE = 1:10, to extract the genes with higher variable importance. Random forests (RF) can predict continuous variables, with no restrictions on those variables, and give predictions with almost no significant fluctuations [44]. Random forest analysis used the R package "randomForest" (version 4.7.1.1). The random forest method evaluates genes using IncMSE and IncNodePurity, of which IncNodePurity was the more informative. K-Nearest Neighbor (KNN) is a classification method that assigns a point to the class most prevalent among the k points closest to it [45]. Extreme Gradient Boosting (XGBoost) is flexible and scalable, and combines a number of machine learning models to achieve better learning outcomes; it was run with the XGBoost package (version 1.4.1.1) [46]. LASSO, SVM-RFE, RF, KNN and XGBoost were each used to predict and filter the blue module genes, selecting the most feature genes. All parameters were set to the R package defaults and adjusted according to the cited references. In addition, the results of all screening techniques were required to be statistically significant.
2.6. Construction of protein-protein interaction (PPI) network and screening of hub genes
A total of 4401 mitochondrial fission-related genes were retrieved from the GeneCards website (https://www.genecards.org/). GeneCards uses Relevance Scores to measure the relevance of a gene to a phenotype. After the module genes were screened out from the blue module, they were imported into the STRING database (https://cn.string-db.org/) to obtain the protein-protein interaction (PPI) network.
The TSV file was imported into Cytoscape (version 3.9.1), and key genes were selected through two Cytoscape plugins (Cytohubba and MCODE). Cytohubba selection used the maximal clique centrality (MCC) algorithm, while MCODE selection used degree cutoff = 2 and max depth = 100. Hub genes were obtained by combining the four filtering results from machine learning, the mitochondrial fission-related genes and the results from the two Cytoscape plugins in a Venn diagram. The ROC curves of the hub genes were plotted with the R package "pROC" to assess their diagnostic value.
2.7. Annotation and analysis of single cell types
We randomly divided the six samples of the dataset GSE166676 into two groups: AAA1 (GSM5077727, GSM5077728, GSM5077730 and GSM5077731) and AAA2 (GSM5077729 and GSM5077732). We first evaluated the dataset by QC assessment. We retained cells expressing 250–2500 genes and filtered out cells with >10% mitochondrial genes or >3% erythrocyte genes. Afterwards, we applied the harmony method, identified 2000 highly variable features (HVGs) for analysis and set the number of principal components (PCs) to 20 to generate cell clusters, which were then visualized. The "Seurat" R package was used to find the top 10 most representative HVGs for each cluster, and 18 cell subgroups were finally identified (Supplementary Table 1). We then used the "SingleR" R package to annotate the top 10 genes; the more genes that corresponded to a cell type, the closer the subgroup was to that cell type. Afterwards, we extracted and visualized the expression of the feature genes in the different cell subpopulations.
2.8. Establishment of the AAA model
Male ApoE−/− mice were purchased from Charles River, Beijing and housed in the Animal Observation Unit of Xiamen University School of Medicine. All mice were housed at approximately 50% relative humidity, constant temperature (22–25 °C) and standard light conditions (12 h/12 h light/dark cycle) and were allowed to eat and drink freely. The whole procedure was approved by the Institutional Animal Care and Use Committee of Xiamen University and was in accordance with the Guide for the Care and Use of Laboratory Animals published by the Ministry of Health of the People's Republic of China and the National Institutes of Health. Male ApoE−/− mice (approximately 6 months old) were anesthetized with sodium pentobarbital (50 mg/kg) and implanted with an osmotic micropump (Alzet, #2004, Durect Corporation, USA). The pumped reagent was Ang II (A9525, Sigma, USA), delivered at a rate of 1000 ng/kg/min for 4 weeks. The control group was pumped with saline. Four weeks after surgery, mice were anesthetized with sodium pentobarbital (50 mg/kg) and euthanized to obtain the abdominal aorta for analysis.
2.9. H&E staining and Elastin van Gieson (EVG) staining
HE staining and EVG staining of the removed abdominal aortic samples were performed as follows. Dewaxing: xylene I (10023418, SCRC, China) for 20 min; xylene II for 20 min; 100% ethanol I (100092683, SCRC, China) for 5 min; 100% ethanol II for 5 min; 75% ethanol for 5 min; rinsing with tap water. Sections were stained with Hematoxylin solution (G1003), then treated with Hematoxylin Differentiation solution and rinsed with tap water. This was followed by 85% ethanol for 5 min and 95% ethanol for 5 min. Finally, sections were stained with Eosin dye for 5 min and sealed with neutral gum (10004160, SCRC, China) [47,48], followed by microscopic observation and image acquisition.
2.10. Enzyme-linked immunosorbent assay
The serum levels of the cytokines IL-6 (cat. no. 88-7064-86) and IL-1β (cat. no. 88-7013-86) were measured using mouse ELISA kits (eBioscience, San Diego, CA, United States), and the specific experimental steps were performed according to the kit instructions.
2.11. Cell culture and stimulation
Mouse aortic smooth muscle cells (MOVAS) were purchased from the American Type Culture Collection (ATCC). Cells were cultured in high-glucose DMEM supplemented with 1% penicillin-streptomycin and 10% fetal bovine serum. Cells were spread evenly in 6-well plates and stimulated with TNF-α (100 ng/ml) per well for 24 h. Subsequently, the cells were collected for further assays.
2.12. Extraction of sample RNA and RT-PCR
The abdominal aortic aneurysm tissue and cells were ground up, 1 ml TRIzol (Takara, China) was added, and the samples were homogenised using a homogeniser. Then 0.2 ml of chloroform was added, and the samples were shaken vigorously for 15 s, left at room temperature for 5 min and centrifuged at 10,000×g for 15 min at 2–8 °C. The sample separated into three layers: a yellow organic phase at the bottom, a colorless aqueous phase at the top and an intermediate layer. The RNA was mainly in the aqueous phase, which was approximately 60% of the volume of TRIzol reagent used. The aqueous phase was transferred to a new tube, and the RNA was precipitated with isopropanol: 0.5 ml of isopropanol was added for every 1 ml of TRIzol used, and the sample was left at room temperature for 10 min and centrifuged at 14,000×g for 10 min at 4 °C. No RNA precipitate was visible before centrifugation, but it appeared after centrifugation. The supernatant was removed and the RNA precipitate was washed with 1 ml of 75% ethanol. The sample was centrifuged at 14,000×g for 5 min at 4 °C and the supernatant was discarded. The RNA was reverse-transcribed and analyzed by real-time PCR (LC480, USA). The primer sequences are shown in Supplementary Table 2.
2.13. Western blot
Tissues were processed with RIPA lysis buffer (Beyotime, China), and protein concentrations were measured with a bicinchoninic acid (BCA) protein kit (Abcam, United States). The proteins were separated by sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE) at 8% or 12% and transferred to a polyvinylidene fluoride (PVDF) membrane (Beyotime, China). After being blocked with 5% skim milk for 1 h, the PVDF membrane was incubated overnight at 4 °C with the following primary antibodies: GAPDH (1:10,000, Cat. No. 60004-1-Ig, Proteintech), α-SMA (1:1,000, Cat. No. 14395-1-AP, Proteintech) and Calponin (1:1,000, Cat. No. ab46794, Abcam) [49,50].
2.14. Predicting AAA-targeted drugs
Connectivity Map (Cmap) (https://clue.io/) is a gene expression database that uses different interferents (including small molecules) to detect differences in cellular gene expression after treatment of human cells, thus creating a database of interconnections between interferents, gene expression and disease [51,52]. Cmap originally recorded the gene changes caused by different drugs stimulating different cells. By submitting a filtered list of differentially expressed genes, a standardized connectivity score (CS) can be calculated and used to find the corresponding drugs in the original database [53,54]. We imported the blue module differential genes into Cmap, screened for suitable drugs by computation, and visualized and analyzed the results in R.
2.15. Statistical analysis
All results are presented as the mean ± standard deviation (SD) and were analyzed using GraphPad Prism (version 8.0.1). Differences between samples were evaluated using an unpaired two-tailed Student's t-test or the rank sum test. A value of p < 0.05 was considered significant. All experiments were repeated at least three times.
3. Results
3.1. DEG search and WGCNA analysis
The overall workflow is shown in Supplementary Fig. 1. GSE57691 samples were extracted from the database and divided into Control and AAA groups, and we obtained 1249 up-regulated and 3653 down-regulated DEGs with the R package "limma" (|log2FC| > 1.0 and p-value < 0.05) (Fig. 1A and B). We extracted the expression profile matrix of the 1249 up-regulated DEGs from the experimental cohort to construct the up-regulated gene matrix and performed WGCNA on it. To make the network scale-free, we chose the soft threshold (β = 10) and R2 = 0.87 (Fig. 1C and D), and the adjacency relation was transformed into a topological overlap matrix (TOM). Cluster analysis was performed on the gene expression profiles of the AAA group and the Control group (Fig. 1E). Using average hierarchical clustering and dynamic tree cutting, we divided genes with similar expression profiles into 5 gene modules (Fig. 1F). We screened the module genes with an MM threshold of 0.8, a GS threshold of 0.1 and a weight threshold of 0.1, and finally obtained 44 blue module genes, 93 brown module genes, 18 green module genes, 14 yellow module genes and 1 grey module gene (Supplementary Table 3). In addition, we plotted a heat map of module eigengene clustering and investigated the relationship between modules and clinical information (Fig. 1G). We found that the blue (correlation score (cor) = 0.45 and P = 3.5e-4), yellow (cor = 0.47 and P = 1.7e-16), and green modules (cor = 0.45 and P = 3e-4) were highly correlated with AAA (Fig. 1H).
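
The module–trait correlation scores reported above can be understood as a correlation between each module eigengene and a binary disease label. The sketch below assumes a Pearson correlation, which is the WGCNA default but is not stated explicitly in the text; the eigengene values and the 0/1 encoding of Control and AAA samples are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Hypothetical module eigengene values for six samples
eigengene = [-0.4, -0.3, -0.2, 0.2, 0.3, 0.4]
# Hypothetical trait encoding: 0 = Control, 1 = AAA
trait = [0, 0, 0, 1, 1, 1]

cor = pearson(eigengene, trait)
print(round(cor, 3))
```

The same function applied to a single gene's expression and the module eigengene gives that gene's module membership (MM), and applied to a gene's expression and the trait gives its gene significance (GS).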
To select the final module, the correlation between module membership (MM) and gene significance (GS) was further studied in the five modules. The correlations were as follows: blue module (r = 0.07, p = 0.46), yellow module (r = −0.22, p = 0.46), grey module (r = 0.31, p = 0.02), green module (r = 0.13, p = 0.36) and brown module (r = −0.01, p = 0.88) (Fig. 1I–M).
Fig. 1. Analysis of variance and weighted gene co-expression network analysis of up-regulated DEGs. (A) Volcano map showing up-regulated genes (red), non-significant genes (grey) and down-regulated genes (blue) from the analysis of variance. DEGs with p-value < 0.05 were considered significant. (B) Heatmap clustering the DEGs of the two groups and showing the difference in expression between them. (C) The relationship between different soft-thresholding powers and the scale-free fit index. (D) The relationship between different soft-thresholding powers and the mean connectivity. (E) Clustering of samples into clinical groupings. (F) Gene clustering tree diagram; colors indicate distinct modules. (G) Clustering based on the module eigengene values. (H) Relationships between clinical traits and the five modules. (I–M) Scatterplots for the 5 modules illustrating the relationship between MM and GS.
3.2. Functional enrichment analysis
We performed a KEGG enrichment analysis on each module's genes in order to further narrow down the relevant modules.
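
Over-representation analyses of this kind are commonly based on a hypergeometric test, which asks how likely it is to see at least the observed overlap between a gene list and a pathway by chance. The sketch below is a generic illustration of that test, not a reproduction of the clusterProfiler call used in the study; the function name and every number except the module size of 44 are hypothetical.

```python
from math import comb


def enrichment_p_value(k, n, K, N):
    """Hypergeometric upper tail P(X >= k).

    N: genes in the background, K: background genes annotated to the pathway,
    n: genes in the module, k: module genes annotated to the pathway.
    """
    upper = min(n, K)
    total = comb(N, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / total


# Hypothetical example: a 44-gene module, a 100-gene pathway,
# a 20000-gene background and 6 module genes found in the pathway.
p = enrichment_p_value(k=6, n=44, K=100, N=20000)
print(p)
```

In practice the resulting p-values are adjusted for multiple testing across all pathways before an enrichment is reported.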
We found that the genes in the blue module were primarily enriched in the hematopoietic cell lineage, T cell receptor signaling pathway, cytokine-cytokine receptor interaction, viral protein interaction with cytokine and cytokine receptor, measles, cell adhesion molecules (CAMs), chemokine signaling pathway, Epstein-Barr virus infection, human T-cell leukemia virus 1 infection, B cell. The KEGG enrichment results for the green, brown, and yellow modules are shown in Supplementary Figs. 2A–C. We chose the blue module as the most appropriate module by combining the WGCNA and KEGG results. Next, we ran a GO enrichment analysis on the genes in the blue module. For BP, the blue module was mainly enriched in immune system process, immune response, leukocyte activation, cell activation, regulation of immune response, lymphocyte activation, T cell activation, and antigen receptor-mediated signaling pathway (Fig. 2B). For CC, the blue module was primarily enriched in T cell receptor complex, immunological synapse, plasma membrane part, cell surface, cytoplasmic vesicle membrane, vesicle membrane, secretory granule, receptor complex, and secretory granule membrane (Fig. 2C). For MF, the main enriched terms were molecular transducer activity, transmembrane signaling receptor activity, protein tyrosine kinase binding, G protein-coupled purinergic nucleotide receptor activity, G protein-coupled nucleotide receptor activity, purinergic nucleotide receptor activity, nucleotide receptor activity, interleukin-2 receptor activity, and interleukin-15 receptor activity (Fig. 2D). Additionally, we ran a GSEA analysis. The four main enriched pathways were cardiac muscle contraction, oxidative phosphorylation, ubiquitin-mediated proteolysis, and vascular smooth muscle contraction (Fig. 2E–H).
Fig. 2. Functional enrichment analysis. (A) KEGG enrichment in the blue module. (B–D) GO (BP, CC and MF) enrichment in the blue module. (E–H) GSVA analysis of the samples.
3.3. The immune microenvironment of abdominal aortic aneurysms
Our KEGG enrichment analysis revealed that the genes of the blue module were strongly associated with immunity. In the AAA data set, the immune microenvironment was examined using ESTIMATE, MCPcounter, CIBERSORT, and ssGSEA, respectively. The ImmuneScore, StromalScore, and ESTIMATEScore were all higher in the AAA group, although the ESTIMATE differences were not statistically significant (p > 0.05) (Fig. 3A). Second, we used MCPcounter to determine the abundance of 10 immune-related cell populations. Compared with the Control group, the AAA group had higher levels of T cells, cytotoxic lymphocytes, B lineage, myeloid dendritic cells, and neutrophils (Fig. 3B). In the CIBERSORT analysis, the AAA group had more T follicular helper cells, T regulatory cells, monocytes, and activated dendritic cells (Fig. 3C). To further investigate the degree of immune infiltration in the samples, we employed ssGSEA. The AAA group had higher levels of activated CD4 T cells, central memory CD4 T cells, gamma delta T cells, type 17 T helper cells, type 2 T helper cells, immature dendritic cells, eosinophils, and neutrophils (Fig. 3D and E).
Fig. 3. The immune microenvironment of abdominal aortic aneurysms. (A) Box plot of the ESTIMATE analysis displaying the immune scores of the grouped samples. (B) MCPcounter analysis of 10 immune-related cell populations in the samples. (C) Box plot from the CIBERSORTx analysis indicating the enrichment of 22 immune-related cell types in the samples. (D, E) Heat map and box plot, respectively, of the ssGSEA analysis of the samples. For the data statistics, the rank sum test was employed.
Data are presented as the mean ± SD (n ≥ 3). *p < 0.05, **p < 0.01, ***p < 0.001, and ****p < 0.0001 vs. the Control group.
3.4. Machine learning and PPI networks together screen for the most feature genes
We used the PPI network with two Cytoscape plug-ins to screen the feature genes in the blue module, and three different machine learning algorithms to determine the most feature genes in the blue module. We then combined the feature genes from the aforementioned five methods to establish the hub genes. The random forest method evaluates genes using IncMSE and IncNodePurity. We screened the top ten genes according to IncNodePurity, the more informative measure: ITGAL, PRKCQ-AS1, SELL, PLAC8, NIBAN3, RASGRP1, STAP1, CR2, NUP210, and LRMP (Fig. 4A and B). The LASSO regression results are presented as rainbow plots and binomial deviance (Fig. 4C and D). LASSO regression screened out five signature genes (SELL, IL2RB, ITGAL, RCSD1 and PRKCQ-AS1). We also used the SVM-RFE algorithm with 10-fold cross-validation repeated 5 times, choosing the "random" repeated cross-validation method and SIZE = 1:10, to extract the genes with higher variable importance and plot the histogram (Fig. 4E). We then evaluated the accuracy of the model by root mean square error (RMSE) and found the point with the lowest RMSE (n = 5 genes) and the best subset of genes: PRKCQ-AS1, ITGAL, SELL, NUP210 and RASGRP1 (Fig. 4F). For KNN, we selected the K value for the blue module using mean absolute error; the model performs best when K is equal to 1 or 3 (Fig. 4G). Then, using XGBoost, we determined the importance scores of the blue module genes, displayed them as a histogram, and selected the top 10 genes for SHAP analysis (Fig. 4H and Supplementary Fig. 3A).
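
As an illustration of the intersection strategy, the three gene lists reported above (random forest top ten, LASSO and SVM-RFE) can be intersected with a few lines of Python. This sketch covers only those three lists, because the XGBoost, Cytohubba, MCODE and mitochondrial fission gene lists are not given in the text, so it does not reproduce the full screening. Treating `PRKCQ.AS1` and `PRKCQ-AS1` as the same gene (an R name-mangling difference) is an assumption.

```python
def normalize(symbol):
    """Map R-style mangled names (e.g. PRKCQ.AS1) to the hyphenated symbol (assumption)."""
    return symbol.replace(".", "-")


# Gene lists as reported in Section 3.4
random_forest = ["ITGAL", "PRKCQ.AS1", "SELL", "PLAC8", "NIBAN3",
                 "RASGRP1", "STAP1", "CR2", "NUP210", "LRMP"]
lasso = ["SELL", "IL2RB", "ITGAL", "RCSD1", "PRKCQ-AS1"]
svm_rfe = ["PRKCQ.AS1", "ITGAL", "SELL", "NUP210", "RASGRP1"]

gene_lists = [random_forest, lasso, svm_rfe]
shared = set.intersection(*({normalize(g) for g in genes} for genes in gene_lists))
print(sorted(shared))  # ['ITGAL', 'PRKCQ-AS1', 'SELL']
```

Each additional list can only shrink the shared set, which is consistent with the smaller final set of most feature genes (ITGAL and SELL) obtained once the remaining lists are included.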
In addition, we imported the genes from the blue module into the STRING database and visualized them in Cytoscape (Fig. 4I). We screened the signature genes with two plugins, Cytohubba and MCODE, respectively (Fig. 4J and K). Finally, the genes screened by these five methods were intersected to obtain the most feature genes: ITGAL and SELL (Fig. 4L). ROC curves were used to assess the diagnostic value of the most feature genes (Supplementary Figs. 3B and C).
Fig. 4. Machine learning and PPI networks together screen for the most feature genes. (A, B) Random forest analysis showing the top variables by IncMSE and IncNodePurity. (C, D) Screening of feature genes using the LASSO model. The most appropriate number of genes is between 5 and 6, which corresponds to the lowest point on the curve. (E, F) The 17 feature genes selected by the support vector machine-recursive feature elimination (SVM-RFE) method; varying the number of variables gives the best n = 5 for training RMSE. (G) KNN regression analysis to calculate the best number of nearest neighbors (K = 1 or 3). (H) Bar chart of the XGBoost importance scores of the top 10 genes. (I) Blue module genes imported into Cytoscape via the PPI network for visualization. (J) Top 10 genes calculated by Cytohubba in Cytoscape. (K) Top 10 genes calculated by MCODE in Cytoscape. (L) Venn diagram showing the intersection of the genes screened by LASSO, SVM-RFE, random forest, XGBoost, Cytohubba and MCODE with the mitochondrial fission-related genes.
3.5. Single-cell analysis located the expression of signature genes in cells
We divided the six samples of the dataset into two groups, AAA1 (GSM5077727, GSM5077728, GSM5077730 and GSM5077731) and AAA2 (GSM5077729 and GSM5077732), for QC evaluation and data filtering to prevent mitochondrial and nuclear RNA contamination (Fig. 5A–C). We then performed data cleaning and removed batch effects using the harmony method based on a subset of highly variable features (HVGs) (Fig. 5D, Supplementary Fig. 4A). We then searched for 2000 HVGs for cell clustering and identification and showed the top 5 genes: PRSS1, CLPS, PNLIP, AMY3A and CPA1 (Fig. 5E). We performed principal component analysis (PCA), identified several PC gene populations with high variability, and used the UMAP method to reduce the dimensionality of the PC gene populations, yielding 18 cell subgroups (Fig. 5F, Supplementary Figs. 4B and C). We screened the top 10 marker genes of each cell subgroup to identify its cell type, and finally re-labelled the type of each cell subgroup (Fig. 5G). We then examined the expression of ITGAL and SELL across samples and cell populations. ITGAL showed higher expression mainly in NK cells and lymphoid cells, while SELL showed higher expression mainly in T regulatory cells, lymphoid cells, B cells and CD1C+_B dendritic cells (Fig. 5H and I). Combined with the results of the previous immune infiltration analysis, we conclude that ITGAL and SELL may affect T regulatory cells, NK cells, B lineage and lymphocytes, and thus the development of AAA.
Fig. 5. Single-cell analysis located the expression of signature genes in cells.
(A) Percentage of mitochondrial genes and number of genes detected in each cell; the X-axis and color reflect the GSM samples, while the Y-axis shows the number of genes detected or the mitochondrial content. (B, C) Correlation between the base data before and after gene filtering; mitochondrial content decreases as nCount rises. (D) PCA plot after merging and reducing batch effects, displaying the PC1 and PC2 components of several highly variable PC populations used as anchor points. (E) Scatter plot showing how many highly variable genes were removed from the sample batch. The top 5 highly variable genes are shown in red. (F) Clustering results obtained with UMAP. Different cell populations are represented by different numbers or colors. (G) Heat map of the top 10 marker genes, which are concentrated in different cell groups; distinct colors indicate distinct cell groups in the manually annotated cell subgroup map. (H, I) Expression plots of the MFGs in the various cell types; blue indicates gene expression.

3.6. Gross morphological changes in AAA and conversion of smooth muscle phenotypic proteins

Mice were anesthetized after 28 days. The gross morphology of the aortic aneurysm was visualized in the model group and compared with the ApoE −/− saline group. The measured aneurysm diameter in the model group was more than 1.5 times the normal diameter (Fig. 6A and B). HE and EVG staining showed that the abdominal aortic wall was thickened and smooth muscle cells were reduced, and EVG staining showed that the mesenteric elastic fiber layers were fewer in number and broken (Fig. 6C). Second, we examined the mRNA levels of arterial smooth muscle related proteins and found that Sm22α, SmMHC, MMP1 and α-SMA expression levels were reduced in the aorta of mice after angiotensin stimulation (Fig. 6D–G).
Finally, we measured the protein levels of α-SMA and Calponin and found that the expression of both was decreased in the AAA group (Fig. 6H–J).

Fig. 6. Gross morphological changes in AAA and conversion of smooth muscle phenotypic proteins. (A, B) Photograph of gross specimens of abdominal aortic aneurysm and statistics of the maximum diameter of the artery. (C) Representative images of EVG staining and HE staining (500 μm and 100 μm). (D–G) Bar graphs of the mRNA levels of Sm22α, SmMHC, MMP1 and α-SMA in abdominal aortic aneurysm tissue samples. (H–I) Quantified protein expression levels of α-SMA and Calponin. (J) Western blot showing protein expression. The t-test was used for statistical analysis. Data are presented as the mean ± SD (n ≥ 3). *p < 0.05, **p < 0.01, ***p < 0.001, and ****p < 0.0001 vs. the Saline group.

3.7. Inflammation and immunity in abdominal aortic aneurysms

Increased secretion of pro-inflammatory factors is a key feature in the mechanism of abdominal aortic aneurysms. We measured serum levels of IL-1β and IL-6 in mice with ELISA kits and found elevated levels of both in the angiotensin-treated group compared with the saline group (Fig. 7A and B). In addition, we extracted RNA from the abdominal aorta of mice and performed RT-PCR analysis to detect changes in inflammatory factors in vivo. Compared with the saline group, the mRNA expression of TNF-α, IL-1β, IL-4, IL-6, IL-8 and IL-17 was significantly higher in the abdominal aortic aneurysm model group, while the mRNA expression of the anti-inflammatory factor IL-10 was lower (Fig. 7C–I).

Fig. 7. Inflammation and immunity in abdominal aortic aneurysms. (A, B) Serum levels of IL-1β and IL-6 in the abdominal aortic aneurysm model.
(C–I) Bar graphs of the mRNA levels of TNF-α, IL-1β, IL-4, IL-6, IL-8, IL-17 and IL-10 in AAA, respectively. The t-test was used for statistical analysis. Data are presented as the mean ± SD (n ≥ 3). *p < 0.05, **p < 0.01, ***p < 0.001, and ****p < 0.0001 vs. the Saline group.

3.8. Expression of signature genes in mouse abdominal aortic aneurysm model and MOVAS

To verify whether the expression of our signature genes in the model was consistent with the database results, we examined the mRNA levels of ITGAL and SELL in the tissues and found that the expression of both SELL and ITGAL increased in the AAA group (Fig. 8A and B). We also verified this in vitro: we stimulated MOVAS cells with TNF-α, then extracted samples and examined the mRNA levels of the cells. As in the animal model, the mRNA levels of ITGAL and SELL were upregulated in MOVAS cells under TNF-α stimulation (Fig. 8C and D).

Fig. 8. Expression of signature genes in mouse abdominal aortic aneurysm model and MOVAS. (A, B) Bar graphs of the mRNA levels of the MFGs (ITGAL and SELL) in AAA, respectively. (C, D) Bar graphs of the mRNA levels of the MFGs (ITGAL and SELL) in MOVAS, respectively, after TNF-α stimulation. The t-test was used for statistical analysis. Data are presented as the mean ± SD (n ≥ 3). *p < 0.05, **p < 0.01, ***p < 0.001, and ****p < 0.0001 vs. the Saline group.

3.9. Prediction of targeting drugs for AAA by Cmap

We imported the blue module genes screened by WGCNA into the Cmap website to screen the database for drugs that specifically target these genes. We selected the top 10 compounds by standardized connectivity score (CS) and false discovery rate (FDR (nlog10)). These compounds (Clobetasol, Adapalene, Everolimus, Bifonazole, Vorinostat, PSB-06126, Tiotropium, Tasquinimod, Ecopipam and Marimastat) were visualized (Fig. 9A and B).
These drugs suggest possible immunological guidance for the treatment of AAA.

Fig. 9. Prediction of targeting drugs for AAA by Cmap. (A) Lollipop graph of the RAW cs, Norm cs and fdr_q_nlog10 of the top 10 targeted drugs. The length represents the RAW cs, the color represents the Norm cs, and the size of the circle represents fdr_q_nlog10. (B) Mulberry plot of the 10 medications' cell types of action, doses, times, and N-samples.

4. Discussion

As a vascular disease, AAA poses a serious risk to life, because rupture of the vessel can cause death within minutes. Although surgical treatment plays a large role, there is no clear way to intervene in the development and prognosis of AAA. Targeted treatment is therefore crucial for intervening in the development of AAA. In this study, we first performed differential analysis of the samples and extracted the up-regulated differential genes. We then performed WGCNA to identify suitable modules, carried out gene enrichment analysis and, in parallel, performed immune infiltration analysis, which revealed that the development of AAA may be closely related to immune cells. The results of four machine learning methods (Lasso regression, random forest, SVM-RFE and XGBoost), the mitochondrial fission related genes and the results of two Cytoscape plug-ins (Cytohubba and MCODE) were then used to screen for the most feature genes. This screening yielded ITGAL and SELL, which we localized in cells by single-cell analysis. Thereafter, we constructed AAA animal and cellular models and examined the expression of a number of inflammatory immune factor-related genes, as well as the mRNA expression levels of ITGAL and SELL in vivo and in vitro, respectively. Finally, we used the blue module genes to predict 10 drugs with potential implications for AAA development.
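The screening step summarized above amounts to intersecting the candidate gene lists produced by each method. A minimal sketch in Python is given below for illustration only: the function name `intersect_gene_sets` is hypothetical, and the per-method gene lists are placeholders (`GENE_A` to `GENE_F` are invented labels), not the lists actually obtained in this study.

```python
from functools import reduce


def intersect_gene_sets(gene_sets):
    """Return the sorted genes that appear in every candidate set.

    `gene_sets` maps a method name to the genes that method selected.
    """
    if not gene_sets:
        return []
    common = reduce(set.intersection, (set(genes) for genes in gene_sets.values()))
    return sorted(common)


# Placeholder inputs: only ITGAL and SELL are taken from the text;
# all other labels are hypothetical stand-ins for per-method results.
candidate_sets = {
    "lasso": {"ITGAL", "SELL", "GENE_A", "GENE_B"},
    "random_forest": {"ITGAL", "SELL", "GENE_A", "GENE_C"},
    "svm_rfe": {"ITGAL", "SELL", "GENE_B", "GENE_D"},
    "xgboost": {"ITGAL", "SELL", "GENE_C", "GENE_E"},
    "cytohubba": {"ITGAL", "SELL", "GENE_A", "GENE_F"},
    "mcode": {"ITGAL", "SELL", "GENE_D", "GENE_F"},
    "mitochondrial_fission": {"ITGAL", "SELL", "GENE_E"},
}

most_feature_genes = intersect_gene_sets(candidate_sets)
print(most_feature_genes)
```

Because a strict intersection keeps only genes selected by every method, adding a method can only shrink the result, which is one reason such a screen tends to return very few genes.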
Integrin alpha L (ITGAL) is a member of the integrin family whose primary role is to regulate T cell activation and migration. Its expression is closely linked to tumors and immunity, as it prolongs contact with antigen-presenting cells and mediates binding to target cells for killing [55, 56, 57]. Li et al. found that knockdown of ITGAL inhibited acute myeloid leukemia cell growth and induced early apoptosis [58]. H Shinkal et al. found that functional expression of ITGAL on CD8+ T cells was inhibited in regional lymph nodes of gastric cancer tissue [59]. In a study by Eugene S Lee et al., elevated molecular expression levels of ITGAL in patients with abdominal aortic aneurysms were detrimental to endothelial function and affected patient prognosis [60]. However, the specific mechanistic role of ITGAL in AAA is unclear. Sieghart Sopper et al. found that L-selectin (CD62L, SELL), a type I transmembrane glycoprotein and cell adhesion molecule, regulates the protrusion of monocytes during transendothelial migration (TEM), giving the cells the ability to converge on the site of damage. SELL has been associated with a variety of tumours. Reduced expression of SELL on T cells enhances the immune response to tyrosine kinase inhibitor therapy in early stage chronic myeloid leukemia [61]. Kevin K Hannawa et al. found that SELL deficiency mitigates abdominal aortic aneurysm formation [62]. In addition, ITGAL and SELL are closely related to mitochondrial function. Mannheimia haemolytica produces a leukotoxin that binds to ITGAL/CD18 and can cause lesions in the outer mitochondrial membrane, thereby altering mitochondrial function [63]. L-7/T cells can upregulate IL-7R and SELL and promote changes in mitochondrial function and fatty acid oxidative metabolism. The role of SELL specifically in abdominal aortic aneurysms remains unclear, apart from early studies [64].
Thus, both ITGAL and SELL are of interest for the development of diagnostic and therapeutic approaches for AAA and can be further investigated in the future. Early infiltration of bone marrow immune cells in the wall of the abdominal aorta is a marker for the development of AAA [65, 66, 67]. T regulatory cells (Tregs) play a major protective role in cardiovascular diseases, including atherosclerosis, hypertension and myocarditis [68]. Tregs are detected in both abdominal aortic aneurysms and atherosclerotic tissue and are protective against atherosclerosis [69, 70]. Bin Liu et al. showed that Tregs can prevent the development of AAA by inhibiting the expression of COX-2 in vitro and in vivo [71]. Tomohiro Hayashi et al. found that UVB induced expansion of Tregs and thus prevented the expansion of AAA diameter [72]. Among vascular diseases, studies have shown higher levels of B-cell infiltration and immunoglobulins in AAA than in aortic occlusive disease [14, 73, 74]. This suggests that B cells play an important role in the development of AAA. We analyzed the immune cell infiltration of the samples by combining multiple immune enrichment assays. We found that activated CD4 T cells, central memory CD4 T cells, gamma delta T cells, type 17 T helper cells, type 2 T helper cells, activated B cells, immature B cells, CD56dim natural killer cells, immature dendritic cells, eosinophils, T follicular helper cells, T regulatory cells, monocytes, cytotoxic lymphocytes, B lineage, myeloid dendritic cells and neutrophils were more enriched in the AAA group.
In combination with the single-cell analysis, we found that ITGAL and SELL were more highly expressed in T regulatory cells, NK cells, CD1C+_B dendritic cells, B lineage and lymphocytes. It can be speculated that ITGAL and SELL may further influence the development of AAA through immune cells. Finally, we established animal and cellular models and assayed inflammatory immune-related cytokines for further validation. We found increased expression of TNF-α, IL-1β, IL-4, IL-6, IL-8 and IL-17 and decreased expression of IL-10 in the AAA group. This further supported the involvement of an inflammatory immune response in AAA. We also examined the mRNA levels of ITGAL and SELL, and the results were as expected, with increased expression in the AAA group. However, we did not measure changes in the protein levels of ITGAL and SELL in AAA, nor did we investigate the effects of increased or decreased ITGAL and SELL expression on AAA at the animal and cellular levels. This provides a direction for extending this study. In addition, we performed targeted drug prediction for the module genes screened from the samples and identified the 10 most likely candidate drugs. However, we did not validate these predictions with the drugs themselves, so they remain a basis for further experiments. To study the effects of these medications on AAA, they can first be administered to animals by gavage or intramuscular injection. At the same time, in vitro studies can be carried out to study the effects of the drugs themselves. We then intend to collect human abdominal aortic samples to investigate how ITGAL and SELL affect the development of AAA. Future research on these two genes can be carried out in animals and in humans to investigate their specific mechanisms and to identify corresponding antagonists.
These studies could be important for the prevention and treatment of AAA in the future and could also help to guide clinical measures for AAA occurrence and prognosis. In summary, we are the first to combine bioinformatics, machine learning and PPI networks to analyze AAA samples to look for MFGs, and to use single-cell analysis and immune infiltration analysis to examine the relationship between the disease and the immune system. In addition, we built animal models and measured the expression levels of the MFGs in them.

5. Conclusion

We screened for new target markers of AAA, ITGAL and SELL, using multiple machine learning methods and PPI networks, and detected high expression of both markers in NK cells, lymphoid cells, T regulatory cells, B cells and CD1C+ dendritic cells. Combined with immune infiltration analysis of the samples, we concluded that ITGAL and SELL may act in AAA through T regulatory cells, NK cells, B lineage and lymphocytes. We developed an AAA model and found that TNF-α, IL-1β, IL-4, IL-6, IL-8 and IL-17 expression was increased in the AAA group, while IL-10 was decreased. Moreover, ITGAL and SELL mRNA levels were increased in the AAA group compared with the saline group. We also predicted 10 drugs that could target AAA: Clobetasol, Adapalene, Everolimus, Bifonazole, Vorinostat, PSB-06126, Tiotropium, Tasquinimod, Ecopipam and Marimastat.

Availability statement
The samples mentioned in the article can all be retrieved in the online library. The names of the library and sample numbers are given in the article/Supplementary Material.

Ethics statement
Not applicable.

Funding
This study was supported by the National Natural Science Foundation of China (Grant No. 82070291). The funder had no role in the decision to publish or preparation of the manuscript.

Ethics approval and consent to participate
Not applicable.

Consent for publication
Not applicable.

Data availability statement
The datasets supporting our findings are presented in the article. GSE57691 was downloaded from the GEO database (https://www.ncbi.nlm.nih.gov/geo/).

CRediT authorship contribution statement
Yi-jiang Liu: Writing – review & editing, Writing – original draft, Validation. Rui Li: Writing – review & editing, Writing – original draft. Di Xiao: Methodology, Formal analysis, Data curation. Cui Yang: Methodology, Formal analysis, Data curation. Yan-lin Li: Data curation. Jia-lin Chen: Data curation. Zhan Wang: Data curation. Xin-guo Zhao: Data curation. Zhong-gui Shan: Writing – review & editing, Visualization, Validation, Supervision.

Declaration of competing interest
The authors declare the following financial interests/personal relationships which may be considered as potential competing interests: Zhong-gui Shan reports financial support was provided by National Natural Science Foundation of China. If there are other authors, they declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments