Abstract Atherosclerosis (AS), a chronic inflammatory disease with autoimmune components, represents the predominant pathological change underlying cardiovascular diseases. Natural killer (NK) cells, pivotal actors in the innate immune system, play intricate regulatory roles in AS. Our objective was to identify and analyze NK cell-related genes involved in AS pathogenesis. We conducted differential expression analysis and functional enrichment analysis via microarray datasets from AS patients, identified NK cell-characteristic genes and performed subgroup analysis of AS on the basis of the expression levels of these genes. The results revealed that the differentially expressed genes in AS were predominantly associated with immune response activities and were significantly enriched in NK cell-mediated cytotoxicity pathways. PTPN6, ITGAL, TYROBP, SLAMF7, LCP2, HCST, HAVCR2, and VAV3 were identified as NK cell-characteristic genes. Subgroup analysis indicated that in patients with high expression levels of NK cell-characteristic genes, the progression of AS may be driven primarily by immune cell activity, whereas in those with low expression levels, TGF-β signaling may be the primary driving factor. In summary, our findings emphasize the crucial role of NK cell-mediated immunity in AS, offering potential targets for personalized immunomodulatory therapies and highlighting the need for tailored treatments based on different AS subtypes. Supplementary Information The online version contains supplementary material available at 10.1038/s41598-025-98524-9. Introduction Atherosclerosis (AS) constitutes the central pathological process underlying major cardiovascular ailments^[34]1,[35]2. The most severe complications of AS, myocardial infarction (MI) and stroke, are major causes of mortality worldwide^[36]3,[37]4. Cardiovascular diseases (CVDs), underpinned fundamentally by AS, represent a substantial threat to worldwide public health. Statistics from the World Health Organization (WHO) showed CVDs represent nearly 32% of global deaths annually, affecting approximately 17.9 million people^[38]5, a number projected to increase to 22 million by 2030^[39]6. Consequently, the prevention and control of AS, aimed at reducing the incidence of cardiovascular events, have become paramount priorities in global public health agendas. AS is distinguished by the progressive buildup of lipids, vascular wall cells, immune system cells, and the extracellular matrix in the arterial wall, gradually resulting in the formation of atherosclerotic plaques^[40]7. These plaques predominantly affect large and medium-sized arteries, potentially impairing the blood supply to the heart, brain, and kidneys and subsequently triggering a range of clinical manifestations. In addition to being a prevalent lipid metabolism disturbance, AS represents a persistent inflammatory ailment deeply rooted in autoimmune processes. The initiation of AS originates from the response of the arterial intima to endogenously produced oxidatively modified components, such as oxidized low-density lipoprotein (oxLDL), which triggers both innate and adaptive immunity, exacerbating AS progression^[41]8,[42]9. Immune cells play crucial roles in the development of atherosclerotic plaques^[43]10. In the advanced stages of AS, various endogenous immune cells contribute to the development of plaques within the arterial wall^[44]11. In addition to participating in inflammatory phenomena, immune cells strongly influence the robustness and fragility of plaque structures, thereby impacting the clinical outcomes associated with AS. Therefore, uncovering the intricate roles of immune cells in AS is paramount for innovating therapeutic options that effectively combat disease evolution and shield against detrimental consequences. Serving as pivotal components of the human immune system, natural killer (NK) cells are derived from lymphoid progenitor cells in the bone marrow, and complete their maturation and functional differentiation in various tissues, including the thymus, spleen, lymph nodes, lungs, liver, and kidneys^[45]12. Dysregulation of NK cell functionality and molecular profiles is intimately linked with atherosclerotic disease progression. Investigations have revealed that the marked upregulation of CD160 expression on NK cells among individuals with AS is associated with diminished NK cell populations and elevated systemic levels of inflammatory cytokines such as IFN-γ, TNF-α, and IL-6, which are known to accelerate atherogenesis^[46]13. Experimental evidence from ApoE(−/−) mouse models further demonstrated that NK cell-mediated secretion of cytotoxic effectors (perforin and granzyme B) facilitates necrotic core expansion within developing plaques, whereas suppression of NK cell activity significantly mitigates atherosclerotic lesion development^[47]14. Furthermore, NK cells are observed alongside mast cells and eosinophils in early atherosclerotic lesions, suggesting their potential involvement in the initial inflammatory response^[48]15. Additionally, NK cells may exacerbate plaque vulnerability by releasing cytokines, such as IFN-γ, that promote a proinflammatory phenotype in macrophages^[49]16. In obesity-related AS models, the inflammatory state of adipose tissue macrophages (ATMs) can indirectly influence plaque progression by modulating the levels of circulating immune cells, including NK cells^[50]17. These findings indicate that NK cells may promote AS development through regulating inflammation, modulating plaque stability, and interacting with other immune cells. Therefore, targeting NK cell-related pathways may hold therapeutic potential for AS. The intricate and diverse functions of NK cells throughout the progression of AS have attracted considerable attention from scientists across various disciplines. Our study aims to delineate NK cell-related genetic targets and subtypes relevant to AS, as well as to predict potential therapeutic agents. An elaborate diagram outlining the methodological steps adopted in our current study is illustrated in Fig. [51]1. Fig. 1. [52]Fig. 1 [53]Open in a new tab Research flowchart. Results Identification and validation of NK cell-related genes in AS Differential genetic screening Following the download of standardized expression matrices from AS carotid artery microarray datasets, probe annotations were converted to gene names via dataset annotation files, with duplicate probe values averaged. [54]GSE28829 and [55]GSE104140 were used as validation sets. Carotid artery sample data from [56]GSE100927 were combined with those from [57]GSE43292, and the “sva” package implemented in R helped us address batch inconsistencies. Principal component analysis (PCA) confirmed effective correction (Fig. [58]2A, B). The merged dataset, which served as the training set, comprised microarray data from 61 disease group cases and 44 control group cases of carotid artery tissues. By applying the “limma” package in R, 123 upregulated differentially expressed genes (DEGs) and 80 downregulated DEGs in AS carotid artery samples were identified (Fig. [59]2C, D; Supplementary Table [60]S1). Fig. 2. [61]Fig. 2 [62]Open in a new tab Identification of AS carotid NK cell-related genes. PCA was used to check heterogeneity of the training set before (A) and after (B) combining carotid artery sample data from [63]GSE43292 and [64]GSE100927. The volcano plot (C) and the heatmap (D) show DEGs in the carotid artery in the training set (|LogFC|> 1 and adjusted p values < 0.05). Bubble diagrams show GO analysis (E) and KEGG analysis (F) of AS DEGs^[65]41–[66]43. The Venn diagram (G) shows DEGs in AS carotid arteries and NK cell-related genes. We performed Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses on DEGs in AS to investigate potential disease mechanisms. The significantly enriched terms (Supplementary Table [67]S2) included 765 biological process (BP), 39 cellular component (CC), and 24 molecular function (MF) terms. Figure [68]2E shows the top 10 enriched terms in the BP, MF, and CC domains respectively. The key terms included phagocytosis, myeloid cell activation involved in the immune response, and leukocyte activation involved in the immune response in the BP category; secretory granule membrane, tertiary granule, and specific granule in the CC category; and integrin binding, lipase activity, and phospholipase activity in the MF category. KEGG enrichment analysis revealed 29 pathways (Fig. [69]2F; Supplementary Table [70]S3), with DEGs enriched primarily in osteoclast differentiation, viral protein interaction with cytokines and cytokine receptors, rheumatoid arthritis, platelet activation, Virion-Lassa virus and SFTS virus, and NK cell-mediated cytotoxicity pathways. Collectively, immune responses were found to dominate the biological functions of the DEGs, with NK cell-mediated cytotoxicity pathways prominently featured. Using the “Venn” package in R, the intersection of DEGs and NRGs was identified, revealing 16 NK cell-related DEGs (Fig. [71]2G; Supplementary Table [72]S5), which were considered initial candidate biomarkers. Preliminary validation and diagnostic efficacy evaluation The expression levels of the initial biomarkers were visualized in the training set (constructed by merging carotid artery sample data from the [73]GSE43292 and [74]GSE100927 datasets), which revealed statistically significant differences in expression between the disease group and the control group (Fig. [75]3A). This result was validated in the [76]GSE104140 (Fig. [77]3B) and [78]GSE28829 (Fig. [79]3C) datasets. All the biomarkers were upregulated in the disease group, regardless of which dataset they originated from. The receiver operating characteristic (ROC) curves of the biomarkers in the training set (Fig. [80]3D) and validation sets (Fig. [81]3E, F) had the area under the curves (AUCs) > 0.8, indicating their good diagnostic potential for AS. Fig. 3. [82]Fig. 3 [83]Open in a new tab Preliminary validation and diagnostic efficacy evaluation of AS carotid NK cell-related genes in the training set and the two validation sets. Boxplots show the validation of preliminary gene marker expression levels observed in the training dataset (A), comprising merged [84]GSE43292 and [85]GSE100927 carotid artery sample data, and further validated in the independent datasets [86]GSE104140 (B) and [87]GSE28829 (C) (** indicates p values < 0.01, *** indicates p values < 0.001). The diagnostic performance of preliminary gene biomarkers was evaluated through ROC curves analysis in the training dataset (D), with subsequent validation in two independent datasets, [88]GSE104140 (E) and [89]GSE28829 (F). Identification and validation of NK cell-characteristic genes in AS Selection of the machine model and candidate biomarkers To further select the optimal biomarkers for predicting AS disease status, machine learning algorithms were employed to develop a predictive model. Among least absolute shrinkage and selection operator (LASSO), random forest (RF), and support vector machine (SVM), RF was chosen as the best model on the basis of having the smallest residual and largest AUC (Fig. [90]4A–C). In the RF model, with “ntree” set to 1500, the optimal number of trees was 16 (Fig. [91]4D). Following parameter optimization, the initial biomarkers were ranked on the basis of "Mean Decrease Gini", with the top 50% (8 genes) selected as the optimal biomarkers: PTPN6, ITGAL, TYROBP, SLAMF7, LCP2, HCST, HAVCR2, and VAV3 (Fig. [92]4E). Fig. 4. [93]Fig. 4 [94]Open in a new tab Selection of the best predictive model and candidate biomarkers. (A) Residual box plots of the three models. (B) Residual reverse cumulative distribution maps of the three models. (C) ROC curves of the three models. (D) Cross-validation error chart based on the RF model. (E) Gene importance score based on the RF model. Predictive evaluation and quantitative real-time PCR (qRT-PCR) results A nomogram (Fig. [95]5A) was constructed on the basis of the RF model to evaluate the predictive ability of biomarkers for AS disease status, indicating individual weights for each gene. The subsequently plotted calibration curve, with a mean absolute error (MAE) of 0.037, demonstrated substantial agreement among the ideal line, apparent line, and bias-corrected line (Fig. [96]5B). Decision curve analysis (DCA) demonstrated that our predictive model provided a superior clinical net benefit compared with both the “no intervention” and “all intervention” groups when the risk threshold probability ranged from 0.1 to 0.9, underscoring its practical value for predicting AS (Fig. [97]5C). Furthermore, the clinical impact curve demonstrated a high level of concordance between model predictions and actual events when the threshold probability exceeded 0.6, suggesting a high efficacy of clinical prediction (Fig. [98]5D). Fig. 5. [99]Fig. 5 [100]Open in a new tab Predictive evaluation and expression validation of atherosclerosis carotid NK cell-characteristic genes. (A) Nomogram. (B) Calibration curves. (C) Decision curves. (D) Clinical impact curves. (E) Expression validation in the atherosclerosis cell model via qRT-PCR (* indicates p values < 0.05, ** indicates p values < 0.01, *** indicates p values < 0.001). To validate the accuracy of the NK cell-characteristic genes, qRT-PCR was performed using the THP-1 monocytic cell line, which was sequentially induced into THP-1 macrophages and foam cells to serve as an AS cell model. Compared with the control group (THP-1 + PBS), the experimental group (THP-1 + ox-LDL) presented higher expression levels of PTPN6, TYROBP, SLAMF7, HAVCR2, HCST and VAV3 (p values < 0.05), with SLAMF7 and VAV3 showing the most significant differences (p values < 0.001). ITGAL was downregulated, whereas LCP2 did not differ markedly between the two groups (Fig. [101]5E; Supplementary Table [102]S6). Analysis of AS subtypes on the basis of NK cell-characteristic genes Identification and functional analysis of AS subtypes Consensus clustering analysis was performed on all disease group samples in the training set, and the results revealed that the samples were clearly divided into two subtypes when the cluster number k = 2. The consensus matrix indicated high consistency within each subtype (darker regions) and a clear distinction between the two subtypes (lighter regions) (Fig. [103]6A). The consensus cumulative distribution function (CDF) analysis further supported this result, revealing a significant decrease in the slope of the CDF curve at k = 2, indicating high stability of sample grouping at this cluster number (Fig. [104]6B). PCA revealed distinct gene expression patterns between the two subtypes of AS, with PC1 and PC2 clearly separating the samples into two groups, corresponding to subtype A and subtype B, respectively (Fig. [105]6C). Gene expression analysis revealed notable variations in the expression levels of NK cell-characteristic genes between the two subtypes, and subtype A exhibited significantly elevated expression levels relative to those of subtype B (Fig. [106]6D). Heatmap analysis further validated these results, revealing generally higher expression levels of NK cell-characteristic genes in subtype A and lower expression levels in subtype B (Fig. [107]6E). Fig. 6. [108]Fig. 6 [109]Open in a new tab Subtyping analysis of AS based on NK cell-characteristic genes in carotid artery samples. (A) Consistency matrix diagram at k = 2. (B) Consistency cumulative distribution function diagram. (C) PCA of carotid artery samples. (D) Gene expression levels between subtypes. (E) Heatmap of NK cell-characteristic genes between AS subtypes. (F) Heatmap of GSVA result^[110]41–[111]43. Gene set variation analysis (GSVA) revealed that subtype A was significantly enriched in immune-related pathways, such as KEGG NK cell-mediated cytotoxicity (KEGG_NATURAL_KILLER_CELL_MEDIATED_CYTOTOXICITY) and the KEGG T cell receptor signaling pathway (KEGG_T_CELL_RECEPTOR_SIGNALING_PATHWAY). In contrast, subtype B exhibited marked enrichment in the TGF-β signaling pathway (KEGG_TGF_BETA_SIGNALING_PATHWAY) (Fig. [112]6F). These results indicate that subtype A is closely linked to NK cell-mediated immune responses and inflammatory processes, whereas subtype B may be more involved in fibrosis and tissue remodeling. Immune cell infiltration and its correlation with candidate biomarkers Comparative analysis of the infiltration levels of the 23 immune cell types between the AS subtypes revealed higher levels of MDSCs, regulatory T cells, NK cells, and 19 other immune cell types in subtype A than in subtype B, whereas no significant difference was observed in Th2 cell infiltration (Fig. [113]7A). Correlation analysis further revealed that the expression levels of NK cell-characteristic genes, except for SLAMF7, were positively associated with the concentrations of 22 immune cell types but negatively or not strongly correlated with Th2 cell abundance (Fig. [114]7B). These findings suggest that subtype A patients exhibit extensive immune cell infiltration and immune activation, in contrast to subtype B patients, who exhibit reduced immune cell recruitment and diminished immune activity, further supporting our previous conclusion that subtype A patients are closely linked to NK cell-mediated immune responses. Fig. 7. [115]Fig. 7 [116]Open in a new tab (A) Boxplot of immune infiltration differences between AS subtypes. (B) Correlation analysis among NK cell-characteristic genes and immune cells. Enrichment analysis of DEGs between AS subtypes We identified 63 DEGs between the two AS subtypes through differential expression analysis (Supplementary Table [117]S7). The volcano plot revealed significant upregulation or downregulation trends of these DEGs in the two subtypes (Fig. [118]8A), whereas the heatmap further illustrated their expression patterns in subtypes A and B (Fig. [119]8B). To elucidate the functional significance of these DEGs, we conducted Disease Ontology (DO) enrichment analysis. The results demonstrated that these DEGs were notably associated with atherosclerotic disorders, including arteriosclerotic cardiovascular disease, arteriosclerosis, and abdominal aortic aneurysm (Fig. [120]8C; Supplementary Table [121]S8). Additionally, these genes were significantly associated with other immune system disorders (e.g., multiple sclerosis and systemic scleroderma) and infectious diseases (e.g., malaria and parasitic infections). These findings suggest that the DEGs between AS subtypes are not only closely related to the pathological processes of AS but may also be involved in broader biological processes, including immune modulations and responses to infectious agents. Fig. 8. [122]Fig. 8 [123]Open in a new tab Enrichment analysis of DEGs between AS subtypes. The volcano plot (A) highlights the distribution of DEGs (|LogFC|> 1, adjusted p values < 0.05), and the heatmap (B) depicts their expression patterns. Bubble diagram (C) reveals DO enrichment analysis result. Recognition of core genes and potential therapeutics for AS PPIs among 63 DEGs between AS subtypes were explored via the search tool for the retrieval of interacting genes/proteins (STRING), which generated a network with 32 nodes and 80 edges (Fig. [124]9A). By applying the maximal clique centrality (MCC) algorithm for ranking, MMP9, MMP12, IL1RN, MMP7 and CCL3 were selected as core genes in AS (Fig. [125]9B; Supplementary Table [126]S9). Potential therapeutic targets for AS were identified via the drug-gene interaction database (DGIdb), revealing 40 potential drugs/compounds for AS treatment (Fig. [127]9C). Among them, 25 targeted MMP9, 4 targeted MMP12, 4 targeted IL1RN, 7 targeted MMP7, and 3 targeted CCL3. Fig. 9. [128]Fig. 9 [129]Open in a new tab (A) PPI network of DEGs between AS subtypes; (B) AS core genes obtained according to the MCC algorithm; (C) Mulberry plot of drug-gene interactions prediction. Discussion The progression of AS involves inflammatory processes and immune reactions. NK cells modulate inflammation by secreting cytokines and contribute to the composition of AS plaques, impacting their stability. Furthermore, NK cells participate in AS pathogenesis through their cytotoxic function, induction of oxidative stress, and apoptosis. Consequently, inhibiting signaling pathways associated with NK cells represents a promising treatment strategy for combating AS. Our study aimed to elucidate the relevance of NK cell-related genes to AS and identify potential therapeutic approaches. We initially identified DEGs in the carotid arteries related to AS, followed by GO and KEGG analyses to identify their potential functional roles and pathways. The results indicated that the DEGs in AS carotid arteries are strongly linked to immune-related biological processes, with NK cell-mediated cytotoxicity representing one of the most prominently enriched signaling pathways. Previous studies have revealed that NK cells are present in both mouse aortas and human atherosclerotic lesions, regardless of whether AS has initiated or progressed, albeit to a limited amount^[130]18,[131]19. Through comprehensive investigations employing antibody-based depletion and adoptive transfer studies, researchers have demonstrated that NK cells generate perforin and granzyme B and play a substantial role in the formation and exacerbation of atherosclerotic plaques in mice^[132]14. Collectively, these findings support the functional contribution of NK cells in driving atherosclerotic lesion development. Machine learning algorithms have become widely utilized in the detection and characterization of feature genes. For example, LASSO is employed for its ability to perform feature selection through L1 regularization, effectively identifying the most predictive biomarker genes while shrinking irrelevant features to zero, thus enhancing model interpretability and efficiency in high-dimensional datasets; Random Forest is chosen for its robustness in capturing nonlinear relationships and evaluating feature importance, making it particularly suitable for handling noisy data and complex interactions among biomarkers; Support Vector Machine was utilized for its superior performance in high-dimensional classification tasks and its ability to maximize classification margins, ensuring high accuracy even with small sample sizes through the use of kernel functions. Given that each of these three machine learning models has distinct advantages, we selected the algorithm with the best predictive performance among them to further identify the optimal biomarkers for predicting AS risk. Then we analyzed the residuals and the AUCs of the ROC curves for the three machine learning algorithms and chose the RF model as the optimal predictive model. We set parameters to ensure good performance. Finally, we identified a set of NK cell-characteristic genes expressed within the carotid arteries in AS, namely, PTPN6, ITGAL, TYROBP, SLAMF7, LCP2, HCST, HAVCR2, and VAV3. To facilitate the application and visualization of AS risk prediction results, we plotted a nomogram based on the NK cell-characteristic genes identified in our study. This nomogram enables the quantification of the combined impact of multiple genes on the prediction model. The nomogram is constructed via the following steps: first, a vertical line is drawn from each variable’s value to the point axis to ascertain the respective score. Second, the scores of all the variables are aggregated to derive the total score. Finally, another perpendicular line is plotted from the total score axis to intersect with the risk axis to establish the predicted risk of AS. The nomogram exhibited excellent discriminative ability, with a Somers’ Dxy of 0.89 and a C-index of 0.945. Additionally, the accuracy and clinical applicability of the nomogram were validated through calibration curves, decision curve analysis, and clinical impact curves. Prior studies have indicated that some of these NK cell-characteristic genes contribute to the onset and progression of AS. HAVCR2, also referred to as hepatitis A virus cell receptor 2, encodes T-cell immunoglobulin and mucin domain-containing protein 3 (Tim-3), an immunoregulatory receptor predominantly expressed on TH1 cells. Vital in the modulation of macrophage responsiveness, Tim-3 dampens unwanted TH1-triggered immune assaults against self and nonself targets, acting as a linchpin for immune homeostasis and tolerance induction^[133]20–[134]22 and potentially engaging in multiple inflammatory episodes via distinctive pathways^[135]23,[136]24. In human peripheral blood mononuclear cells (PBMCs), especially NK and NKT cell subpopulations, Tim-3 mRNA is highly expressed^[137]23. Notably, increased Tim-3 expression on NK cells is an indicator of AS progression^[138]25. Furthermore, evidence has demonstrated that statins downregulate Tim-3 expression levels on NK and NKT cells under atherosclerotic conditions, thereby exerting immunomodulatory effects and delaying the progression of the disease^[139]26. PTPN6, encoding the protein tyrosine phosphatase nonreceptor type 6, performs a pivotal function in NK cells by dephosphorylating tyrosine residues, consequently modulating NK cell activation and downstream signaling pathways. Furthermore, PTPN6 has been identified as a key negative regulator in atherosclerotic processes, where it suppresses both inflammatory pathways and immune cell activation, thus preventing the progression of AS^[140]27. ITGAL, which encodes the integrin alpha L chain, produces the essential α subunit component of lymphocyte function-associated antigen-1 (LFA-1). LFA-1 is a critical leukocyte adhesion molecule. It is assumed to be a critical mediator in mediating NK cell adhesion to target cells and facilitating immunological synapse assembly, processes that underpin NK cell performance and the execution of cytotoxic activity^[141]28,[142]29. In AS, the adhesive function of LFA-1 may permit NK cell entry into the plaque, aiding in the cleansing of impacted zones. However, research has revealed that under conditions of hypercholesterolemia, the ICAM-1/LFA-1 signaling axis potentially enhances the recruitment and accumulation of monocytes and T lymphocytes within the arterial intima, increasing their presence in the area^[143]30, which could exacerbate inflammatory responses in AS. Although the precise functions of other genetic markers in AS remain to be thoroughly elucidated, current evidence suggests a connection with NK cells. TYROBP, also known as DNAX-activation protein 12 (DAP12), functions as a membrane-associated signaling adaptor protein. First identified in NK cells, this adaptor is tied to an assortment of surface activation receptors displayed by cells of both lymphoid and myeloid origins^[144]31. TYROBP is implicated in interactions with proteins such as SYK and ZAP-70, contributing to a diverse array of physiological processes, encompassing cellular signaling pathways, skeletal development, myelination of the nervous system, and inflammatory responses. SLAMF7, also referred to as signaling lymphocytic activation molecule family member 7, is detectable on the exterior of NK cells. This molecule engages in homotypic binding interactions and facilitates NK cell activation and cytotoxicity. By engaging with its counterpart on neighboring cells, SLAMF7 promotes signaling cascades that are important for the full activation and effector functions of NK cells^[145]32. LCP2, encoding lymphocyte cytosolic protein 2, operates primarily in intracellular signaling pathways, where it acts as a phosphatase that can modulate various immune cell functions, including those of NK cells. HCST, known as hematopoietic cell signal transducer and encoding a transmembrane adaptor protein, is potentially implicated in the activation of PI3K-dependent signaling pathways. HCST regulates the functional activation of NK cells and T lymphocytes by affecting their survival and proliferation. The engagement of HCST in PI3K signaling is crucial for the regulation of immune cell functions, where it facilitates cross-talk between various signaling molecules, contributing to the dynamic balance of immune responses. VAV3, also known as VAV guanine nucleotide exchange factor 3, is a pivotal regulator in the context of NK cell signaling and exerts its influence predominantly on receptor-mediated cytoskeletal rearrangements and cytotoxic effector functions^[146]33. Although investigations into the associations between NK cell-specific genetic markers and AS are in their nascent stages, the bioinformatic analyses conducted in our study suggest a potential link. These analyses indicate that NK cell-characteristic genes might influence the initiation and development of AS by regulating NK cell activity. To explore the biological disparities mediated by NK cell-characteristic genes among distinct pattern subtypes in AS, we conducted consensus clustering analysis on AS samples and identified two subtypes, with PCA substantiating the effectiveness of identification. Subsequent analysis of immune microenvironment differences revealed that subtype A exhibited markedly elevated infiltration levels of NK cells and other immune cell populations relative to those of subtype B, which provided additional support for the validity of our AS subgroup classification. Moreover, the NK cell-characteristic genes showed positive correlations with various immune cells, suggesting that these genes may exacerbate the inflammatory response in AS by promoting immune cell infiltration and activation. Additionally, the negative or nonsignificant correlation between NK cell-characteristic genes and Th2 cells, which typically exhibit anti-inflammatory functions, further supports the potential role of these genes in proinflammatory immune responses. Therefore, the identified NK cell-characteristic genes may serve as potential targets for modulating the immune microenvironment in AS. After the identification of DEGs between AS subtypes, we performed DO enrichment analysis and confirmed that these genes were strongly connected to AS. To further investigate the biological disparities between AS subtypes, GSVA was employed. Subtype A, characterized by increased expression of NK cell-characteristic genes, was enriched predominantly in pathways related to immune cell-mediated functions. Conversely, subtype B, with lower expression of NK cell-characteristic genes, was enriched mainly in the TGF-β signaling pathway. Interestingly, emerging evidence has revealed that TGF-β serves as a critical regulatory molecule in NK cell biology, exerting an inhibitory effect on NK cell function^[147]34. For example, the secretion of TGF-β impairs the functionality of NK cells in the context of the tumor microenvironment, consequently limiting their efficacy in orchestrating robust antitumor immune responses^[148]35. Specifically, TGF-β1 has been shown to downregulate the expression of NK cell activation receptors and effector molecules, including CD226, NKG2D, NKp30, granzyme B, and perforin^[149]36. Extensive research has shown that TGF-β contributes substantially to AS by enhancing both the migratory and proliferative activities of vascular smooth muscle cells (VSMCs)^[150]37. VSMC-derived extracellular matrix component synthesis mediates atherosclerotic plaque development^[151]38. These findings suggest that targeted modulation of the TGF-β signaling pathway may offer innovative therapeutic avenues for the clinical management of AS. Our results indicate that in patients with high expression of NK cell-characteristic genes, the pathogenesis of AS might be dominated by immune cell activities, whereas in patients with low expression of these genes, TGF-β signaling might play a leading role, which underscores the need for tailored therapeutic strategies for different patient subgroups. For example, based on our research findings, a diagnostic kit developed from NK cell-characteristic genes, combined with imaging techniques for assessing plaque characteristics, can be utilized to classify AS patients into immune-dominant and fibrosis-dominant subtypes. For the immune-dominant subtype, priority should be given to modulating immune cell functions; for the fibrosis-dominant subtype, direct targeting of fibrotic processes, such as inhibiting the TGF-β pathway or blocking ECM deposition, is recommended. Dynamic monitoring can be achieved through regular biomarker testing and imaging follow-ups. Current clinical guidelines focus primarily on managing the complications of AS^[152]39, and effective strategies for preventing or slowing its progression are limited and are largely confined to medications that reduce LDL-C levels. However, AS is not merely a classical lipid metabolic disorder; it also involves the participation of autoimmune responses^[153]40. In pursuit of novel therapeutic strategies for AS, we further analyzed the DEGs between the AS subtypes identified in our previous analyses by constructing a PPI network. This analysis enabled us to pinpoint 5 core genes associated with AS. After searching the DGIdb database, 40 potential drugs or compounds that may hold therapeutic potential for the treatment of AS were identified. This study, adopting a bioinformatics perspective, aimed to elucidate the relationship between AS and NK cells. Employing a multifaceted approach that integrates data from multiple databases, diverse analytical techniques, and corroborative cellular experiments, our methodology is characterized by rigorous data selection and sound analytical strategies, thereby lending credibility to our findings. However, acknowledging the inherent limitations of our study is essential for a comprehensive understanding of its scope and implications. First, drawbacks of datasets: while our analysis of DEGs in AS was based on transcriptomic sequencing data, the complexity of the cellular composition suggests that the use of single-cell sequencing data could enhance the reliability of our results; The demographic characteristics, including ethnicity, of the population from which our datasets were derived were not fully captured. The substantial interpopulation variability in genetic and genomic profiles associated with AS progression may constrain the broader applicability of our research outcomes; The small sample size in our study may decrease the statistical robustness, potentially limiting the discovery of significant associations. To address these limitations, future studies should expand cohort sizes and incorporate diverse populations, including individuals of different ethnicities, genders, and age groups, to ensure broader applicability. Biological approaches based on gene co-expression networks, such as weighted gene co-expression network analysis (WGCNA), could be employed to investigate the association between clinical characteristics and AS. Multicenter studies and meta-analyses that integrate data from independent cohorts could also be conducted to increase the statistical power and validate results across various populations. Second, flaws about experiment: discrepancies between the results obtained from qRT-PCR and bioinformatics analyses highlight the complexity of the molecular mechanisms underlying the interaction between AS and NK cells. With limited research available on the specific roles of target gene markers in AS or NK cell function, the exact mechanisms remain to be elucidated, necessitating further investigation to bridge this knowledge gap; The validation methods employed after the screening of NK cell-characteristic genes were limited and focused primarily on cellular experiments and mRNA levels. To enrich our understanding and strengthen the robustness of our findings, future research could incorporate additional validation approaches, such as animal models and protein-level analysis. Finally, problems with drugs prediction: the exact therapeutic effect of the predicted gene-targeted drugs remains unclear, so validation through in vitro and in vivo experiments is needed to confirm the efficacy of the discovered substances. These constraints highlight the need to refine methodologies and expand data sources in subsequent investigations to elucidate the intricate interplay between NK cells and AS pathogenesis. In summary, we conducted a comprehensive bioinformatics analysis, which revealed that the DEGs in AS are enriched in pathways associated with NK cell-mediated cytotoxicity. This analysis allowed us to identify the NK cell-characteristic genes specific to AS. Furthermore, by correlating these genes with AS subtypes, we were able to discern the core genes of AS and predict potential therapeutic drugs or compounds for the disease. Methods Software tools R software (The R Project for Statistical Computing, version 4.3.1) was employed for statistical analysis and visualization, with usage spanning from May 2023 to December 2023. GraphPad Prism (version 5.0.1) was used for graphical representation, with usage in October 2023. Cytoscape (version 3.9.1) was applied for network analysis and visualization, with usage in November 2023. Acquisition and preprocessing of AS datasets We downloaded microarray datasets ([154]GSE100927, [155]GSE43292, [156]GSE28829, and [157]GSE104140) containing carotid artery sample data from AS patients from the GEO database ([158]https://www.ncbi.nlm.nih.gov/geo/) (Table [159]1). Advanced or diseased carotid artery samples were designated the disease group, and early or healthy samples were designated the control group. Carotid artery sample data from [160]GSE100927 were combined with those from [161]GSE43292, batch-corrected via the “sva” package in R, and subjected to PCA to check for heterogeneity before and after batch correction. Table 1. Sample information of the datasets. GSE dataset Organism Sample number (AS/control) Platform [162]GSE43292 Carotid atheroma 64 (32/32) [163]GPL6244 [164]GSE100927 Artery 104 (69/35) [165]GPL17077 Carotid artery 41 (29/12) [166]GSE28829 Carotid atheroma 29 (16/13) [167]GPL570 [168]GSE104140 Carotid atheroma 32 (19/13) [169]GPL16791 [170]Open in a new tab Identification of differentially expressed genes (DEGs) in AS carotid artery Differential gene expression analysis between the disease group and the control group was conducted with the “limma” package in R. DEGs were identified on the basis of |LogFC| > 1 and an adjusted p value < 0.05. Functional and pathway enrichment analysis To gain insight into the underlying mechanisms associated with the DEGs found in AS carotid artery tissue, we conducted functional enrichment analysis, including GO terms and KEGG pathways, via the “clusterProfiler” package in R. Statistical significance was determined using a threshold of adjusted p values < 0.05. Identification and validation of NK cell-related genes in AS Natural killer cell-related genes were retrieved from the Molecular Signatures Database (MSigDB) and the Immunology Database and Analysis Portal (ImmPort) (Supplementary Table [171]S4). Using the “Venn” package in R, the intersection of DEGs and NRGs was used to identify NK cell-related DEGs in AS as initial biomarker candidates. The expression levels of these biomarkers in the training set and validation sets were visualized via the “ggpubr” package in R. The diagnostic efficacy of these markers for AS was quantified by creating ROC curves with the aid of the “pROC” package, where p values < 0.05 underscore statistical importance. The AUC was calculated to quantitatively evaluate diagnostic performance. Identification of NK cell-characteristic genes in AS via machine learning algorithms To select the optimal biomarker genes for predicting disease status from the identified biomarkers, three machine learning algorithms, including LASSO, RF, and SVM, were constructed via the “glmnet”, “randomForest”, and “kernlab” packages in R, respectively. The residuals and AUCs of the ROC curves were analyzed via the “dalex” and “pROC” packages in R to determine the best model. The RF model was selected as the best predictive model, with parameters set to ensure good performance. Genes were ranked by their "Mean Decrease Gini" scores, and the top 50% were considered NK cell-characteristic genes in AS. Cell culture The THP-1 cell line was obtained from the Chinese Academy of Life Sciences (Shanghai, China). This cell line was originally established from a patient with acute monocytic leukemia and has been maintained in our laboratory for experimental use. No human participants were involved in this study. The THP-1 cells were cultured in RPMI 1640 medium (GIBICO, USA) supplemented with 10% fetal bovine serum (FBS), 100 μg/mL streptomycin, and 100 U/mL penicillin and maintained at 37°C in a 5% CO[2] incubator. Cellular differentiation into macrophages was achieved through 24 hours of incubation with 100 ng/mL phorbol 12-myristate 13-acetate (PMA) (MedChemExpress). Foam cells were further induced by 24 hours of incubation with 50 μg/mL ox-LDL (YiSeng Biotech, Shanghai, China) for use as an AS disease cell model. Quantitative real-time PCR (qRT-PCR) Total RNA was extracted from foam cells and THP-1 cells via TRIzol reagent (Invitrogen, USA) and quantified via a spectrophotometer (Wuzhou Dongfang Technology Development Co., Ltd., Beijing). cDNA was synthesized from RNA via Hifair® III 1st Strand cDNA Synthesis SuperMix (YiSeng Biotech, Shanghai, China) and amplified via Hieff® qPCR SYBR Green Master Mix (YiSeng Biotech, Shanghai, China) for qRT-PCR on a real-time PCR instrument (Thermo Fisher Scientific, Inc.). The primers used were synthesized by Liuhe Huada Gene Technologies Co., Ltd., Beijing (Table [172]2). Relative mRNA expression levels were quantified via the 2^-ΔΔCt method, with three replicates for each experiment. Table 2. Primers used for qRT-PCR. Gene Primer sequence (5′–3′) PTPN6 F GGAGCGAGATCCCTCCAAAAT PTPN6 R GGCTGTTGTCATACTTCTCATGG ITGAL F CTGCTTTGCCAGCCTCTCTGT ITGAL R GCTCACAGGTATCTGGCTATGG TYROBP F ACTGAGACCGAGTCGCCTTAT TYROBP R ATACGGCCTCTGTGTGTTGAG SLAMF7 F TATGCCCCCATTCTGGAGAGA SLAMF7 R CTTGGTGTGTCTGGCATCGT HAVCR2 F TTTCCAAGGATGCTTACCACCAG HAVCR2 R TTGGCCAATCTAGAGTCCCGTAAC HCST F CCATCTGGGTCACATCCTCTT HCST R CGGAACAAGAGCCTGAAGTG LCP2 F GAGGAGCATCTTCACACGCAA LCP2 R CGGCTCATAATCCGCGTCAT VAV3 F GAGCAGGCAACAGCTTGTTAAG VAV3 R TCTCCACCAGCCATTTGCAC GADPH F ACAACTTTGGTATCGTGGAAGG GADPH R GCCATCACGCCCACAGTTTC [173]Open in a new tab Construction and verification of a nomogram for clinical prediction Relying on the selected machine learning model, a nomogram was constructed to estimate the predictive validity of NK cell traits in AS assessment. Calibration curves, decision curves, and clinical impact curves were harnessed to authenticate the clinical accuracy of the model. Consensus clustering to identify AS subtypes Unsupervised consensus clustering analysis was performed on carotid artery samples via the “ConsensusClusterPlus” package in R to identify different disease subtypes in AS patients. The optimal number of clusters (k) was determined by maximizing the internal similarity while maximizing the inter-cluster differences. PCA was used as an auxiliary tool to verify the scientific rationality of the selected k value, and the expression profiles of NK cell-characteristic genes across subtypes were explored. Subtype-immune cell correlation analysis The infiltration levels of 23 distinct immune cell populations in various AS subtypes were quantified via single-sample gene set enrichment analysis (ssGSEA) implemented through the “GSVA” package in R. The connection between NK cell-specific genes and immunocyte density was assessed via Spearman’s rank-order correlation analysis, with statistical significance defined as p values < 0.05. Heatmap visualizations were created via the “pheatmap” package in R. Identification and analysis of DEGs between AS subtypes DEGs between subtypes were identified via the “limma” package in R with an adjusted p value < 0.05 and |LogFC| > 1 as the criteria. Heatmaps were generated via the “pheatmap” package to visualize expression patterns. Using the “clusterProfiler” package, we conducted Disease ontology (DO) enrichment analysis for subtype-specific DEGs, with an adjusted p value < 0.05 considered statistically significant. Gene set variation analysis (GSVA) GSVA was performed to explore the biological differences mediated by NK cell-characteristic genes between different AS subtypes via the “GSVA” package in R, with pathways downloaded from the MSigDB database ("c2.cp.kegg.v2023.1.Hs.symbols.gmt"). Adjusted p values < 0.05 were deemed indicative of statistically significant differences between subtypes. Protein–protein interaction (PPI) network analysis and drug-gene interaction prediction A PPI network was constructed via the search tool for the retrieval of interacting genes/proteins (STRING) ([174]www.STRING-db.org) to integrate known and predicted associations between proteins encoded by DEGs between AS subtypes. We utilized multiple types of interaction scores, including neighborhood on chromosome, gene fusion, phylogenetic co-occurrence, homology, co-expression, experimentally determined interaction, database annotation, automated and text-mining, and derived combined scores to comprehensively evaluate the reliability of protein-protein interactions. The network was visualized via Cytoscape software, and key node genes were identified on the basis of maximal clique centrality (MCC) algorithm scores as core genes in AS. Potential target-binding drugs or compounds for core genes were predicted via the drug-gene interaction database (DGIdb) ([175]http://www.DGIdb.org/). Electronic supplementary material Below is the link to the electronic supplementary material. [176]Supplementary Material 1^ (1.2MB, pdf) Acknowledgements