Abstract Background Our study aimed to identify potential specific biomarkers for osteoarthritis (OA) and assess their relationship with immune infiltration. Methods We utilized data from [34]GSE117999, [35]GSE51588, and [36]GSE57218 as training sets, while [37]GSE114007 served as a validation set, all obtained from the GEO database. First, weighted gene co-expression network analysis (WGCNA) and functional enrichment analysis were performed to identify hub modules and potential functions of genes. We subsequently screened for potential OA biomarkers within the differentially expressed genes (DEGs) of the hub module using machine learning methods. The diagnostic accuracy of the candidate genes was validated. Additionally, single gene analysis and ssGSEA was performed. Then, we explored the relationship between biomarkers and immune cells. Lastly, we employed RT-PCR to validate our results. Results WGCNA results suggested that the blue module was the most associated with OA and was functionally associated with extracellular matrix (ECM)-related terms. Our analysis identified ALB, HTRA1, DPT, MXRA5, CILP, MPO, and PLAT as potential biomarkers. Notably, HTRA1, DPT, and MXRA5 consistently exhibited increased expression in OA across both training and validation cohorts, demonstrating robust diagnostic potential. The ssGSEA results revealed that abnormal infiltration of DCs, NK cells, Tfh, Th2, and Treg cells might contribute to OA progression. HTRA1, DPT, and MXRA5 showed significant correlation with immune cell infiltration. The RT-PCR results also confirmed these findings. Conclusions HTRA1, DPT, and MXRA5 are promising biomarkers for OA. Their overexpression strongly correlates with OA progression and immune cell infiltration. Supplementary Information The online version contains supplementary material available at 10.1186/s12891-024-07758-7. Keywords: Osteoarthritis, Machine learning, Biomarkers, Immune infiltration, Extracellular matrix Introduction Osteoarthritis (OA) is a common chronic degenerative joint disease affecting millions of people worldwide [[38]1]. It is characterized by the cartilage deterioration, subchondral osteosclerosis, osteophyte formation and synovial inflammation [[39]2, [40]3]. The clinical symptoms include joint pain and dysfunction, eventually leading to disability [[41]4]. Recent study indicates that early diagnosis and timely intervention can not only relieve temporary symptoms but also delay OA progression [[42]5]. However, reliable biological markers for OA remain limited, and the pathogenesis of OA is not completely understood. Thus, identifying biomarkers for OA and exploring the molecular mechanisms of its development is essential. Recently, immune cell infiltration has emerged as a significant contributor to OA pathophysiology [[43]6]. These immune cells include mast cells, macrophages, lymphocytes, and neutrophils [[44]7, [45]8]. These cells produce pro-inflammatory mediators, secrete degrading enzymes, and disrupt the metabolism of various joint cells [[46]9, [47]10]. Targeting immune cells may provide a promising disease-modifying strategy for OA; however, the underlying mechanisms of immune cell infiltration require further elucidation. Thus, it is important to explore the genes that regulate this infiltration. In the present study, we initially conducted a weighted gene co-expression network analysis (WGCNA) to identify the module most relevant to OA. Subsequently, we performed functional enrichment analysis along with protein-protein interaction (PPI) assessments to explore the potential roles and interactions of genes within this key module. Leveraging a machine learning algorithm, we identified seven candidate molecules. After validation of their diagnostic accuracy, we obtained the final biomarkers. Lastly, we performed single-gene analysis and single-sample gene set enrichment analyses (ssGSEA) to reveal the biological functions and immunological correlates of these biomarkers. Taken together, the aim of our study was to identify potential specific biomarkers for OA progression and assess their association with immune infiltration. Materials and methods Data collection and processing The gene expression profiles of [48]GSE117999, [49]GSE51588, [50]GSE57218, and [51]GSE114007 were downloaded from the Gene Expression Omnibus (GEO) database. [52]GSE117999 contains 24 samples, including 12 cartilage tissue samples from normal joints and 12 cartilage tissue samples from OA joints. [53]GSE51588 contains 50 samples, including 10 subchondral bone tissue samples from normal joints and 40 subchondral bone tissue samples from OA joints. [54]GSE57218 contains 73 samples, including 7 cartilage tissue samples from normal joints and 66 cartilage tissue samples from OA joints. [55]GSE114007 contains 38 samples, including 18 cartilage tissue samples from normal joints and 20 cartilage tissue samples from OA joints. Batch effects were removed using the “sva” R package, followed by normalization with the “limma” R package [[56]11]. The [57]GSE117999, [58]GSE51588, and [59]GSE57218 cohorts functioned as the training set, while [60]GSE114007 served as the validation set. WGCNA analysis and module selection A gene co-expression network was established using the WGCNA R package with normalized expression data for 16,106 genes in the training set [[61]12]. A hierarchical clustering analysis preceded module detection. The scale-free topology criterion then determined the optimal soft-thresholding power (β). Then, average connectivity and independent traits of each module were assessed using selected β values. A network was subsequently constructed. Gene significance (GS) quantified the relations between individual genes and OA, while module membership (MM) gauged the importance of the gene profile within modules. The module demonstrating the highest GS and MM was identified as the hub module Genes with |log2FC| > 1 and false discovery rate (FDR) < 0.05, were identified as differentially expressed genes (DEGs). Functional enrichment analysis Genes in the hub module were extracted for Gene Ontology (GO) terms and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis using the “clusterProfiler” R package [[62]13]. Terms and pathways with achieving FDR < 0.05 were considered significantly enriched. Construction of the PPI network PPI networks of genes in the hub module were predicted using the STRING database ([63]http://string-db.org; version 11.5) [[64]14]. PPI networks were constructed with an interaction score of > 0.4. Screening and verifying biomarkers Three machine learning methods-least absolute shrinkage and selection operator (LASSO) logistic regression, support vector machine (SVM), and random forest (RF)-were used to screen biomarkers for OA. The LASSO algorithm was conducted via the “glmnet” R package. The SVM module was implemented using “e1071” and “kernlab” R package. RF analysis was performed using “randomForest” R package. The intersection of results from LASSO, SVM, and RF constituted the candidate diagnostic gene for further analysis. Statistical significance was set at P < 0.05. Boxplots were used to compare the differential expression of biomarkers between healthy and OA tissues in both the training and validation sets. Receiver operating characteristic (ROC) curve analysis was used to assess the diagnostic accuracy. Single gene analysis Single gene analysis included differential expression analysis, GO, KEGG enrichment analysis, and gene set enrichment analyses (GSEA). Based on the median expression levels of each key genes (HTRA1, DPT, and MXRA5), we categorized 98 osteoarthritis (OA) patients into high and low expression groups. The identified DEGs were used for the GO, KEGG, and GSEA enrichment analyses. The GO and KEGG analysis were carried out as before. GSEA was conducted using GESA 3.0 software [[65]15]. Evaluating immune cell infiltration The ssGSEA methods was applied to with “GSVA” R package to to evaluate immune cell infiltration levels. The cell-specific markers of the 28 types of immune cells were obtained from a previous study [[66]16]. We compared immune cell counts across groups and assessed correlations among the 28 infiltrating immune cell types. Correlation analysis between biomarkers and immune cells Spearman correlation analysis was conducted to explore potential relationships between biomarker expression and infiltrating immune cells with “corrplot” R package. The result was visualized with “ggplot2” R package. Real-time polymerase chain reaction (RT-PCR) validation To validate our bioinformatics findings, we harvested cartilage and subchondral bone from five OA patients undergoing total joint arthroplasty for RT-PCR confirmation. This study was approved by the Ethical Committees of our hospital (2023A02), and informed consent was obtained from all patients. We categorized tissues into intact cartilage with underlying subchondral bone (control samples) and degenerative cartilage with underlying subchondral bone (OA samples). Total RNA was extracted from tissues using TRIzol reagent (Invitrogen, USA). Total RNA was then reverse-transcribed to cDNA and RT-PCR was performed using a PCR Kit (Promega, China) following the manufacturer’s instructions. The target genes were HTRA1, DPT, and MXRA5, and GAPDH was used as an internal reference. The primers used are listed in Table [67]1. The relative mRNA expression was calculated using the 2^−ΔΔCt method. Table 1. Primer sequences used for RT-qPCR Gene Sequences (5’-3’) HTRA1 F: TCCCAACAGTTTGCGCCATAA R: CCGGCACCTCTCGTTTAGAAA DPT F: GGGGCCAGTATGGCGATTATG R: CGGTTCAAATTCACCCACCC MXRA5 F: CCTTGTGCCTGCTACGTCC R: TTGGTCAGTCCTGCAAATGAG GAPDH F: GGCAAAGTGGAGATTGTTGCC R: CTTCCCATTCTCGGCCTTGA [68]Open in a new tab RT-qPCR, real time quantitative polymerase chain reaction; F, forward; R, reverse; HTRA1, high temperature requirement A1; DPT, dilute prothrombin time; MXRA5, matrix remodeling associated 5; GAPDH, glyceraldehyde-3-phosphate dehydrogenase Results Identification of the key module and enrichment analysis To identify the key module most associated with OA, WGCNA was performed using the expression levels of 16,106 genes. Clustering analysis facilitated module detection, yielding seven distinct modules represented by unique colors (Fig. [69]1A). The soft thresholding power β was determined to be 6 (Fig. [70]1B). As shown in Fig. [71]1C, the blue module showed the strongest association with OA (R = 0.37, P = 6e-06). Thus, the blue module, containing 327 genes, was considered the key module. Fig. 1. [72]Fig. 1 [73]Open in a new tab Key module detection and enrichment analysis. (A) Clustering dendrogram of genes, each module was displayed by distinct colors to distinguish different modules. (B) Determination of the soft thresholding power. (C) Heatmap of the correlations between gene modules and phenotypes. (D) GO terms analysis of the key module genes. (E) KEGG pathways analysis of the key module genes. (F) PPI network Functional enrichment analysis was performed to explore the potential functions of the genes in the key module. GO results showed that the main biological process (BP) involved extracellular structure organization and extracellular matrix (ECM) organization, the cellular component (CC) primarily consisted of the ECM and basement membrane, and the molecular function (MF) mainly encompassed ECM structural constituents, glycosaminoglycan binding, and growth factor binding (Fig. [74]1D). KEGG pathway enrichment analysis showed that revealed significant involvement of these genes in focal adhesion, ECM-receptor interactions, the Hippo signaling pathway, and the TGF-β signaling pathway (Fig. [75]1E). These results imply that ECM alterations may play a pivotal role in OA. Subsequently, a PPI network was constructed to investigate interactions among these genes (Fig. [76]1F). Screening and verifying biomarkers A total of 52 DEGs from the key module were extracted. The heatmap showed the differences in gene expression between OA patients and healthy controls (Fig. [77]2A). Biomarker selection was performed using the training set. The LASSO regression model revealed 22 DEGs relevant to OA (Fig. 2BC). Using the SVM algorithm, we identified 16 DEGs as potential biomarkers (Fig. [78]2D). Additionally, the RF algorithm was applied (Fig. [79]2E), and the top ten genes are shown in Fig. [80]2F. Detailed results from the three machine learning methods are available in Supplemental Table [81]1. Ultimately, by considering the overlaps among the three methods, seven biomarkers for OA were identified (Fig. [82]2G). These included ALB, HTRA1, DPT, MXRA5, CILP, MPO, and PLAT. Fig. 2. [83]Fig. 2 [84]Open in a new tab Screening the biomarkers. (A) Heatmap present the identified DEGs in key module between OA patients and healthy controls. (B) LASSO logistic regression algorithm for screening the biomarkers. (C) Confidence interval for each lambda. (D) SVM algorithm screening of the biomarkers. (E) RF algorithm screening of the biomarkers. (F) Top 10 genes in RF algorithm. (G) Venn diagram displaying the intersection between biomarkers identified using the three algorithms We evaluated the diagnostic accuracy of these seven biomarkers in both training and validation datasets. Analysis of expression levels among biomarkers in the training sets indicated that ALB and MPO exhibited reduced expression in OA patients, while HTRA1, DPT, MXRA5, CILP, and PLAT presented increased expression levels (P < 0.001) (Fig. [85]3A). The AUCs of ALB, HTRA1, DPT, MXRA5, CILP, MPO, and PLAT for OA were 0.820, 0.813, 0.773, 0.772, 0.740, 0.785, and 0.778, respectively (Fig. [86]3B). Notably, in the validation set, only three genes (HTRA1, DPT, and MXRA5) were differentially expressed. Consistent with training set findings, these three genes showed significantly higher expression levels in the OA group (P < 0.001) (Fig. [87]3C). The AUCs of HTRA1, DPT, and MXRA5 for OA diagnosis in the validation set were 0.964, 0.844, and 0.947, respectively (Fig. [88]3D). Taken together, these results indicate that HTRA1, DPT, and MXRA5 are valuable biomarkers for OA. Fig. 3. [89]Fig. 3 [90]Open in a new tab Validating the biomarkers. (A) Difference analysis of the expression levels of biomarkers in the training sets. (B) ROC curve analysis of the diagnostic efficacy of identified genes the training sets. (C) Difference analysis of the expression levels of biomarkers in the validation set. (D) ROC curve analysis of the diagnostic efficacy of identified genes the validation set. ^*P < 0.05, ^**P < 0.01, and ^***P < 0.001 Single gene analysis of hub biomarkers To better understand the role of the three hub biomarkers in OA, single-gene analysis was performed. Based on the median expression level of HTRA1, OA patients were divided into high- and low-expression groups. The high-expression group exhibited differential expression of 209 genes compared to the low-expression group, with 144 genes upregulated and 65 downregulated (Fig. [91]4A). GO functional enrichment analysis revealed that these DEGs primarily participated in ossification, development of bone and cartilage, and extracellular matrix organization (Fig. [92]4B). KEGG pathway analysis showed that these DEGs were significantly enriched in ECM-receptor interaction, cytokine-cytokine receptor interaction, focal adhesion, and PI3K-Akt signaling pathway (Fig. [93]4C). GSEA analysis showed that the gene set of the low HTRA1 expression OA group was mainly involved in signaling pathways such as the adipocytokine signaling pathway, cytokine-cytokine receptor interaction, and neuroactive ligand receptor interaction (Fig. [94]4D), whereas the gene set of the high HTRA1 expression group substantially participated in the cell cycle, ECM receptor interaction, focal adhesion, and ribosome (Fig. [95]4E). Fig. 4. [96]Fig. 4 [97]Open in a new tab Single gene analysis of HTRA1. (A) Heatmap present the identified DEGs between low- and high- HTRA1 expression OA patients. (B) GO terms analysis of HTRA1-related DEGs. (C) KEGG pathways analysis of the HTRA1-related DEGs. (D) GSEA-KEGG enrichment analysis in low HTRA1 expression OA group. (E) GSEA-KEGG enrichment analysis in high HTRA1 expression OA group, saved the top five enriched pathways Similarly, based on the median DPT expression level, patients with OA were divided into high- and low-expression groups. The high-expression group showed differential expression of 93 genes relative to the low-expression group, with 76 genes upregulated and 17 downregulated (Fig. [98]5A). GO functional enrichment analysis revealed that these DEGs were mainly involved in collagen metabolic processes, bone and cartilage development, and extracellular matrix organization (Fig. [99]5B). KEGG pathway analysis showed that these DEGs were significantly enriched in protein digestion and absorption, ECM-receptor interaction, focal adhesion, and phagosome (Fig. [100]5C). GSEA analysis showed that the gene set of the low DPT expression OA group was mainly involved in signaling pathways such as the adipocytokine signaling pathway, PPAR signaling pathway, and T cell receptor signaling pathway (Fig. [101]5D), whereas the gene set of the high DPT expression group substantially participated in ECM-receptor interaction, focal adhesion, oxidative phosphorylation, lysosomes, and ribosomes (Fig. [102]5E). Fig. 5. [103]Fig. 5 [104]Open in a new tab Single gene analysis of DPT. (A) Heatmap present the identified DEGs between low- and high- DPT expression OA patients. (B) GO terms analysis of DPT-related DEGs. (C) KEGG pathways analysis of the DPT-related DEGs. (D) GSEA-KEGG enrichment analysis in low DPT expression OA group. (E) GSEA-KEGG enrichment analysis in high DPT expression OA group, saved the top five enriched pathways Following the same approach, we evaluated MXRA5 expression levels to categorize OA patients into high- and low-expression groups. The high-expression group manifested differential expression of 190 genes compared to the low-expression group, with 136 genes upregulated and 54 downregulated (Fig. [105]6A). GO functional enrichment analysis revealed that these DEGs were mainly involved in ossification, bone and cartilage development, and extracellular matrix organization (Fig. [106]6B). KEGG pathway analysis showed that these DEGs were significantly enriched for protein digestion and absorption, ECM-receptor interaction, focal adhesion, tyrosine metabolism, and the PI3K-Akt signaling pathway (Fig. [107]6C). GSEA analysis showed that the gene set of the low MXRA5 expression OA group was mainly involved in signaling pathways such as adipocytokine signaling pathway, fatty acid metabolism, and PPAR signaling pathway (Fig. [108]6D), whereas the gene set of the high MXRA5 expression group substantially participated in the cell cycle, ECM-receptor interaction, focal adhesion, and lysosomes (Fig. [109]6E). Fig. 6. [110]Fig. 6 [111]Open in a new tab Single gene analysis of MXRA5. (A) Heatmap present the identified DEGs between low- and high- MXRA5 expression OA patients. (B) GO terms analysis of MXRA5-related DEGs. (C) KEGG pathways analysis of the MXRA5-related DEGs. (D) GSEA-KEGG enrichment analysis in low MXRA5 expression OA group. (E) GSEA-KEGG enrichment analysis in high MXRA5 expression OA group, saved the top five enriched pathways Immune infiltration analysis ssGSEA was performed to analyze differences in immune cell infiltration between OA and healthy tissues. Compared to healthy tissue, O exhibited a greater abundance of activated dendritic cells (aDCs), immature dendritic cells (iDCs), cytolytic activity, inflammation-promoting factors, parainflammation, natural killer (NK) cells, follicular helper T cells (Tfh), T helper 2 (Th2) cells, and type I interferon responses, while regulatory T cells (Tregs) were found in lower proportions (Fig. [112]7A). A heatmap illustrated the correlations among these immunocytes (Fig. [113]7B). Fig. 7. [114]Fig. 7 [115]Open in a new tab Evaluation of immune cell infiltration. (A) The difference of immune infiltration between OA and healthy controls. (B) Correlation heat map of 28 types of immune cells. ^*P < 0.05, ^**P < 0.01, and ^***P < 0.001 Correlation of HTRA1, DPT and MXRA5 with immune cell infiltration We further examined the relationship between immune cell infiltration and the expression levels of HTRA1, DPT, and MXRA5. In patients with high HTRA1 expression, Th1 cells, B cells, NK cells, tumor-infiltrating lymphocytes (TIL), T cell co-stimulation, T cell co-inhibition, inflammation-promoting factors, checkpoint activity, cytolytic activity, and CD8 + T cells were present in lower proportions compared to patients with low HTRA1 expression (Fig. [116]8A). Correlation analysis indicated a negative association between HTRA1 expression and these immune cells, while a positive correlation existed with Tfh expression (Fig. [117]8B). Fig. 8. [118]Fig. 8 [119]Open in a new tab Correlation between HTRA1, DPT, MXRA5, and infiltrating immune cells. (A) The difference of immune infiltration between low- and high- HTRA1 expression OA patients. (B) Correlation between HTRA1 and infiltrating immune cells. ^*P < 0.05, ^**P < 0.01, and ^***P < 0.001 In patients with high DPT expression, proportions of B cells, cytolytic activity, CD8 + T cells, Th1 cells, and NK cells were lower, whereas higher proportions of parenchymal inflammation, macrophages, type I interferon responses, MHC class I molecules, neutrophils, HLA, dendritic cells (DCs), and plasmacytoid dendritic cells (pDCs) were observed (Fig. [120]8C). Correlation analysis revealed a positive association of DPT with elevated immune cell levels but a negative correlation with CD8 + T cells and Th1 cells (Fig. [121]8D). For patients with low MXRA5 expression, proportions of Th1 cells, B cells, NK cells, CD8 + T cells, cytolytic activity, inflammation-promoting factors, T cell co-stimulation, and TIL were lower compared to those with high MXRA5 expression. Conversely, higher proportions of macrophages, inflammation, Tfh, type I interferon responses, and T helper cells were noted in the high-expression group (Fig. [122]8E). Correlation analysis indicated that MXRA5 positively correlated with elevated immune cell levels (excluding T helper cells) but negatively correlated with Th1 cells, B cells, NK cells, CD8 + T cells, cytolytic activity, and iDCs (Fig. [123]8F). In summary, our findings suggest that HTRA1, DPT, and MXRA5 play significant roles in the regulation of various immune cells in OA. RT-PCR validation of three biomarkers The mRNA levels of the three crucial OA biomarkers were validated through RT-PCR. The expression levels of HTRA1, DPT, and MXRA5 were markedly higher in OA samples than in control samples (Fig. [124]9), aligning with the bioinformatics analysis results. Fig. 9. [125]Fig. 9 [126]Open in a new tab RT-PCR validation of the hub biomarkers between OA and healthy controls. All experiments were performed in triplicate. ^*P < 0.05, ^**P < 0.01, and ^***P < 0.001 Discussion OA is the most common degenerative joint disease and is associated with progressive degeneration of articular cartilage, ECM, and subchondral bone [[127]17]. Nonetheless, critical genes and mechanisms underlying OA progression remain poorly understood, complicating early targeted intervention [[128]10, [129]18]. Recent studies highlight the importance of immune cell infiltration in OA progression [[130]6]. Therefore, identifying new biomarkers and exploring their relationship with immune cell infiltration patterns may provide promising early intervention strategies and improve the prognosis. Our results revealed that HTRA1, DPT, and MXRA5 are upregulated in OA and could serve as promising biomarkers for OA. However, the specific roles of these biomarkers in OA pathophysiology require further elucidation. HTRA1 is a serine protease that modulates several biological processes including cell proliferation, migration, and fate determination [[131]19]. HTRA1 can trigger ECM substrate degradation and contribute to various diseases, including cancer and neurodegenerative disorders [[132]20–[133]22]. In addition, HTRA1 promotes the expression of matrix metalloproteinases (MMPs) focal adhesion [[134]23]. Emerging evidence indicates that HTRA1 facilitates inflammation and immune cell infiltration by activating the NF-kB signaling pathway [[135]24, [136]25]. It may also exert an adverse impact on bone and cartilage formation via the TGF-β and BMP pathways [[137]26, [138]27]. Moreover, after exposure to inflammatory cytokines, elevated HTRA1 levels correlate with cartilage degeneration, implying that heightened HTRA1 expression may precipitate OA progression [[139]28]. DPT serves as a key component of the ECM, critical for cell adhesion and ECM assembly. It interacts with various proteins, enhancing biological activities and regulating the cell cycle and differentiation [[140]29–[141]31]. These effects may be achieved via the TGF-β, P13K/Akt, YAP, and Wnt signaling pathways [[142]29, [143]32–[144]34]. Furthermore, DPT is associated with ECM remodeling in contexts like adipose tissue, cancer, cardiac disorders, and wound healing [[145]35–[146]38]. Increased DPT expression correlates with elevated inflammation levels, promoting pro-inflammatory factors [interleukin (IL)-6, IL-8, tumor necrosis factors (TNF) and lipopolysaccharide (LPS)] and suppressing anti-inflammatory markers (IL-4) [[147]35, [148]37, [149]39]. MXRA5, a secreted proteoglycan with VEGF receptor activity, is expressed in primates and mammals but not in mice or rats [[150]40]. Like the other two biomarkers, MXRA5 is crucial for cell adhesion and ECM remodeling [[151]41]. Its expression is elevated in chronic cardiac disorders, kidney diseases, and radiation-induced fibrosis [[152]42–[153]45]. Additionally, MXRA5 participates in tumorigenesis and could serve as a potential biomarker for early colorectal cancer, gastric cancer, glioma, and acute myeloid leukemia diagnosis [[154]46–[155]49]. Recent studies suggest that MXRA5 overexpression correlates with the expression of immune checkpoint molecules and the extent of immune cell infiltration [[156]42, [157]47, [158]49]. Although abnormal alternations of these 28 types of immune cells in OA development have been widely discussed, results varied considerably across studies. Our study again investigated the significant changes in immune cell subpopulations and their functions within OA cartilage and subchondral bone. We observed increased infiltration of DCs, NKs, Tfh, Th2, alongside decreased Tregs, which may exacerbate OA. DCs, as primary mediators of immune response, play a crucial role in recruiting proinflammatory immune cells such as macrophages and neutrophils, while activating inflammatory T cells [[159]50]. Macrophages serve as both initiators and effectors in OA, significantly contributing to immune recruitment, releasing pro-inflammatory cytokines, and mediating ECM degradation [[160]51]. Neutrophils release enzymes that that cartilage ECM [[161]52]. T cells secrete pro-inflammatory cytokines, and their accumulation in joints signifies OA progression [[162]53, [163]54]. Therefore, analyzing the immune landscape of these cells in OA illuminates mechanisms driving OA progression, aiding in the identification of novel therapeutic target. Next, we investigated correlations between HTRA1, DPT, MXRA5, and immune cells. Importantly, our findings indicated that high expression levels of these three genes consistently aligned with reduced Th1 cells, B cells, and NK cells, yet increased macrophages and Tfh cells. Prior research noted that HTRA1 promotes macrophage infiltration during anomalous ocular angiogenesis [[164]24]. Another bioinformatic study revealed that a strong positive correlation between DPT and macrophage infiltration in breast cancer [[165]55]. In addition, Sun et al. and Wen et al. suggested the significant association of MXRA5 with macrophage infiltration [[166]42, [167]47]. Thus, we propose that HTRA1, DPT, and MXRA5 play critical roles in the OA immune microenvironment. Nonetheless, further investigations are essential to validate the complex interplay between these genes and immune cells. Our study has some limitations. First, the sample size remains relatively limited. Second, the diagnostic accuracy of these three genes should be assessed in a large external cohort. Finally, the potential mechanisms revealed in this study require validation through in vitro and in vivo experiments. Conclusion Utilizing a combination of WCGNA and machine learning methods, our study identified elevated levels of HTRA1, DPT, and MXRA5 in OA, suggesting their potential as promising novel biomarkers. Additionally, our research indicates that immune cells might play a role in OA progression. HTRA1, DPT, and MXRA5 demonstrate a close correlation with immune cell infiltration. These findings not only unveil potential mechanisms underlying OA progression but also present novel critical biomarkers and promising immunotherapeutic targets for OA treatment. Electronic supplementary material Below is the link to the electronic supplementary material. [168]Supplementary Material 1^ (11.6KB, docx) Acknowledgements