Abstract Background Rheumatoid arthritis (RA), a chronic autoimmune inflammatory disease, is often characterized by persistent morning stiffness, joint pain, and swelling. Early diagnosis and timely treatment of RA can effectively delay the progression of the condition and significantly reduce the incidence of disability. In the study, we explored the function of pyroptosis-related genes (PRGs) in the diagnosis and classification of rheumatoid arthritis based on Gene Expression Omnibus (GEO) datasets. Method We downloaded the [41]GSE93272 dataset from the GEO database, which contains 35 healthy controls and 67 RA patients. Firstly, the [42]GSE93272 was normalized by the R software “limma” package. Then, we screened PRGs by SVM-RFE, LASSO, and RF algorithms. To further investigate the prevalence of RA, we established a nomogram model. Besides, we grouped gene expression profiles into two clusters and explored their relationship with infiltrating immune cells. Finally, we analyzed the relationship between the two clusters and the cytokines. Result CHMP3, TP53, AIM2, NLRP1, and PLCG1 were identified as PRGs. The nomogram model revealed that decision-making based on established model might be beneficial for RA patients, and the predictive power of the nomogram model was significant. In addition, we identified two different pyroptosis patterns (pyroptosis clusters A and B) based on the 5 PRGs. We found that eosinophil, gamma delta T cell, macrophage, natural killer cell, regulatory T cell, type 17 T helper cell, and type 2 T helper cell were significant high expressed in cluster B. And, we identified gene clusters A and B based on 56 differentially expressed genes (DEGs) between pyroptosis cluster A and B. And we calculated the pyroptosis score for each sample to quantify the different patterns. The patients in pyroptosis cluster B or gene cluster B had higher pyroptosis scores than those in pyroptosis cluster A or gene cluster A. Conclusion In summary, PRGs play vital roles in the development and occurrence of RA. Our findings might provide novel views for the immunotherapy strategies with RA. Keywords: rheumatoid arthritis, pyroptosis, immunity, consensus clustering, bioinformatic analysis 1. Introduction Rheumatoid arthritis (RA), a chronic autoimmune inflammatory disease, is often characterized by persistent morning stiffness, joint pain and swelling ([43]1). RA affects approximately 1% of the world population and has become one of the most common causes of significant disability ([44]2). Although the pathogenesis and etiology of RA have not been fully known, the interaction of environmental, genetic, and immunological factors has been shown to play an important role in the development of RA ([45]3). Early diagnosis and timely treatment of RA can effectively delay the progression of the condition and significantly reduce the incidence of disability ([46]4). Therefore, screening for diagnostic genes associated with RA, exploring their subtype classification, and elucidating the underlying pathogenesis of RA could be effective in preventing and treating RA, and might provide new approaches for clinical treatment of RA. Pyroptosis, a novel inflammatory programmed cell death, is mediated by the caspase family and the GSDM protein family ([47]5). Pyroptosis is characterized by cell swelling and cell membrane rupture, and the release of pro-inflammatory cytokines that eventually induce and aggravate the inflammatory response ([48]6). Increasing studies conformed that pyroptosis might play a key role in the development of many immune diseases ([49]7). In the arthritic mouse model, NLRP3^-/- or Caspase-1^-/- mice could alleviate symptoms of arthritis ([50]8). Gsdme^-/- mice have been demonstrated to reduce intestinal inflammation in the inducible colitis model ([51]9). Besides, bronchial epithelial cell pyroptosis promotes airway inflammation in asthmatic mice ([52]10). However, the role of pyroptosis-related genes (PRGs) in RA remains unclear. In the research, we used bioinformatics methods to investigate the function of PRGs in the diagnosis and classification of rheumatoid arthritis form the Gene Expression Omnibus (GEO) datasets. Firstly, we identified differential expression of PRGs from the [53]GSE93272 dataset. Then, we screened 5 PRGs associated with RA by support vector machine-recursive feature elimination (SVM-RFE), least absolute shrinkage and selection operator (LASSO) logistic regression and random forest (RF) algorithms, and established a nomogram model for predicting the prevalence of RA. In addition, we divided gene expression profiles into two clusters and explored their relationship with infiltrating immune cells. Finally, we further analyze the relationship between two clusters and cytokines. We found that the pyroptosis-related pattern could distinguish RA patients from normal people and provide new directions for the prevention and treatment of RA. 2. Materials and methods 2.1. Data acquisition and preprocessing The microarray datasets were downloaded from the GEO database ([54]https://www.ncbi.nlm.nih.gov/geo/) using “rheumatoid arthritis”, “whole blood,” and “Homo sapiens” as keywords. The inclusion criteria were as follows: the whole-genome expression profiling of whole blood of RA patients and healthy control samples was available in the datasets; every dataset contained a sample count of > 20; and all included samples were not treated with drugs. The microarray dataset [55]GSE93272 from the [56]GPL570 platform containing 35 healthy controls and 67 RA patients was downloaded from the GEO database ([57]11). 2.2. Identification of differentially expressed PRGs The [58]GSE93272 cohort was normalized by the “limma” package of R software ([59]12). Based on previous literatures ([60]13–[61]15), we acquired 52 PRGs. However, we did not find the expression data of GSDMA in [62]GSE93272. Therefore, 51 PRGs were used for the following analysis. Then, we identified differentially expressed PRGs in RA and normal samples using the “limma” package. The p-value < 0.05 was considered a significant difference. Heatmap and boxplot were performed using the R packages “pheatmap” and “ggpubr” to visualize the differentially expressed PRGs. 2.3. Screening of PRGs for RA Based on the differentially expressed PRGs, three feature selection algorithms, including SVM-RFE ([63]16), LASSO logistic regression ([64]17) and RF algorithm ([65]18) were adapted to screen RA-related biomarkers, respectively. The SVM-RFE algorithm was performed by the R packages “e1071” and “caret” with five-fold cross-validation ([66]19). The LASSO logistic regression was employed with the R package “glmnet” ([67]20). The RF algorithm was analyzed by the R package “randomForest” ([68]21). Then, the “venn” R package was used to select overlapping genes from the three algorithms as signature genes for further analysis. 2.4. Construction of a nomogram model We constructed a nomogram model based on PRGs (CHMP3, TP53, AIM2, NLRP1, and PLCG1) to predict the occurrence of RA patients with the “rms” package in R ([69]22). The calibration curve was used to assess the predictive performance of the nomogram model. Then, we further performed decision curve analysis (DCA) and clinical impact curve analysis (CICA) to estimate the clinical utility of the nomogram model ([70]23). 2.5. Consensus clustering Consensus clustering is an algorithm for identifying cluster each member and their number in datasets ([71]24). We utilized the consensus clustering method to distinguish distinct pyroptosis-related clinical subtypes of RA and identify different PRGs patterns based on the significant differentially expressed PRGs with the package “ConsensusClusterPlus” in R ([72]25). “Points” represents the score of the corresponding factor below and “Total Points” indicates the summation of all the scores of factors above. 2.6. Estimation of the pyroptosis gene signature To quantify the pyroptosis patterns, we used principal component analysis (PCA) algorithms to calculate the pyroptosis score for each RA sample. The Principal Component 1 (PC1) and Principal Component 2 (PC2) were chosen as the signature scores. And pyroptosis scores for each RA patient were calculated using the following formula ([73]26, [74]27): Pyroptosis Score = Σ(PC1[i] + PC2[i]), where i is the expression of PRGs. 2.7. Estimation of immune cell infiltration for RA The single-sample gene-set enrichment analysis (ssGSEA) was employed to measure the relative abundance of immune cells in RA samples via the R packages “limma”, “GSVA”, and “GSEABase” ([75]28). And the gene set for marking each immune cell type was obtained from the study of Charoentong ([76]29). 2.8. Functional and pathway enrichment analysis To investigate the functional and molecular pathways of differentially expressed genes between pyroptosis gene clusters A and B, we performed GO, KEGG enrichment analyses by the “colorspace”, “stringi” and “ggplot2” packages in R ([77]30, [78]31). P < 0.05 was considered statistically significant. 2.9. Statistical analysis The Kruskal-Wallis test was adopted to compare differences between normal samples and RA samples. The significant differences were identified with the p-value < 0.05. All statistical analysis were performed using the R version 4.0.3. 3. Results 3.1. The landscape of the differentially expressed PRGs We analyzed the differential expression levels of 51 PRGs between RA patients and healthy controls using the “limma” R package ([79] Supplementary Table 1 ). A heatmap and histogram were used to visualize the 23 differentially expressed PRGs. We found that BAX, CASP1, CASP3, CASP4, CASP5, CHMP2B, CHMP3, HMGB1, IL18, IL1A, AIM2, NLRC4, NOD2, TNF, and GZMA were overexpressed in RA patients compared to healthy controls ([80] Figures 1A, B ). Figure 1. [81]Figure 1 [82]Open in a new tab Landscape of the 23 PRGs. (A) Expression heatmap of the 23 PRGs in healthy control and RA patients. (B) Expression histogram of the 23 PRGs in healthy control and RA patients. (C) The PRGs screened using the LASSO logistic regression algorithm. (D) The hub genes identified via the RF algorithm. (E) The PRGs recognized using SVM-RFE algorithm. (F) Venn diagram showing the intersection among PRGs genes between the three algorithms. * means P < 0.05, ** means P < 0.01, *** means P < 0.001. 3.2. Identification of characteristic genes To further screen the characteristic genes related to PRGs for RA, we utilized the LASSO logistic regression algorithm, the RF algorithm, and the SVM-RFE analysis for feature identification ([83] Supplementary Table 2 ). Thirteen genes from differentially expressed PRGs were identified as biomarkers for RA using the LASSO logistic regression algorithm ([84] Figure 1C ). We used RF algorithm to detect nine key genes from differentially expressed PRGs as vital biomarkers ([85] Figure 1D ). Eight signature genes were identified from differentially expressed PRGs by the SVM-RFE analysis ([86] Figure 1E ). Finally, we overlapped three different algorithms analysis results and obtained 5 genes (CHMP3, TP53, AIM2, NLRP1, and PLCG1) that were significantly related to RA ([87] Figure 1F ). 3.3. Construction of the nomogram To predict the prevalence of RA patients, we constructed a nomogram model based on the 5 PRGs ([88] Figure 2A ). As shown in [89]Figure 2B , the calibration curve of the nomogram revealed accurate predictive ability. The DCA result revealed that decision-making based on established models may be beneficial for RA patients ([90] Figure 2C ). And the CICA result ([91] Figure 2D ) found that the predictive power of the nomogram model was significant. Figure 2. [92]Figure 2 [93]Open in a new tab Construction of the nomogram model. (A) Construction of the nomogram model based on the 5 PRGs. (B) Predictive robustness of the nomogram model as revealed by the calibration curve. (C) Decisions based on the nomogram model may benefit RA patients. (D) Clinical impact of the nomogram model as assessed by the clinical impact curve. 3.4. Two distinct pyroptosis patterns Based on the 5 PRGs, we identified two different pyroptosis patterns (cluster A and cluster B) using the consensus clustering method ([94] Figure 3A and [95]Supplementary Figure 1 ). There were 38 cases in cluster A and 29 cases in cluster B. We plotted the histogram to observe the differential expression levels of the 5 PRGs between the two clusters. TP53, NLRP1, and PLCG1 showed higher expression in pyroptosis gene cluster A than in pyroptosis gene cluster B, while AIM2 revealed the opposite results. And CHMP3 showed no differently expressed between the two patterns ([96] Figure 3B ). As shown in [97]Figure 3C , the two pyroptosis patterns could be distinguished though the 5 significant PRGs with PCA analysis. Then, the differential immune cell infiltration between the two pyroptosis patterns was analyzed ([98] Figure 3D ). We found that eosinophil, gamma delta T cell, macrophage, natural killer cell, regulatory T cell, type 17 T helper cell, and type 2 T helper cell were significant high expressed in cluster B (p < 0.05). Besides, we calculated the abundance of immune cells in RA patients and evaluated the correlation between the 5 PRGs and immune cells ([99] Figure 3E ). Figure 3. Figure 3 [100]Open in a new tab Consensus clustering of the 5 PRGs. (A) Consensus matrices of the 5 PRGs for k = 2. (B) Differential expression histogram of the 5 PRGs in gene cluster A and B. (C) PCA for the expression profiles of the 5 PRGs. (D) Differential immune cell infiltration between gene cluster A and B. (E) Correlation between infiltrating immune cells and the 5 PRGs. * means P < 0.05, ** means P < 0.01, *** means P < 0.001. 3.5. Function and pathway enrichment A total of 56 differentially expressed genes (DEGs) were identified between the two pyroptosis patterns. To further explore the potential functional and molecular pathways of DEGs, we performed GO and KEGG enrichment analyses, and the results were shown through an enrichment circle diagram. In the GO enrichment analysis of differential expression PRGs, biological processes (BP) terms were correlated with defense response to virus (GO:0051607) and defense response to symbiont (GO:0140546); cellular components (CC) terms were related to tertiary granule (GO:0070820) and early endosome (GO:0005769); and molecular functions (MF) terms were associated with double stranded RNA binding (GO:0003725) and pattern recognition receptor activity (GO:0038187) ([101] Figure 4A ; [102]Supplementary Table 3 ). The results of KEGG enrichment analysis revealed that DEGs were significantly enriched in the NOD-like receptor signaling pathway and the NF-kappa B signaling pathway ([103] Figure 4B ; [104]Supplementary Table 4 ). Figure 4. [105]Figure 4 [106]Open in a new tab The functional enrichment analyses of DEGs. (A) The GO analyses results for DEGs; (B) The KEGG analysis results for DEGs. 3.6. Identification of two distinct gene patterns To further verify the pyroptosis patterns, we classified the RA patients into different genetic subtypes and termed as gene cluster A and B based on the 56 DEGs by using the consensus clustering method ([107] Figure 5A ; [108]Supplementary Figure 2 ). There were 37 cases in gene cluster A and 30 in gene cluster B. As shown in [109]Figure 5B , the heatmap displayed the expression levels of the 56 DEGs in gene clusters A and B. In addition, we found that the differential expression levels of the 5 significant PRGs and immune cell infiltration between gene cluster A and B were consistent with those in the pyroptosis patterns ([110] Figures 5C, D ). The result again demonstrated the accuracy of dividing into distinct subtypes. Furthermore, we also compared the pyroptosis score between the two distinct pyroptosis patterns or DEGs patterns. The result revealed that the pyroptosis score in cluster B or gene cluster B was significantly higher than that in cluster A, or gene cluster A ([111] Figure 6A ). The relationship between pyroptosis patterns, pyroptosis gene patterns, and pyroptosis scores was visualized in a Sankey diagram ([112] Figure 6B ). Figure 5. [113]Figure 5 [114]Open in a new tab Consensus clustering of the DEGs. (A) Consensus matrices of the 56 DEGs for k = 2. (B) Expression heatmap of the 56 DEGs in gene cluster A and B. (C) Differential expression histogram of the 5 PRGs in gene cluster A and B. (D) Differential immune cell infiltration between gene cluster A and B. * means P < 0.05, *** means P < 0.001. Figure 6. [115]Figure 6 [116]Open in a new tab Role of pyroptosis patterns in distinguishing RA. (A) Differences in pyroptosis score between pyroptosis gene cluster A and B and differences in pyroptosis score between gene cluster A and B. (B) Sankey diagram showing the relationship between pyroptosis patterns, pyroptosis gene patterns, and pyroptosis scores. (C) Differential expression levels of STAT1, CCR5, NLRP1, IL-15, and CXCL10 between pyroptosis gene cluster A and B. (C) Differential expression levels of STAT1, CCR5, NLRP1, IL-15, and CXCL10 between gene cluster A and B. * means P < 0.05, ** means P < 0.01, *** means P < 0.001. 3.7. Identification of two distinct gene patterns To further reveal the relationship between pyroptosis patterns and RA, we investigated the correlation between pyroptosis patterns and STAT1, CCR5, NLRP1, IL-15, and CXCL10. The results showed that the expression levels of STAT1, CCR5, NLRP1, IL-15, and CXCL10 were higher in pyroptosis gene cluster B or gene cluster B than in pyroptosis gene cluster A or gene cluster A, which suggested that pyroptosis gene cluster B or gene cluster B is highly linked to RA characterized by the immune response [117]Figure 6C . 4. Discussion RA is a chronic inflammatory disease characterized by persistent inflammatory synovitis and systemic inflammation. RA has attracted wide world attention in recent years due to its high disability rate ([118]32). Currently, treatment strategies with biologics and disease-modifying anti-rheumatic drugs have led to significant improvement in the prognosis of RA patients, while a large proportion of RA patients still do not experience effective clinical relief. Studies showed that early diagnosis and positive treatment significantly improve the clinical prognosis of RA ([119]33). Thus, there is an urgent need to identify RA-related diagnostic genes, further explore the molecular mechanisms of RA, and provide novel therapeutic strategies for the prevention and treatment of RA. Pyroptosis is a novel form of inflammatory programmed cell death that plays a vital role in the development of RA ([120]34). Pyroptosis further exacerbates RA inflammation by releasing inflammatory cytokines like interleukin (IL)-1β and IL-18 ([121]35). Besides, studies demonstrated that the serum concentrations of IL-1β ([122]36) and IL-18 ([123]37) were significantly higher in RA patients compared to healthy controls. In order to gain new knowledge for the diagnosis and management of RA, we further studied the connection between RA and pyroptosis by locating and screening PRGs in the serum of RA patients. In this work, we used 51 PRGs to detect differential expression PRGs using differential expression analysis. We chose 5 candidate PRGs (CHMP3, TP53, AIM2, NLRP1, and PLCG1) from differential expression PRGs by applying RF, SVM-RFE, and LASSO methods in order to filter the 51 PRGs that were the most pertinent for RA. Then, we constructed a nomogram model based on the 5 PRGs to predict the occurrence of RA. In addition, we distinguished two different pyroptosis regulation patterns based on the 5 PRGs and explored the correlation between infiltrating immune cells and the 5 PRGs. A total of 56 DEGs were screened between the two pyroptosis patterns. We further investigated the GO and KEGG functional enrichment of 56 DEGs. Furthermore, we used the consensus clustering method to validate the pyroptosis patterns based on 56 DEGs. We found that two distinct pyroptosis gene patterns were consistent with the grouping of pyroptosis patterns. During the progression of RA, cytokines have been involved in immune regulation, immune response, and inflammatory response ([124]38). We also explore the relationship between inflammatory cytokines and the patterns of pyroptosis. NOD-like receptor thermal protein domain associated protein 1 (NLRP1) is a member of the NLR family. NLRP1 has been found to be closely associated with the pathogenesis of RA ([125]39). Activated NLRP1 promoted the release of inflammatory cytokines, such as IL-1β and IL-18 ([126]40). Besides, a study showed that inhibition of NLRP1 activation effectively ameliorated joint inflammation and destruction in collagen-induced arthritis mice ([127]41). Furthermore, the polymorphism of the NLRP1 gene was associated with the incidence of RA in the Han Chinese population ([128]42). A member of the interferon-inducible HIN-200 protein family is absent in melanoma 2 (AIM2). AIM2 has emerged as a hub for research into the pyroptosis-specific pathophysiology of RA. AIM2 has been linked to the emergence of inflammatory illnesses and autoimmune arthritis, according to a research ([129]43). AIM2 could format a caspase-1-activating inflammasome, thereby controlling the proteolytic maturation of pro-inflammatory cytokines IL-1β and IL-18 ([130]44). In addition, a meta-analysis revealed that AIM2 levels were highly expressed in peripheral blood mononuclear cells from RA patients ([131]45). Recent study showed that the expression of AIM2 was higher in the RA synovium than in the OA. AIM2 siRNA could inhibit the proliferation of RA fibroblast-like synoviocytes ([132]46). PLCG1, also called phospholipase C, gamma 1, is involved in the receptor tyrosine kinase-(RTK-)-mediated signal transduction pathway ([133]47). A study found that PRGPI might serve as a prognostic biomarker for pancreatic cancer patients ([134]48). Besides, numerous studies have proven the involvement of PLCG1-mediated inflammatory response in the pathogenesis of osteoarthritis and lung cancer ([135]49, [136]50). Charged multivesicular body protein 3 (CHMP3) is a subunit of ESCRT III involved in membrane remodeling ([137]51). High CHMP3 expression in breast cancer patients predicts better survival outcomes ([138]52). Moreover, immunohistochemistry revealed significant high expression of CHMP3 in tumor liver tissue ([139]53). The P53 tumor suppressor gene (TP53), also known as the p53 gene, is a protein encoding a molecular weight of 53 kDa. TP53 was found to regulate important cellular functions, such as apoptosis, cell cycle regulation, DNA repair, and apoptosis ([140]54). Besides, TP53 is an inflammatory suppressor associated with autoimmune diseases. Many studies have indicated that the TP53 mutation is closely related to the pathological changes of RA ([141]55, [142]56). TP53 mutation was identified in synovium of RA patients ([143]57). In the collagen-induced arthritis model, p53 ^-/- mice showed increased severity of arthritis ([144]58). However, there are some limits to the study. Firstly, the lack of experimental verification of bioinformatics analysis results. We need to collect human serum samples to further validate our analysis results and elucidate their value as potential clinical biomarkers. Besides, due to the small number of available RA datasets in the GEO database and the limited sample size of this study, the analysis results may be biased. We will include more samples to further assess the reliability of the predicted signature genes. 5. Conclusion In conclusion, our study first found PLCG1 and CHMP3 may be involved in the pathogenesis of RA. And pyroptosis pattern is involved in the progress of RA by bioinformatics analysis, which provides a novel prospective for the prevention and diagnosis of RA. Data availability statement The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the article/[145] Supplementary Material . Author contributions JZ, JGuo, and HZ contributed to the study design and critical revision of the manuscript. JL carried out the study and drafted the manuscript. JL, YC, XJ, DH, XC, HR, and JGuo analyzed the data. All authors read and approved the final manuscript. Funding Statement This study was supported by grants from the Zhejiang Medical and Health Science and Technology Plan Project (2022504276), the Traditional Chinese Medicine of Zhejiang Province Science and Technology plan project (2023ZL128), and Health Science and Technology Program of Hangzhou (A20210086). Conflict of interest The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest. Publisher’s note All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher. Supplementary material The Supplementary Material for this article can be found online at: [146]https://www.frontiersin.org/articles/10.3389/fendo.2023.1144250/fu ll#supplementary-material Supplementary Figure 1 Consensus clustering of the 5 PRGs in RA. (A-G) Consensus matrices of the 5 PRGs for k = 3–9. [147]Click here for additional data file.^ (1MB, tif) Supplementary Figure 2 Consensus clustering of the 56 DEGs in RA. (A-G) Consensus matrices of the 56 DEGs for k = 3–9. [148]Click here for additional data file.^ (1.3MB, tif) Supplementary Table 1 The list of 52 PRGs. [149]Click here for additional data file.^ (30.8KB, xlsx) Supplementary Table 2 The details of PRG screened by the LASSO, RF, and SVM-RFE algorithms. [150]Click here for additional data file.^ (30.8KB, xlsx) Supplementary Table 3 The details of GO analysis. [151]Click here for additional data file.^ (30.8KB, xlsx) Supplementary Table 4 The details of KEGG analysis. [152]Click here for additional data file.^ (30.8KB, xlsx) References