Abstract Background: Breast cancer (BRCA) is a life-threatening malignancy in women with an unsatisfactory prognosis. The purpose of this study was to explore the prognostic biomarkers and a risk signature based on ferroptosis-related RNA-binding proteins (FR-RBPs). Methods: FR-RBPs were identified using Spearman correlation analysis. Differentially expressed genes (DEGs) were identified by the “limma” R package. The univariate Cox and multivariate Cox analyses were executed to determine the prognostic genes. The risk signature was constructed and verified with the training set, testing set, and validation set. Mutation analysis, immune checkpoint expression analysis in high- and low-risk groups, and correlation between risk signature and chemotherapeutic agents were conducted using the “maftools” package, “ggplot2” package, and the CellMiner database respectively. The Human Protein Atlas (HPA) database was employed to confirm protein expression trends of prognostic genes in BRCA and normal tissues. The expression of prognostic genes in cell lines was verified by Real-time quantitative polymerase chain reaction (RT-qPCR). Kaplan-meier (KM) plotter database analysis was applied to predict the correlation between the expression levels of signature genes and survival statuses. Results: Five prognostic genes (GSPT2, RNASE1, TIPARP, TSEN54, and SAMD4A) to construct an FR-RBPs-related risk signature were identified and the risk signature was validated by the International Cancer Genome Consortium (ICGC) cohort. Univariate and multivariate Cox regression analysis demonstrated the risk score was a robust independent prognostic factor in overall survival prediction. The Tumor Mutational Burden (TMB) analysis implied that the high- and low-risk groups responded differently to immunotherapy. Drug sensitivity analysis suggested that the risk signature may serve as a chemosensitivity predictor. The results of GSEA suggested that five prognostic genes might be related to DNA replication and the immune-related pathways. RT-qPCR results demonstrated that the expression trends of prognostic genes in cell lines were consistent with the results from public databases. KM plotter database analysis suggested that high expression levels of GSPT2, RNASE1, and SAMD4A contributed to poor prognoses. Conclusion: In conclusion, this study identified the FR-RBPs-related prognostic genes and developed an FR-RBPs-related risk signature for the prognosis of BRCA, which will be of great significance in developing new therapeutic targets and prognostic molecular biomarkers for BRCA. Keywords: breast cancer, ferroptosis, RNA-binding proteins, risk signature, bioinformatics analysis Introduction Breast cancer (BRCA) is one of the most common malignancies in women, accounting for a quarter of all female cancer cases ([44]Siegel et al., 2020). One statistic shows that 2.26 million cases and 684,996 fatalities of BRCA were reported globally in 2020 ([45]Sung et al., 2021). According to hormonal status, three different types of BRCA may be identified clinically: luminal-like, human epidermal growth factor receptor 2 (HER2) positive, and triple negative BRCA (TNBC) ([46]Tian et al., 2021). Currently, BRCA treatment strategies mainly include the combination of surgical resection, endocrine therapy, chemotherapy, immunotherapy, and other adjuvant therapies ([47]Waks and Winer, 2019). Despite tremendous improvements in early detection and treatment over the past few decades, the high morbidity and mortality of BRCA continue to constitute a serious danger to human health ([48]Early Breast Cancer Trialists’ Collaborative Group (EBCTCG), 2011). Therefore, accurate prediction of BRCA prognosis is essential to improve prognosis and provide appropriate treatment for patients. In contrast to cell necrosis, apoptosis, and autophagy, ferroptosis is a novel type of regulated cell death that is brought on by the accumulation of iron-dependent lipid peroxides ([49]Song et al., 2021). Ferroptosis-related genes (FRGs) are promising therapeutic targets for BRCA patients ([50]Li et al., 2021), and FRG signatures have been reported to be effective in predicting the prognosis of BRCA patients in earlier research ([51]Liu et al., 2021). Some genes, such as ACSL4 and P53RRA, are known to positively regulate ferroptosis. However, other ferroptosis-related genes, including ATF2, NRF2, and GPX4, may inhibit the ferroptosis process in BRCA ([52]Peng et al., 2021). RNA binding proteins (RBPs) are proteins that interact with RNA through RNA binding domains. RBPs, as important coordinators for maintaining genome integrity, are widely expressed in cells and play a central role in gene regulation. RBPs are involved in regulating various aspects of RNA metabolism and function, including RNA biogenesis, maturation, transport, cell localization, and degradation, which have also been found to play a key role in tumor development ([53]Luo et al., 2021). For example, RBP-related prognostic markers are highly expressed in BRCA ([54]Lan et al., 2021), and TRIM21 facilitates the transformation of breast cancer cells from epithelium to stroma ([55]Jin et al., 2019). In addition, RBPs can be used to predict the prognosis of patients with lung adenocarcinoma ([56]Li et al., 2020a), and FOXK2 promotes colorectal cancer metastasis by up-regulating ZEB1 and EGFR expression ([57]Du et al., 2019). However, no study has been undertaken on the prognostic significance of RBPs associated with ferroptosis in BRCA. Therefore, bioinformatics methods were used to identify ferroptosis-related RBPs, establish an independent prognostic model, and verify the good predictive performance of the model. Further research was conducted on the variations in immune checkpoint expression, chemotherapeutic agent sensitivity, and TMB between high-risk and low-risk BRCA patients. The prognostic signature in this study may improve prognostic prediction and become the choice of treatment for patients with BRCA. Materials and methods Data source RNA-seq data and clinical information of 1091 BRCA samples and 113 normal samples were accessed from the Cancer Genome Atlas (TCGA) database. RNA-seq data of 50 BRCA patients with survival information were collected from the ICGC database ([58]https://dcc.icgc.org/) and utilized for risk signature validation, namely validation set. 117 BRCA samples from [59]GSE88770 dataset were also used to verify the prognostic risk model. 259 ferroptosis-related genes (FRGs) were extracted from the FerrDb database. 1542 RNA-binding proteins (RBPs) were derived from a previous report ([60]Wang et al., 2021). Authentication of differentially expressed genes (DEGs) in BRCA The R language package ‘limma’ was engaged in analyzing the DEGs between 113 normal samples and 1091 BRCA samples from the TCGA database ([61]Ritchie et al., 2015). The screening criteria for difference were |log[2]Fold change (FC)| > 1 and adjusted p-value <0.05. Excavation of FR-RBP genes in BRCA The expression matrices of FRGs and RBPs were extracted sequentially based on the transcriptome data from the TCGA database, followed by the calculation of correlation coefficients and p-values between FRGs and RBPs using the Spearman method. The RBPs were identified as significantly correlated with FRG, i.e., FR RBPs, by filtering with a threshold FDR <0.05 and |correlation coefficient (cor)| > 0.3. The function of differentially expressed FR-RBPs (DE-FR-RBPs) was probed by the R package ‘clusterProfiler’ ([62]Yu et al., 2012). Gene Ontology (GO) enrichment analysis mainly described the biological processes (BP), cellular components (CC), and molecular functions (MF) correlated with genes. The Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis revealed biological pathways associated with genes. Establishment and validation of the risk signature based on DE-FR-RBPs for BRCA To estimate whether the DE-FR-RBPs were associated with survival in BRCA patients, we randomly split the 1069 patients with survival information from the TCGA database into a training set and a testing set in a 6:4 ratio, and the clinicopathologic characteristics of the training set and testing set are shown in [63]Table 1. Univariate Cox and multivariate Cox regression analyses were executed in the training set to screen prognostic genes. The risk score was calculated as Risk score = h0(t)*exp (β[1]X[1] +β[2]X[2]+…+β[n]X[n]), where β refers to the regression coefficient; X represented the gene expression level; h0(t) is the benchmark risk function. The following steps were performed simultaneously in the training set, testing set, validation set, and [64]GSE88770 dataset. We calculated risk scores for each BRCA patient based on prognostic genes and regression coefficients and divided BRCA patients into high-risk and low-risk groups based on the optimal cutoff values obtained from the surv_cutpoint function in the ‘survival’ package. A Kaplan-Meier (K-M) curve, a risk curve, and a prognostic gene expression heatmap were plotted, and receiver operating characteristic (ROC) curves for 1,3,5-year survival were also drawn to assess the accuracy of the risk signature in predicting survival. TABLE 1. The clinicopathologic characteristics of training set and testing set. Features TCGA Training set (n = 642) testing set (n = 427) age (%) ≤30 7 (1.1) 5 (1.2) >30 635 (98.9) 422 (98.8) pathologic_M (%) M0 535 (83.3) 355 (83.1) M1 13 (2.0) 9 (2.1) MX 94 (14.6) 63 (14.8) pathologic_N (%) N0 310 (48.3) 192 (45.0) N1 207 (32.2) 150 (35.1) N2 67 (10.4) 53 (12.4) N3 47 (7.3) 26 (6.1) NX 11 (1.7) 6 (1.4) pathologic_T (%) T1 179 (27.9) 100 (23.4) T2 361 (56.2) 256 (60.0) T3 75 (11.7) 57 (13.3) T4 25 (3.9) 13 (3.0) TX 2 (0.3) 1 (0.2) tumor_stage.diagnoses (%) stage I 113 (17.6) 68 (15.9) stage II 360 (56.1) 246 (57.6) stage III 145 (22.6) 95 (22.2) stage IV 12 (1.9) 8 (1.9) stage x 6 (0.9) 5 (1.2) Unknow 6 (0.9) 5 (1.2) race.demographic (%) american indian or alaska native 1 (0.2) 0 (0.0) asian 37 (5.8) 21 (4.9) black or african american 107 (16.7) 73 (17.1) Unknow 52 (8.1) 33 (7.7) white 445 (69.3) 300 (70.3) [65]Open in a new tab Establishment of a nomogram Chi-square tests were applied to examine the linkage of risk signatures with other clinicopathological characteristics. Independent prognostic profiling was undertaken by adopting univariate Cox and multivariate Cox analysis. Clinical features and risk scores were selected to establish a nomogram associated with outcome for assessing the overall survival (OS) of 1-, 3-, and 5-year for BRCA patients. Furthermore, calibration curve was plotted to evaluate the consistency between predicted probabilities of 1-, 3- and 5-year survival and actual ones. Analysis based on risk signature To investigate the relevance of risk signature to somatic mutations, immune checkpoints, immune cell infiltration, and chemotherapeutic efficacy, we conducted mutation analysis, immune checkpoint expression analysis, and immune cell infiltration analysis in high- and low-risk groups, and discovered the correlation between risk signature and chemotherapeutic agents using the “maftools” package, the “ggplot2” package, CIBERSORT algorithm and the CellMiner database respectively. Analysis of prognostic genes We used the cBioPortal database ([66]http://www.cbioportal.org) to analyze genetic alterations of prognostic genes in risk signatures. The HPA database ([67]https://www.proteinatlas.org/) was employed to identify protein expression trends of prognostic genes in BRCA and normal tissues. To further explore the possible molecular mechanisms of prognostic genes in BRCA, we performed a single-gene Gene Set Enrichment Analysis (GSEA) analysis based on the KEGG gene set, setting SIZE >20, NOM p-value <0.05, and FDR q-val >0.05 as significant pathways. Verification of mRNA expression levels of prognostic genes in cell lines Human epithelial cell lines from the mammary gland, MCF-10A, and three breast cancer cell lines MCF-7, MDA-MB-468, and T47D were obtained from iCell Bioscience Inc. (Shanghai, China). MCF-10A cells were cultured in MEGM Kit medium (Lonza/Clonetics, CC-3150). MDA-MB-468 and T47D cells were cultured in RPMI-1640 medium (iCell-0002), supplemented with 0.02 mg/L of bovine insulin (iCell-0016-a), 10% fetal bovine serum (FBS) (Gibco) and 1% penicillin/streptomycin. MCF-7 cells were cultured in MEM basic medium (iCell-0012), supplemented with 0.01 mg/mL of bovine insulin, 10% FBS and 1% penicillin/streptomycin. The cells were incubated at 37°C in a humidified atmosphere of 5% CO[2]. Total RNA from the cell lines in logarithmic phage was isolated utilizing the TRIzol Reagent following the producer’s instructions (Ambion, USA). Next, total RNA was reverse transcribed into cDNA utilizing the SweScript-First-strand-cDNA-synthesis-kit (Servicebio, China) and qPCR was subsequently carried out using the 2xUniversal Blue SYBR Green qPCR Master Mix, according to the manufacturers’ direction (Servicebio, China). The sequences of the primers were listed in [68]Table 2. The relative expression level was normalized to the endogenous control GAPDH and calculated using the 2^−ΔΔCq method ([69]Livak and Schmittgen, 2001). The Student’s t-test was used to contrast the distinction. The two-tailed p-value <0.05 was delimited as statistically significant. TABLE 2. The sequences of the primers for qPCR. Primer Sequences GSPT2 For CGTCAACGCCAAGCC GSPT2 Rev CCCCCGTCCCATCCT RNASE1 For ACTCAGACAGTTCCCCCA RNASE1 Rev CCTCCACAGAAGCATCAA SAMD4A For AAC​CAA​TGG​CAA​CAG​GAA​T SAMD4A Rev GGT​GGG​GAC​AGA​TGA​GGA​G TIPARP For GGC​AGA​TCA​AAA​GGA​CAA​C TIPARP Rev ATA​AAA​CAG​GAG​CGG​AAG​A TSEN54 For AAG​AAT​AAC​CCT​GGC​AAA​C TSEN54 Rev AAG​TCC​CTG​AAG​CTG​TAG​A GAPDH For CCC​ATC​ACC​ATC​TTC​CAG​G GAPDH Rev CAT​CAC​GCC​ACA​GTT​TCC​C [70]Open in a new tab The Kaplan-Meier (KM) plotter for prognostic value The survival probability of signature mRNA expression was assessed using the KM plot database ([71]www.kmplot.com), which contained survival information and gene expression data for BRCA patients. In order to analyze the OS of BRCA patients, the samples were divided into high- and low-expression groups by median expression and evaluated by a KM survival plot, with the log-rank p-value and hazard ratio (HR) with 95% confidence intervals (CIs). Statistical analysis All bioinformatics analyzes were run in the R programming language and the data from different groups were compared using the Wilcoxon test. p values less than 0.05 were considered statistically significant if not specified above. Results DE-FR-RBPs in BRCA As shown in [72]Figure 1A and [73]Supplementary Table S1, a total of 1615 DEGs, including 600 upregulated genes and 1015 downregulated genes, were exploited in BRCA samples compared with normal samples. The expression heatmaps of the top 50 upregulated genes and top 50 downregulated genes are displayed in [74]Supplementary Figure S1A. Then, 1377 FR-RBPs were identified by Spearman correlation analysis between FRGs and RBPs ([75]Supplementary Tables S2, S3). After overlapping the DEGs and FR-RBPs, 46 DE-FR-RBPs in BRCA, including 27 upregulated genes and 19 downregulated genes, were determined and all DE-FR-RBPs were RBPs ([76]Figure 1B; [77]Supplementary Figure 1B; [78]Supplementary Table S4). To probe the possible functions of these 46 genes in BRCA development, we proceeded with a functional enrichment analysis. A total of 116 GO entries (56 BP, 47 CC, and 13 MF) and 1 KEGG pathway were enriched for upregulated genes ([79]Supplementary Table S5). The downregulated genes were enriched for a total of 87 GO entries (59 BP, 8 CC, and 20 MF) ([80]Supplementary Table S5). The top 5 entries in each category are shown in the bubble diagram ([81]Figures 1C–E). The upregulated genes in KEGG pathways were associated with the “Spliceosome” pathway ([82]Figure 1D). The majority of the upregulated genes in BP were connected to biological processes that included RNA splicing and regulation of nuclease activity ([83]Figure 1C). Most of the downregulated genes were involved in the regulation of mRNA metabolism, translation, mRNA catabolism, mRNA splicing via the spliceosome, and nuclear-transcribed mRNA catabolism ([84]Figure 1D). FIGURE 1. [85]FIGURE 1 [86]Open in a new tab Identification of 46 DE-FR-RBPs and its functional enrichment analysis. (A) Volcano map of DEGs between BRCA and normal samples. Green dots represent downregulated genes while red dots indicate upregulated genes. (B) Venn diagram was conducted to detect the intersection of 46 DE-FR-RBPs by DEGs and FR-RBPs. (C) The top5 GO entries under each category based on the function enrichment results of upregulated DE-FR-RBPs. (D) The KEGG pathway enriched by upregulated DE-FR-RBPs. (E) The top5 GO entries under each category based on the function enrichment results of downregulated DE-FR-RBPs. The larger the circle contains the greater the number of genes and the redder the color the smaller the p-value. The risk signature based on the DE-FR-RBPs in BRCA We randomly assigned 1069 BRCA cases with survival information from the TCGA database into a training set and a testing set. We included 46 DE-FR-RBPs in a univariate Cox analysis in the training set. Based on the p-value <0.2, eight genes (SAMD4A, CELF2, TSEN54, ZFP36L2, TIPARP, GSPT2, YBX3, and RNASE1) were selected for incorporation into the next step of multivariate Cox analysis ([87]Table 3). Five genes associated with overall survival (OS) in BRCA patients were screened (GSPT2, RNASE1, TIPARP, TSEN54, and SAMD4A) by multivariate Cox analysis for risk signature establishment ([88]Figure 2A). Of these, GSPT2, TIPARP, TSEN54, and SAMD4A were protective factors for patient OS (hazard ratio (HR) < 1), and RNASE1 was a risk factor (HR > 1). The risk score was calculated as: risk score = h0(t)×exp ((-0.31839)×expression of GSPT2 + (0.312671)×expression of RNASE1 + (−0.31809)×expression of TIPARP + (−0.58387)×expression of TSEN54 + (−0.64123)×expression of SAMD4A). Based on this formula, we calculated the risk score for each BRCA patient in the training set and classified them into high- and low-risk groups based on optimal cut-off value ([89]Figure 2B). The risk curve manifested that as the risk score increased, patients confronted a higher risk of demise ([90]Figure 2B). The K-M curve indicated that the high-risk patients had noteworthy poorer survival than the low-risk patients ([91]Figure 2C). The Area Under Curve (AUC) values for the ROC curves at 1, 3, and 5 years were 0.821, 0.739, and 0.664, reflecting the decent accuracy of the gene signature ([92]Figure 2D; [93]Supplementary Tables S6–S8). The heatmap revealed that RNASE1 was highly expressed in the high-risk group and GSPT2, TIPARP, TSEN54, and SAMD4A were highly expressed in the low-risk group ([94]Figure 2E). To further confirm the applicability and reliability of the risk signature, the above analysis was carried out in both the testing set and the external validation set. The results of the testing set and external validation set were consistent with the training set ([95]Supplementary Figures S2A–H). The AUC values of the 1, 3, and 5-year ROC curves in the testing set were 0.603, 0.649, and 0.606 respectively ([96]Supplementary Figure S2C), while the AUC values of the 1, 3, and 5-year ROC curves in the validation set were 0.897, 0.978, and 0.846 respectively ([97]Supplementary Figure S2G), suggesting that the FR-RBPs associated risk signature was an effective predictor of survival in BRCA patients. Moreover, the [98]GSE88770 dataset was also used to validate the prognostic model ([99]Supplementary Figures S2I–L). The results suggested that patients in high-risk group had poorer OS than low-risk group ([100]Supplementary Figure S2J), and the AUC values of 3 and 5 years were 0.724 and 0.690 respectively ([101]Supplementary Figure S2K), indicating the risk signature had good predictive ability. TABLE 3. The result of Univariate Cox analysis. id HR HR.95L HR.95H P-value GSPT2 0.707641508 0.54167322 0.924462362 0.011214893 RNASE1 1.250347564 1.009428054 1.548767171 0.040766883 TIPARP 0.733565395 0.541961869 0.992907839 0.044851373 TSEN54 0.689105896 0.476337238 0.996913319 0.048112608 SAMD4A 0.662677227 0.406480148 1.080350687 0.098935143 YBX3 0.847845556 0.682996077 1.052483479 0.134586015 CELF2 0.828334478 0.630036769 1.089044389 0.177340052 ZFP36L2 0.844579069 0.659786412 1.081128364 0.179988295 ALYREF 0.821438156 0.606253705 1.113000447 0.204378613 LARP6 0.768080806 0.504944647 1.168342169 0.217593842 NUDT16L1 0.82550748 0.59655413 1.142331542 0.247259363 EZH2 0.867042096 0.65859947 1.141455515 0.309192942 DDX39A 0.86271146 0.630696363 1.180078255 0.355504011 EXOSC4 1.135808917 0.866411329 1.488971638 0.356588365 BOP1 1.116973789 0.878760146 1.419762208 0.366043977 EXO1 1.099478609 0.867536602 1.393431943 0.432744694 QKI 0.883711263 0.610381308 1.279438911 0.512607185 ACO1 0.87859728 0.595336873 1.296632571 0.514537779 SNRPE 1.136400042 0.773123506 1.670373553 0.515284169 SNRPB 0.889632146 0.622585703 1.271223143 0.520754293 RBMS3 0.898135035 0.624364616 1.29194788 0.562492321 MAZ 1.119591522 0.759603203 1.650184163 0.568172367 LSM4 0.900036413 0.621296936 1.303829936 0.577555478 MRPL12 0.933092038 0.730247914 1.192281053 0.579763207 ZNF106 1.101633388 0.760767574 1.595225878 0.608349773 RBMS2 0.886316661 0.556348736 1.411987074 0.611505345 ZFP36 0.951432459 0.781604851 1.15816032 0.619691033 OAS3 1.046277912 0.861897572 1.270101582 0.647397407 EIF3L 0.941677939 0.696351628 1.273433284 0.696358388 ESRP1 1.062783087 0.762096148 1.482106809 0.719706896 OAS2 1.029753483 0.873416765 1.214073598 0.727098229 PARP1 0.937945972 0.64125593 1.37190567 0.741252895 RNASEH2A 1.055259928 0.748967479 1.486811572 0.758473021 ZCCHC24 0.962498613 0.746990713 1.240180854 0.767576629 NOVA1 1.027431906 0.856821179 1.232014739 0.770212969 SNRNP25 0.950064372 0.663923509 1.359527564 0.779352003 MBNL2 0.967323505 0.723614741 1.293111805 0.822507597 RNASE7 1.063113031 0.603951646 1.871357292 0.832004214 EEF1A2 1.010017226 0.91721685 1.112206778 0.83937158 HNRNPAB 1.030484411 0.652800239 1.626681577 0.897417351 OASL 0.990460197 0.831607532 1.179656705 0.914412957 RPUSD1 0.986777119 0.681941484 1.427877764 0.943710688 MRPS34 0.99432936 0.728127266 1.357854487 0.971465208 MEX3A 0.99898961 0.830181567 1.202122862 0.991459526 MRPL14 1.001139283 0.708039533 1.415570483 0.994859575 MRPS12 1.000436745 0.7027528 1.424218703 0.998066621 [102]Open in a new tab FIGURE 2. [103]FIGURE 2 [104]Open in a new tab Construction of prognostic risk models in BRCA. (A) The forest map of multivariate Cox regression analysis. (B) Distribution of survival status based on the median risk score in training set. (C) Kaplan-Meier survival analysis of high and low risk groups in TCGA-BRCA training set. (D) Time-independent receiver operating characteristic (ROC) analysis of risk scores predicting the overall survival of 1-, 3-, and 5-year in training set. (E) Heatmap indicated the difference of prognostic gene expression between high- and low-risk groups in TCGA-BRCA training set. The FR-RBPs associated risk signature was an independent prognostic factor We initially evaluated the proportion of individuals at high- and low-risk under various clinical features in order to better investigate the relationship between risk signature and clinicopathological factors. As shown in [105]Table 4, risk signature was associated with pathologic T and stage in the training set, while in the testing set, risk signature was correlated with pathologic T ([106]Table 5). Age and stage were unrelated to the risk signature in the validation set ([107]Table 6). Stratified survival analysis showed significant differences in survival between high- and low-risk groups with different T subgroups, stage subgroups, age (age >30), and subtypes (HER2-enriched, LuminalA, LuminalB, and triple negative breast cancer (TNBC)) in the training set ([108]Figures 3A–I). Next, we integrated both risk score and clinical factors into a univariate Cox analysis, with risk score, age, pathologic N, pathologic T, pathological M, and the stage being associated with OS in BRCA patients ([109]Figure 4A). Then included these factors into multivariate Cox analysis, the results indicated that risk score and age were independent prognostic factors ([110]Figure 4B). TABLE 4. The number of patients in high- and low-risk groups with different clinical characteristics in the training set. TCGA training set Total (n = 642) High_risk (n = 322) Low_risk (n = 320) p-value Age (years) ≤30 7 (1.1%) 4 (1.2%) 3 (0.9%) 1 >30 635 (98.9%) 318 (98.8%) 317 (99.1%) pathologic_M M0 535 (83.3%) 269 (83.5%) 266 (83.1%) 0.314 M1 13 (2.0%) 9 (2.8%) 4 (1.3%) MX 94 (14.6%) 44 (13.7%) 50 (15.6%) pathologic_N N0 310 (48.3%) 156 (48.4%) 154 (48.1%) 0.298 N1 207 (32.2%) 94 (29.2%) 113 (35.3%) N2 67 (10.4%) 40 (12.4%) 27 (8.4%) N3 47 (7.3%) 26 (8.1%) 21 (6.6%) NX 11 (1.7%) 6 (1.9%) 5 (1.6%) pathologic_T T1 179 (27.9%) 74 (23.0%) 105 (32.8%) 0.0042 T2 361 (56.2%) 189 (58.7%) 172 (53.8%) T3 75 (11.7%) 40 (12.4%) 35 (10.9%) T4 25 (3.9%) 19 (5.9%) 6 (1.9%) TX 2 (0.3%) 0 (0%) 2 (0.6%) Stage stage I 113 (17.6%) 46 (14.3%) 67 (20.9%) 0.0209 stage II 360 (56.1%) 179 (55.6%) 181 (56.6%) stage III 145 (22.6%) 83 (25.8%) 62 (19.4%) stage IV 12 (1.9%) 9 (2.8%) 3 (0.9%) stage x 6 (0.9%) 1 (0.3%) 5 (1.6%) Unknow 6 (0.9%) 4 (1.2%) 2 (0.6%) race american indian or alaska native 1 (0.2%) 1 (0.3%) 0 (0%) 0.306 asian 37 (5.8%) 20 (6.2%) 17 (5.3%) black or african american 107 (16.7%) 49 (15.2%) 58 (18.1%) Unknow 52 (8.1%) 32 (9.9%) 20 (6.3%) white 445 (69.3%) 220 (68.3%) 225 (70.3%) [111]Open in a new tab TABLE 5. The number of patients in high- and low-risk groups with different clinical characteristics in the testing set. TCGA testing set Total (n = 427) High_risk (n = 99) Low_risk (n = 328) p-value Age (years) ≤30 5 (1.2%) 0 (0%) 5 (1.5%) 0.482 >30 422 (98.8%) 99 (100%) 323 (98.5%) pathologic_M M0 355 (83.1%) 83 (83.8%) 272 (82.9%) 0.235 M1 9 (2.1%) 4 (4.0%) 5 (1.5%) MX 63 (14.8%) 12 (12.1%) 51 (15.5%) pathologic_N N0 192 (45.0%) 36 (36.4%) 156 (47.6%) 0.114 N1 150 (35.1%) 35 (35.4%) 115 (35.1%) N2 53 (12.4%) 19 (19.2%) 34 (10.4%) N3 26 (6.1%) 7 (7.1%) 19 (5.8%) NX 6 (1.4%) 2 (2.0%) 4 (1.2%) pathologic_T T1 100 (23.4%) 15 (15.2%) 85 (25.9%) 0.0197 T2 256 (60.0%) 69 (69.7%) 187 (57.0%) T3 57 (13.3%) 9 (9.1%) 48 (14.6%) T4 13 (3.0%) 6 (6.1%) 7 (2.1%) TX 1 (0.2%) 0 (0%) 1 (0.3%) Stage stage I 68 (15.9%) 10 (10.1%) 58 (17.7%) 0.104 stage II 246 (57.6%) 54 (54.5%) 192 (58.5%) stage III 95 (22.2%) 29 (29.3%) 66 (20.1%) stage IV 8 (1.9%) 4 (4.0%) 4 (1.2%) stage x 5 (1.2%) 1 (1.0%) 4 (1.2%) Unknow 5 (1.2%) 1 (1.0%) 4 (1.2%) race asian 21 (4.9%) 2 (2.0%) 19 (5.8%) 0.259 black or african american 73 (17.1%) 14 (14.1%) 59 (18.0%) Unknow 33 (7.7%) 10 (10.1%) 23 (7.0%) white 300 (70.3%) 73 (73.7%) 227 (69.2%) [112]Open in a new tab TABLE 6. The number of patients in high- and low-risk groups with different clinical characteristics in the validation set. ICGC Total (n = 50) High_risk (n = 10) Low_risk (n = 40) p-value Age (years) ≤30 13 (26.0%) 5 (50.0%) 8 (20.0%) 0.126 >30 37 (74.0%) 5 (50.0%) 32 (80.0%) Stage 0 3 (6.0%) 0 (0%) 3 (7.5%) 0.402 1 13 (26.0%) 4 (40.0%) 9 (22.5%) 2 28 (56.0%) 4 (40.0%) 24 (60.0%) 3 6 (12.0%) 2 (20.0%) 4 (10.0%) [113]Open in a new tab FIGURE 3. [114]FIGURE 3 [115]Open in a new tab Kaplan-Meier curve analysis of clinical pathologic factors for the OS in BRAC patients in the training set. Stratified K-M curves among different groups, including pathologic T stage (A, B), pathologic stage (C, D), age (E), HER2-enriched (F), Luminal A (G), Luminal B (H), and TNBC (I). FIGURE 4. [116]FIGURE 4 [117]Open in a new tab Estableshment of a nomogram. (A) Univariate Cox analysis of the correlations between risk score and clinicopathological factors in TCGA-BRCA training set. (B) Multivariate Cox of the correlations betweent the risk score and clinicopathological factors independent prognostic analysis of in TCGA-BRCA training set. (C) Nomogram for forecasting 1-, 3- and 5-year OS. (D) The calibration for forecasting 1-, 3- and 5-year OS. Nomogram construction is based on risk scores and clinicopathological factors Independent prognostic indicators including risk score and age were involved in the nomogram to forecast the survival probability of BRCA patients. Nomography predicted the 1, 3, and 5 years survival probability of patients with BRCA ([118]Figure 4C). The calibration curve was plotted to assess the nomogram and indicated that the practical survival of BRCA patients was in line with the predicted value ([119]Figure 4D). The role of risk signature in BRCA We used the “maftools” program to display the distribution of the top 20 mutated genes in the high- and low-risk groups in order to further analyze the difference in mutations between the groups, revealing that TP53 was more frequently mutated in the low-risk group ([120]Figure 5A). We further examined the tumor mutational burden (TMB) of high- and low-risk patients and noted that the TMB of the high-risk group was significantly higher than that of the low-risk group ([121]Figure 5B). The TMB was associated with immunotherapy, implying that the high- and low-risk groups responded differently to immunotherapy. Following that, we compared the expression of immune checkpoint molecules in the high- and low-risk groups, displaying that NRP1, CD244, ADORA2A, TNFRSF14, and TNFRSF15 were significantly less expressed in the high-risk group than in the low-risk group and that LAG3 was significantly more expressed in the high-risk group than in the low-risk group ([122]Figure 5C). The CIBERSORT algorithm revealed that the high-risk group with more immune cell infiltrates included T cells CD4 memory activated, Macrophages M2, Dendritic cells activated, and Neutrophils, but low immune cell infiltrates included B cells naive and T cells memory resting ([123]Figure 5D). We then investigated whether the risk signature could predict sensitivity to chemotherapy drugs. Using the CellMiner database, we calculated risk scores for NCI60 cell lines and divided them into high- and low-risk groups delimited by median values. Correlations between risk score and the half maximal inhibitory concentration (IC50) values of FDA-approved drugs were calculated using the Spearman method, which showed that by-products of CUDC-305, Denileukin/Diftitox/Ontak, LDK-378, Nilotinib and Tamoxifen were significantly correlated with the risk score (|cor| > 0.3 and p-value <0.05) ([124]Figure 5E). In addition, IC50 values for by-products of CUDC-305, Denileukin/Diftitox/Ontak, and Tamoxifen were lower in the low-risk group ([125]Figure 5F). These findings suggest that the model may be able to act as a chemosensitivity predictor. FIGURE 5. [126]FIGURE 5 [127]Open in a new tab Correlation analysis of risk model with genomic mutations, immune checkpoints, immune cell infiltration, and chemotherapeutic drug efficacy. (A) The waterfall maps of the somatic mutations of the top 20 mutated genes in the high and low risk groups. (B) Boxplot of TMB score in high and low risk group. (C) Boxplot of the relationships of immune checkpoint expression between high and low risk group. (D) The immune cell infiltration using CIBERSORT. (E) Correlation analysis between drugs and risk score. (F) Boxplot of IC50 values of significantly related drugs in high and low risk group. *p-value <0.05, **p-value <0.01, ***p-value <0.001, ****p-value <0.0001. The role of prognostic FR-RBPs in BRCA We carried out the relevant study by utilizing the cBioPortal database in order to better comprehend the mutations in the five prognostic genes in BRCA. As shown in [128]Supplementary Figure S3, GSPT2, RNASE1, TIPARP, TSEN54, and SAMD4A were mutated in BRCA, with amplification being the predominant mutation type. To further investigate the role of prognostic genes in the BRCA progression, we proceeded with a single-gene GSEA enrichment analysis with detailed information on the results listed in [129]Supplementary Table S9. The top 5 pathways activated in the low expression group and the top 5 pathways activated in the high expression group for each gene are shown in [130]Figures 6A–E. We noted that the low expression groups of GSPT2, SAMD4A, TIPARP, and the high expression group of TSEN54 significantly activated the “oxidative phosphorylation” and ‘DNA replication’ pathways. The high expression group of RNASE1 significantly activated the ‘cytokine receptor interaction’ and ‘natural killer cell mediated cytotoxicity’ pathways. The high expression group of GSPT2, SAMD4A, TIPARP, and the low expression group of TSEN54 significantly activated the “ECM receptor interaction” pathway. The low expression group of TSEN54 significantly activated the “TGF beta signaling pathway”. FIGURE 6. [131]FIGURE 6 [132]Open in a new tab GSEA analysis of prognostic genes. Top5 KEGG pathways significantly enriched in high and low expression groups of GSPT2 (A), RNASE1 (B), SAMD4A (C), TIPARP (D) and TSEN54 (E). The expression of prognostic FR-RBPs in BRCA As exhibited in [133]Figure 7 , TSEN54 was upregulated in BRCA tissues, while GSPT2, RNASE1, TIPARP, and SAMD4A were downregulated in BRCA tissues compared to normal tissues. We then validated the expression of prognostic genes at the mRNA level in human epithelial cell lines from the mammary gland, MCF-10A and three breast cancer cell lines MCF-7, MDA-MB-468, and T47D. Consistent with the trend of results from public databases, TSEN54 was upregulated in breast cancer cell lines and GSPT2, RNASE1, SAMD4A, and TIPARP were downregulated in breast cancer cell lines ([134]Figure 8A-E). To further determine the changes in expression of prognostic genes at the protein level, we obtained corresponding images from the HPA database. We did not detect immunohistochemical results for TIPARP in BRCA. As shown in [135]Figure 9, we found that protein expression levels of GSPT2 and SAMD4A were decreased in BRCA tissues compared to normal tissues, but RNASE1 was largely unexpressed in normal breast tissues and BRCA tissues at protein levels. The increased expression of TSEN54 at protein level in BRCA tissues compared to normal tissues was noteworthy. FIGURE 7. [136]FIGURE 7 [137]Open in a new tab | Validation of the expression of five prognostic genes in the normal and BRCA samples from ICGC cohort. ****p-value < 0.0001. FIGURE 8. [138]FIGURE 8 [139]Open in a new tab Verification of the expression of prognostic genes by RT-qPCR. The expression of GSPT2 (A), RNASE1 (B), SAMD4A (C), TIPARP (D), and TSEN54 (E) among MCF-10A, MCF-7, MDA-MB-468, and T47D. *p-value <0.05, **p-value <0.01, ***p-value <0.001, ****p-value <0.0001. FIGURE 9. [140]FIGURE 9 [141]Open in a new tab Immunohistochemical results of four prognostic genes in normal and BRCA tissues obtained from HPA database. The survival status of signature genes KM plotter database was utilized to analyze the OS of BRCA patients. The log-rank test and KM curve analyses suggested that the high expression level of GSPT2 (HR = 1.41, p = 0.0042), RNASE1 (HR = 1.45, p = 0.0012), and SAMD4A (HR = 1.75, p < 0.001) were significant with the poor OS of the BRCA patients. While the expression levels of TIPARP and TSEN54 were not associated with OS of BRCA patients ([142]Figure 10A-E). FIGURE 10. [143]FIGURE 10 [144]Open in a new tab The prognostic value of five model genes in BRCA Patients using the Kaplan-Meier plotter database. The survival probability of patients in high- and low-expression groups divided by the transcriptional expression levels of GSPT2 (A), RNASE1 (B), SAMD4A (C), TIPARP (D), and (E) TSEN54. The red colored lines indicate high expression levels while the black colored lines indicate low expression levels. Discussion BRCA is the most common factor in cancer deaths in women globally. Even though the prognosis for BRCA patients has significantly improved due to advancements in numerous treatment modalities such as local surgery, radiation, chemotherapy, and endocrine therapy, the risk of recurrence and death still exists for many individuals ([145]Dong et al., 2018; [146]Wang et al., 2020a). In clinical settings, the prognosis of BRCA patients is frequently predicted using the tumor size, tumor grade, and TNM stage ([147]Al-Keilani et al., 2021). However, because of individual variability, these clinical markers frequently fall short of the ideal prediction effect and the overall diagnostic standards. In recent years, many researchers have been exploring new biomarkers that can predict the prognosis of BRCA at the molecular level, and we have made new discoveries in this area. Functional enrichment analysis was used to determine the biological roles of these DE-FR-RBPs. Little is known about the specific mechanism of action of the spliceosome in cancer. In the past, it was found that there is a high frequency of mutations in different components of the spliceosome in chronic granulocytic leukemia ([148]Quesada et al., 2012), and deregulation of RNA splicing is common in many tumor transcriptomes ([149]Dvinge and Bradley, 2015; [150]Kahles et al., 2018), suggesting the development of cancer may be influenced by this biological process ([151]Ebert and Bernard, 2011). Similarly, a recent study in BRCA showed that mutations in key genes in the spliceosome lead to a BRCA-like cell phenotype ([152]Lappin et al., 2022). In our study, upregulated genes were significantly enriched for biological processes associated with spliceosomes. Nucleases are key components of biological processes and are expressed at both gene and protein levels in cancer cells ([153]Doherty and Madhusudan, 2015; [154]He et al., 2016), and the failure of nuclease activity can lead to genomic instability and susceptibility to many cancers, including BRCA ([155]Zheng et al., 2007). The mRNA metabolic pathways include mRNA transport, pre-mRNA splicing, RNA editing, mRNA degradation, and translation activation ([156]Zou et al., 2018). Numerous regulatory proteins are involved in these biological processes ([157]Niu et al., 2020), and some of these have been demonstrated to have intricate roles in cancer; for example, m6A regulatory proteins can induce oncogene expression, cancer cell proliferation, survival, and tumorigenesis and development ([158]Sun et al., 2019; [159]Wang et al., 2020b). Translation regulation is a critical stage in the development and progression of cancer, which mediates the overall expression of protein synthesis as well as the precise translation of certain mRNAs that may promote a variety of oncogenic characteristics, such as cell transformation, tumor cell survival, invasion, metastasis, and angiogenesis ([160]Khan et al., 2016). In conclusion, this is consistent with our findings that various genes, chemicals, and pathways are involved in the development of BRCA. The prognostic model proposed in this study consists of five DE-FR-RBPs (TIPARP, SAMD4A, TSEN54, GSPT2, and RNASE1). Several studies have demonstrated the significant role that these genes play in the development of numerous malignancies, including BRCA. TIPARP acts as a tumor inhibitor by down-regulating pro-tumor transcription factors. Therefore, small molecules that increase the expression or activity of TIPARP may be effective anticancer drugs ([161]Zhang et al., 2020a). In addition, the expression of TIPARP is negatively correlated with the methylation status, and DNA methylation may be an important mechanism for the dysregulation of TIPARP in BRCA. It is worth noting that the expression of TIPARP is significantly reduced in BRCA, and more advanced BRCA tends to express lower levels of TIPARP, compared with adjacent normal tissues. The expression of TIPARP in BRCA tissues has always been low ([162]Lin Cheng et al., 2019). SAMD4A is a novel breast tumor suppressor, significantly inhibiting breast tumor-induced angiogenesis by disrupting the balance of angiogenesis-related genes in tumor cells ([163]Zhou et al., 2021). TSEN54 is involved in the complex process of pre-tRNA splicing site identification and cutting and may be related to the survival rate of BRCA patients ([164]Lou et al., 2021; [165]Yan et al., 2022). Further research is required to fully understand its mechanism of action in BRCA. We originally reported GSPT2 as a protective factor linked with BRCA based on current prognostic models, as there is no research on the role of GSPT2 in BRCA. In comparison to liver cancer patients, the normal control group’s blood had greater levels of GSPT2 expression in other malignancies ([166]Li et al., 2014). Similarly, GSPT2 was discovered to be the most distinct and singular factor in stage II colorectal cancer patients when evaluating the difference in gene expression between stage II and stage III patients ([167]Groene et al., 2006). RNASE1 can regulate the vascular homeostasis of extracellular RNA ([168]Bedenbender et al., 2019). In prostate cancer, the overexpression of RNASE1 is associated with poor survival ([169]Gao et al., 2020). Li et al. pointed out that RNASE1 was highly expressed in BRCA ([170]Li et al., 2020b). However, further research is needed to confirm its specific mechanism of action in BRCA. TMB is defined as the number of mutations per DNA giant base ([171]Greillier et al., 2018). In this study, we found that the TMB, related to immunotherapy, in the high-risk group was significantly higher than that in the low-risk group. We found that NRP1, CD244, ADORA2A, TNFRSF14, and TNFRSF15 were sparsely expressed in the high-risk group by comparing the expression of immune checkpoint molecules in the high-risk and low-risk groups. LAG3 was highly expressed in the high-risk group. NRP1 is a potential molecule that can modify the tumor microenvironment (TME) and innate immune system in a targeted way to produce an efficient anti-tumor immune response, which binds to and uptakes neutrophil elastin on a number of BRCA cell lines ([172]Kerros et al., 2017). One of the immune cell receptors that CD244 regulates is NK cell toxicity ([173]Buller et al., 2020). Expression of CD244 is linked to the generation of inhibitory molecules and the inhibition of antigen-specific CD8+T cell activity, primarily spreading inhibitory signals in immune cells linked with tumors and immunosuppression in tumor microenvironment experiences ([174]Agresta et al., 2018). ADORA2A, an immunological checkpoint molecule, has been found to have a high association with the majority of female cancers, including BRCA ([175]Wan et al., 2022). Adenosine mediates TME immunosuppression of immune cells via ADORA2A, and in addition, drugs targeting ADORA2A have entered phase I clinical trials for immunotherapy of renal cell carcinoma ([176]Fong et al., 2020; [177]Feng et al., 2022). In the tumor microenvironment, TNFRSF14 is crucial for immune system activation and recruitment ([178]Lombardo et al., 2020). Studies of diffuse large B-cell lymphomas have revealed that TNFRSF14 changes are described as early driver mutations and accelerated mutations that may occur in the early stages of the disease, considerably earlier than the diagnosis of malignant lymphoma ([179]Vogelsberg et al., 2020). This further strengthens our observations and suggests that similar mechanisms can be shared in BRCA. LAG3 builds up in many malignancies, and the activity of this protein is associated with immune cell infiltration, proliferation, and secretion ([180]Shi et al., 2022). A favorable prognosis is connected with cutting the surface portion of LAG3, which weakens the lymphocytes’ inhibitory signal ([181]Du et al., 2020). The immunological checkpoint molecule LAG3 was also shown to be substantially expressed in breast tumor tissues as compared to the normal control group, according to BRCA research ([182]Sasidharan Nair et al., 2018), which is consistent with what we discovered. Clinically, susceptibility to chemotherapy drugs differs from person to person; even BRCA patients with the same stage of cancer will react differentially to the same treatment regimen. Breast tumors are often categorized using lymph node metastasis and histological grade in order to apply various treatment regimens. However, this approach lacks accuracy, as 70%–80% of patients survive without these treatments ([183]van’t Veer et al., 2002). Therefore, the finest individualized treatment programs for diverse people before chemotherapy will have an impact on the patient’s prognosis. In comparison to a single gene alone, many gene combinations may predict chemotherapy sensitivity with higher sensitivity and specificity ([184]Zhang et al., 2020b). In this study, our risk model based on TIPARP, SAMD4A, TSEN54, GSPT2, and RNASE1 was compared with the high and low risk groups of breast cancer cell lines in the CellMiner database. IC50 values for by-products of CUDC-305, Denileukin/Diftitox/Ontak, and Tamoxifen were lower in the low-risk group. A lower IC50, in our opinion, implies that these medications function more efficiently. Our study may enable physicians to customize medicine for each patient by assisting them in predicting the therapeutic response of BRCA patients before they undergo treatment. Previous studies have shown that gene mutations are the causes of cancer ([185]Frixa et al., 2015; [186]Liu et al., 2017), with gene amplification being the most common genetic change associated with cancer ([187]Wu et al., 2020). According to our analysis of the cBioPortal database, TIPARP, SAMD4A, TSEN54, GSPT2, and RNASE1 are all mutated in BRCA, and gene amplification is the main mutation type. This study’s GSEA enrichment analysis leads to the conclusion that certain biological processes are crucial to the development of cancer. In many malignant tumors, loss of tumor suppressor factors and inactivation of some proto-oncogenes can induce oxidative phosphorylation ([188]Becherini et al., 2021). In BRCA, breast cancer stem cells upregulate oxidative phosphorylation signals and rely on oxidative phosphorylation phenotypes to meet their own high metabolic requirements. Therefore, biological processes that inhibit oxidative phosphorylation can produce tumor suppressive effects ([189]Raninga et al., 2020). DNA replication is the basis of all tissues, including tumor tissue. A study of BRCA has shown that cancer cells somehow improve or compensate for any potential DNA replication and/or repair defects ([190]Lee et al., 1999). The cytokine-cytokine receptor interaction is a key pathway for regulating cellular inflammation ([191]He et al., 2019). Many studies have supported the role of inflammatory response in BRCA ([192]Guaita-Esteruelas et al., 2017; [193]Pham et al., 2020; [194]Zhang et al., 2021). Natural killer cells are lymphocytes of the natural immune system that have cytotoxic activity and help eliminate pathogen-induced infections and cancer cells ([195]Al Absi et al., 2018). The natural killer cell-mediated cytotoxicity pathway is associated with NK cell activation, which inhibits tumor development and eradicates malignancies ([196]Yoon et al., 2015; [197]Paul and Lal, 2017). Tumor adhesion, abscission, disintegration, motility, and proliferation are all impacted by the ECM receptor interaction pathway ([198]Bao et al., 2019). ECM plays a role in gastric cancer invasion and metastasis as well as the promotion of epithelial-mesenchymal transition in colorectal cancer cells ([199]Rahbari et al., 2016; [200]Yan et al., 2018). In breast tumor tissues, ECM proteins or genes are significantly expressed ([201]Yoon et al., 2015). TGF-β is a key regulator of epithelial-to-mesenchymal transformation, and TGF-β signal also upregulates a series of oncogenic genes to further promote metastasis. Therefore, inhibition of TGF-β signal transduction is conducive to inhibiting BRCA metastasis ([202]Tang et al., 2017). In addition, among the prognostic genes screened in this study, GSPT2, RNASE1, and SAMD4A are downregulated in the Cytokine-Receptor Interaction pathway, and cytokines play a broad range of immunomodulatory roles critical to human biology and disease ([203]Spangler et al., 2015). GSPT2, SAMD4A, and TIPARP are downregulated in ECM-receptor interaction pathways, which play important roles in tumor shedding, adhesion, degradation, motility, and proliferation and may be involved in breast cancer development ([204]Bao et al., 2019). High expression levels of ECM proteins or genes in breast tumor tissues may provide new ideas for cancer therapy. We believe that these genes and pathways may be potential markers of breast cancer, but the mechanisms of tumorigenesis and progression need further experimental validation. In addition, we acknowledge limitations in our work. First, because our data were obtained from public sources, certain clinical details, such as patient treatment information, were omitted to allow for more detailed analyses of the data for comparison purposes. Second, clinical application studies and more experimental mechanism studies are lacking. We will continue to focus on these topics in the next studies as well. In summary, we identified key genes and related pathways through bioinformatics analysis of differential expression of DE-FR-RBPs in BRCA. We established a new prediction model based on the key genes. ROC analysis further proved that this method was reliable in predicting the prognosis of BRCA patients. Furthermore, we found that the FR-RBPs-associated risk signature was an independent and effective predictor of BRCA survival. Subsequently, we verified the consistency of transcription levels in clinical samples and explored trends in the expression of prognostic genes at the protein level in BRCA and normal tissues. These results may make it easier to understand the mechanisms of BRCA and help us explore new biomarkers for BRCA patients. Funding Statement This work is supported by The Key Research and Development Program of Hainan province (ZDYF2021SHFZ055), Hainan Provincial Natural Science Foundation of China (822CXTD535) and National Natural Science Foundation of China (81960475). Data availability statement The original contributions presented in the study are included in the article/[205]Supplementary Material, further inquiries can be directed to the corresponding authors. Ethics statement The patient data in this work were acquired from the publicly available datasets whose informed consent of patients were complete. Author contributions HD, ZD, and SF designed this work. XC, CY, and WW integrated and analyzed the data. XC wrote this manuscript. XH, HS, WL, and KZ edited and revised the manuscript. All authors approved this manuscript. Conflict of interest The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest. Publisher’s note All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher. Supplementary material The Supplementary Material for this article can be found online at: [206]https://www.frontiersin.org/articles/10.3389/fgene.2023.1025163/fu ll#supplementary-material [207]Click here for additional data file.^ (1.3MB, TIFF) [208]Click here for additional data file.^ (41.3KB, XLSX) [209]Click here for additional data file.^ (9.8KB, XLSX) [210]Click here for additional data file.^ (342.3KB, TIF) [211]Click here for additional data file.^ (41.6KB, XLSX) [212]Click here for additional data file.^ (2MB, TIF) [213]Click here for additional data file.^ (45.3KB, XLSX) [214]Click here for additional data file.^ (17MB, XLSX) [215]Click here for additional data file.^ (41.2KB, XLSX) [216]Click here for additional data file.^ (32.5KB, XLSX) [217]Click here for additional data file.^ (185KB, XLSX) [218]Click here for additional data file.^ (36KB, XLSX) References