Abstract The ubiquitin-proteasome system (UPS) with a capacity of degrading multiple intracellular proteins is an essential regulator in tumor immunosurveillance. Tumor cells that escape from recognition and destruction of immune system have been consistently characterized an important hallmark in the setting of tumor progression. Little know about the exact functions of UPS-related genes (UPSGs) and their relationships with antitumor immunity in head and neck squamous cell carcinoma (HNSCC) patients. In this study, for the first time, we comprehensively identified 114 differentially expressed UPSGs (DEUPSGs) and constructed a prognostic risk model based on the eight DEUPSGs (BRCA1, OSTM1, PCGF2, PSMD2, SOCS1, UCHL1, UHRF1, and USP54) in the TCGA-HNSCC database. This risk model was validated using multiple data sets (all P < 0.05). The high-risk score was found to be an independently prognostic factor in HNSCC patients and was significantly correlated with T cells suppression. Accordingly, our risk model can act as a prognostic signature and provide a novel concept for improving the precise immunotherapy for patients with HNSCC. Keywords: head and neck squamous cell carcinoma (HNSCC), ubiquitin proteasome system (UPS), the cancer genome atlas (TCGA) database, gene expression omnibus (GEO) database, immunosuppression INTRODUCTION Head and neck squamous cell carcinoma (HNSCC) affects more than 800,000 people each year worldwide, leading patients with HNSCC about sixth in global incidence [[44]1, [45]2]. HNSCC is a collection of malignancies with complex local autonomy, given that it originates from the squamous epithelium of the upper aerodigestive tract [[46]3], attributing to this malignancy with high morbidity and mortality. Although advances in therapeutic strategies on local control have been improved, the 5-year overall survival (OS) rate remains 40%-50% [[47]4, [48]5]. In this regard, there is a potential to identify novel biomarkers to broaden current treatment and enhance survival for patients with HNSCC. The ubiquitin-proteasome system (UPS) is responsible for the degradation of more than 80% of intracellular proteins, such as short-lived and misfolded or non-essential proteins [[49]6, [50]7]. UPS is involved in multiple biological functions, including gene transcription, translation and repair, cell cycle, cell proliferation, apoptosis, and antigen presentation [[51]8, [52]9]. As UPS is involved in multitudinous critical functions, dysfunction of UPS may result in various diseases, such as nervous system diseases and cancers [[53]10–[54]14]. The UPS is a complex system encompassing a huge number of genes, including those encoding 10 ubiquitin (Ub)-activating enzymes (E1s), approximately 40 Ub-conjugating enzymes (E2s), over 600 validated Ub-protein ligases (E3s), approximately 100 deubiquitinating enzymes (DUBs), and nearly 50 proteasome subunits [[55]15–[56]18]. Accumulating studies have demonstrated that aberrant expression of UPS members altered proteolysis, facilitating the tumorigenesis and progression of HNSCC [[57]19]. For instance, UBE2C (ubiquitin-conjugating enzyme E2 C), an E3s that overexpressed in various cancers and served as an inhibitor of p53 by facilitating its ubiquitination-mediated degradation. It is an agonist of ZEB1/2, ABCG2, and ERCC1 by promoting their transcript, which played an essential role in increasing cell proliferation and invasion, inducing epithelial-mesenchymal transitions (EMT), and chemotherapeutic resistance [[58]20–[59]23]. In addition, abnormal expression of PSMD2 (proteasome 26S subunit ubiquitin receptor, non-ATPase 2), a subunit of the 19S regulatory particle (RP) of the proteasome, has been reported in lung adenocarcinoma, breast cancer, and hepatocellular carcinoma [[60]24–[61]26]. The clinical significance of UPS genes has not been systematically investigated in patients with HNSCC. Therefore, the mechanism underlying the relationship between UPS-related genes (UPSGs) and the prognosis of HNSCC patients still needs to be further determined. The Gene Expression Omnibus (GEO) and The Cancer Genome Atlas (TCGA) provide potent resources for directly obtaining gene expression profiles from patient tissues. Therefore, this study aims to provide a systematic investigation of the expression patterns via bioinformatics analysis and a prognostic risk model based on the UPSGs. We proposed that UPSGs are associated with the prognosis of patients with HNSCC. Importantly, our finding demonstrated that HNSCC patients with poor prognoses have high-risk scores that are associated with impaired T cell antitumor responses. RESULTS Identification of differentially expressed UPSGs in HNSCC tissues We analyzed the expression of 804 UPSGs ([62]Supplementary Table 1) that were distributed through all chromosomes in 498 head and neck tumor tissues and 44 adjacent tissues. 114 differentially expressed UPSGs (DEUPSGs) were identified, as shown in [63]Supplementary Figure 1, where 97 upregulated and 17 downregulated DEUPSGs were observed (FDR < 0.05 and |Fold change| > 1.5, [64]Figure 1A, [65]1B). The genomic information of DEUPSGs was shown in [66]Supplementary Table 2. Figure 1. [67]Figure 1 [68]Open in a new tab Differential expression of UPS-related genes (UPSGs) and identification of 8 UPSGs with prognostic value in HNSCC samples. (A) 114 differentially expressed UPSGs (DEUPSGs) are depicted as a heat map. (B) 97 upregulated and 17 downregulated DEUPSGs are shown as a volcano plot (FDR < 0.05 and |Fold change| > 1.5). (C) The eight risk DEUPSGs in the prognostic risk model are shown using a forest plot. Functional analysis of DEUPSGs in HNSCC To assess the potential functions of DEUPSGs, Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway were analyzed using the TCGA database. The most significant GO terms were enriched for determining DEUPSGs functions, with respect to biological process (BP), cellular component (CC), and molecular function (MF), were shown in [69]Figure 2A–[70]2C. Enriched terms were significantly correlated with the proteasomal protein catabolic process, proteasome complex, and ubiquitin-like protein transferase activity. The KEGG pathway enrichment analysis indicated that ubiquitin-mediated proteolysis and proteasome pathways may play essential roles in the formation and development of HNSCC ([71]Figure 2D). Figure 2. [72]Figure 2 [73]Open in a new tab Functional enrichment analysis of DEUPSGs in HNSCC. (A–C) The top ten enriched terms in the GO analysis belonged to biological process (A), cell component (B), and molecular function (C) for DEUPSGs are demonstrated using an enriched scatter diagram. (D) The enriched pathways of the KEGG pathway analysis are showed using a scatter diagram. GO, gene ontology; KEGG, Kyoto Encyclopedia of Genes and Genomes. Construction and identification of the prognostic risk model To identify the prognostic value of DEUPSGs in HNSCC patients, Univariate Cox regression analysis was used to confirm the expression patterns of 114 DEUPSGs in the TCGA training set. The forest plots displayed the eight prognosis-related DEUPSGs in HNSCC ([74]Figure 1C). The prognostic risk model was established based on the eight prognosis-related DEUPSGs using LASSO regression analysis ([75]Supplementary Figure 2). The coefficient values of the eight DEUPSGs were shown in [76]Table 1. The risk score was calculated for each sample as the following equation: Table 1. List of the eight prognostic genes of the risk model in the TCGA training set. ENSG ID Symbol Location Expression status Coefficient ENSG00000012048 BRCA1 Chromosome 17 Upregulated -0.0370 ENSG00000081087 OSTM1 Chromosome 10 Upregulated 0.0294 ENSG00000277258 PCGF2 Chromosome 17 Upregulated 0.0502 ENSG00000175166 PSMD2 Chromosome 3 Upregulated 0.0058 ENSG00000185338 SOCS1 Chromosome 16 Upregulated -0.0157 ENSG00000154277 UCHL1 Chromosome 4 Upregulated 0.0022 ENSG00000276043 UHRF1 Chromosome 19 Upregulated -0.0496 ENSG00000166348 USP54 Chromosome 10 Downregulated -0.0887 [77]Open in a new tab Risk score = BRCA1 * (-0.037) + OSTM1 * 0.0294 + PCGF2 * 0.0502 + PSMD2 * 0.0058 + SOCS1 * (-0.0157) + UCHL1 * 0.0022 + UHRF1 * (-0.0496) + USP54 * (-0.0887) Samples were classified into two groups based on the median risk score (0.0622) of the training set. OS analysis indicated that the low-risk group has a markedly better prognosis than the high-risk group (P < 0.0001, [78]Figure 3A). The receiver operating characteristic (ROC) curve indicated that the area under the ROC (AUC) value was 0.664 ([79]Figure 3B). The risk scores and OS were performed by risk plots and scatter plots ([80]Figure 3C, [81]3D). Expression patterns of risk genes in the high- and low-risk groups were depicted in [82]Figure 3E, showing that high expression levels of PSMD2, OSTM1 (Osteoclastogenesis associated transmembrane protein 1), PCGF2 (polycomb group ring finger 2), and UCHL1 (ubiquitin C-terminal hydrolase L1) can be considered as risk factors were correlated with a high-risk score. Furthermore, high expression levels of BRCA1 (BRCA1 DNA repair associated), SOCS1 (suppressor of cytokine signaling 1), UHRF1 (ubiquitin-like with PHD and ring finger domains 1), and USP54 (ubiquitin specific peptidase 54) were associated with a low-risk score. Figure 3. [83]Figure 3 [84]Open in a new tab Identification of the prognostic risk model in HNSCC patients. (A) Kaplan-Meier survival curve with overall survival (OS) in the high- and low-risk HNSCC patients in the TCGA training set. (B) ROC curve showing AUC for the risk score and other clinical factors of HNSCC patients in the TCGA training set. (C) The risk plot distribution of the high- and low-risk HNSCC patients. (D) Scatter plot showing the survival status of HNSCC patients. (E) The expression of risk genes of HNSCC samples in the TCGA training set. An independent validation data set was involved in the evaluation of the risk model. Based on the prognostic risk model, patients with HNSCC were divided into low- and high-risk groups. Survival analysis performed by Kaplan-Meier displayed significant prognostic differences between the two groups (P < 0.001, [85]Supplementary Figure 3A). The relationship between the eight DEUPSGs and risk scores was shown in [86]Supplementary Figure 3B. At the same time, similar results were found in the TCGA training set and TCGA all data set ([87]Supplementary Figure 3C, [88]3D). The GEO ([89]GSE65858) database was used as an external data set for validating the classification performance of the risk model. Samples in the [90]GSE65858 were segregated into low- and high-risk groups. The classification performance of the risk model and the expression pattern of the risk genes were consistent with the training set ([91]Supplementary Figure 4). Clinical independence of the risk model To assess the independence of the risk model in clinical application, univariate and multivariate Cox regression analyses were subjected to clinical parameters from the TCGA training set, TCGA test set, [92]GSE65858 database, and TCGA all data set. Univariate and multivariate Cox regression analyses suggested that the risk score had a significant association with prognosis in the TCGA training set (HR = 4.182, 95% CI = 2.540-6.887, P < 0.001; HR = 4.513, 95% CI = 2.732-7.375, P < 0.001, respectively, [93]Figure 4A, [94]4B), the TCGA test set (HR = 1.993, 95% CI = 1.173-3.385, P < 0.05; HR = 1.954, 95% CI = 1.144-3.338, P < 0.05, respectively, [95]Figure 4C, [96]4D), the TCGA all data set (HR = 2.947, 95% CI = 2.048-4.241, P < 0.01; HR = 2.961, 95% CI = 2.051-4.277, P < 0.01, respectively, ([97]Supplementary Figure 5A, [98]5B), and the GEO data set (HR = 1.754, 95% CI = 0.917-3.352, P < 0.01; HR = 1.598, 95% CI = 0.814-3.135, P < 0.05, respectively, [99]Supplementary Figure 5C, [100]5D). These data demonstrated that the risk model has an effectively prognostic power and exhibits an independent predictive value. A nomogram contained both risk score and clinical features was shown in [101]Figure 4E. Figure 4. [102]Figure 4 [103]Open in a new tab Prognostic predictive value of risk score in HNSCC patients. (A, B) Univariate and multivariate Cox regression analyses of the clinical factors of the patients in the TCGA training set, respectively. (C, D) Univariate and multivariate Cox regression analyses of the clinical factors of the patients in TCGA test set, respectively. (E) Nomogram for OS in HNSCC patients. Gene set enrichment analysis in the high-risk and low-risk groups Gene set enrichment analysis (GSEA) was employed for pathways identification in the TCGA database for detecting twenty pathways from the high- and low-risk groups ([104]Supplementary Table 3). The selected signaling pathways showed significant differences between the two groups (FDR < 0.25, NOM p < 0.05) ([105]Table 2), in which glycan metabolism, extracellular matrix, and proteolysis were significantly enriched in the high-risk group ([106]Figure 5A, [107]5B). In contrast, DNA repair, fatty acid metabolism, other metabolisms, and immune-related pathways were significantly enriched in the low-risk group ([108]Figure 5C–[109]5E). Interestingly, the T cell receptor (TCR) signaling pathway was enriched in the low-risk group ([110]Figure 5F), which may indicate a potential relationship of the high-risk core with impaired immunosurveillance and T cell antitumor response in HNSCC. The rest of GSEA graphs were shown in [111]Supplementary Figure 6. Table 2. Gene sets enriched in high-risk and low-risk groups. MSigDB collection Name NES ES NOM p-val FDR q-val c2.cp.kegg.v7.1.symbols.gmt KEGG_GLYCOSAMINOGLYCAN_BIOSYNTHESIS_CHONDROITIN_SULFATE 2.099 0.813 0.000 0.010 KEGG_ECM_RECEPTOR_INTERACTION 1.903 0.676 0.010 0.061 KEGG_GLYCOSPHINGOLIPID_BIOSYNTHESIS_GANGLIO_SERIES 1.889 0.743 0.002 0.046 KEGG_OTHER_GLYCAN_DEGRADATION 1.854 0.705 0.006 0.038 KEGG_FOCAL_ADHESION 1.813 0.556 0.018 0.045 KEGG_GLYCOSAMINOGLYCAN_BIOSYNTHESIS_KERATAN_SULFATE 1.801 0.683 0.004 0.045 KEGG_GLYCOSAMINOGLYCAN_DEGRADATION 1.758 0.587 0.008 0.052 KEGG_LYSOSOME 1.746 0.495 0.018 0.047 KEGG_GLYCOSAMINOGLYCAN_BIOSYNTHESIS_HEPARAN_SULFATE 1.719 0.563 0.010 0.054 KEGG_PROTEASOME 1.634 0.585 0.047 0.120 KEGG_DNA_REPLICATION -1.971 -0.810 0.000 0.143 KEGG_MISMATCH_REPAIR -1.961 -0.796 0.004 0.081 KEGG_T_CELL_RECEPTOR_SIGNALING_PATHWAY -1.917 -0.559 0.004 0.065 KEGG_PANTOTHENATE_AND_COA_BIOSYNTHESIS -1.841 -0.669 0.002 0.126 KEGG_HOMOLOGOUS_RECOMBINATION -1.803 -0.690 0.016 0.148 KEGG_CYSTEINE_AND_METHIONINE_METABOLISM -1.792 -0.552 0.012 0.137 KEGG_FC_EPSILON_RI_SIGNALING_PATHWAY -1.773 -0.488 0.008 0.141 KEGG_ALPHA_LINOLENIC_ACID_METABOLISM -1.726 -0.610 0.010 0.153 KEGG_LINOLEIC_ACID_METABOLISM -1.712 -0.565 0.018 0.155 KEGG_BASE_EXCISION_REPAIR -1.676 -0.620 0.045 0.152 KEGG_ARACHIDONIC_ACID_METABOLISM -1.667 -0.476 0.014 0.152 KEGG_FATTY_ACID_METABOLISM -1.660 -0.527 0.028 0.151 [112]Open in a new tab Figure 5. [113]Figure 5 [114]Open in a new tab GSEA analysis showing the enriched pathways of the high- and low-risk groups. (A) Multiple GSEA showing glycan-related metabolism pathways in the high-risk group. (B) Multiple GSEA showing extracellular matrix and proteolysis related pathways in the high-risk group. (C) Multiple GSEA showing DNA repair in the low-risk group. (D) Multiple GSEA showing fatty acid metabolism pathways in the low-risk group. (E) Multiple GSEA showing other metabolism- and immune- related pathways in the low-risk group. (F) Single GSEA showing the T cell receptor signaling pathway. Exploration of the relationship between the risk score and antitumor immunity To determine whether the high-risk score was correlated with impaired T cell activity in HNSCC, we employed the ESTIMATE algorithm to estimate immune and stromal scores based on the TCGA database. The immune score was higher in the low-risk group than the high-risk group (P < 0.001, [115]Figure 6A), and the risk score had a negative correlation with the immune score in HNSCC patients (R = -0.19, P < 0.0001, [116]Figure 6B). However, the stromal score was higher in the high-risk group than the low-risk group (P < 0.01, [117]Figure 6C), and the risk score had a positive correlation with the stromal score in HNSCC samples (R = 0.17, P < 0.001, [118]Figure 6D). Figure 6. [119]Figure 6 [120]Open in a new tab Association between risk score and tumor immunity. (A) Distribution of immune scores in high- and low-risk HNSCC patients. (B) Association between the risk score and immune score in HNSCC samples. (C) Distribution of stromal scores in high- and low-risk HNSCC patients. (D) Association between the risk score and stromal score in HNSCC samples. (E) Comparison of immune cell fractions between the high-risk and low-risk HNSCC patients. We subsequently analyzed the fraction of both tumor-infiltrated innate and adaptive immune cells in HNSCC samples by CIBERSORT. The relationships of risk score with those immune cells were shown in [121]Figure 6E. The results showed that the fraction of CD8 T cells (P < 0.001), CD4 memory activated T cells (P < 0.001), and follicular helper T cells (P < 0.01) in samples from the high-risk group were lower than the low-risk group, indicating that high-risk score was associated with immunosuppressive phenotypes that may result from impaired T cell response and activation. Correlation between the genes of the risk model and the four T cell subpopulations According to the potential relationship between the risk model and the four T cell subpopulations ([122]Figure 7A), we further tested the correlation between the eight genes and the four T cell subpopulations ([123]Figure 7B–[124]7I). Consistent with the profile of these genes in the risk model, decreased CD8 T cells were associated with the upregulated OSTM1 (P < 0.01), PCGF2 (P < 0.05), PSMD2 (P < 0.01), UCHL1 (P < 0.05) and the downregulated SOCS1 (P < 0.01). Additionally, decreased CD4 memory-activated T cells were related to the upregulated UCHL1 (P < 0.01) and the downregulated SOCS1 (P < 0.01). Similarly, the asthenia of follicular helper T cells may be caused by high expressed OSTM1 (P < 0.05), PSMD2 (P < 0.001) and low expressed USP54 (P < 0.001). Moreover, the increase of CD4 memory resting T cells was associated with the high expressed OSTM1 (P < 0.01). Thus, the UPS-related genes OSTM1, PCGF2, PSMD2, UCHL1, SOCS1, and USP54 were associated with immunosuppression in HNSCC. Figure 7. [125]Figure 7 [126]Open in a new tab Correlation of the genes of the risk model with the four subpopulations of T cells. (A) Comparison of the four subpopulations of T cells (CD8 T cells, CD4 memory activated T cells, and follicular helper T cells) between the high- and low-risk groups. (B–I) Distribution of the four T cell subpopulations based on the high and low expression of BRCA1, OSTM1, PCGF2, PSMD2, SOCS1, UCHL1, UHRF1, and USP54, respectively. DISCUSSION Tumor cells that escape from immunosurveillance and attack have been the characteristics of immunosuppression in the tumor progression, which is largely associated with the impaired or dysregulated host immune system [[127]27]. By managing protein degradation and abundance, the UPS pathway has been reported as an essential regulator in immunosurveillance of the tumor microenvironment [[128]28]. Accumulating evidence has demonstrated that E3 dysfunction perturbs antitumor immune responses [[129]29–[130]31]. Current knowledge on the functions of the UPS pathway in the development of HNSCC remains obscure. In this regard, an improved understanding of the UPSGs functions in the text of HNSCC can provide a therapeutic value. This study analytically conferred prognostic evidence by identification of 114 DEUPSGs and establishment of the eight prognosis-related DEUPSGs. A prognosis risk model was established based on the eight DEUPSGs. Previous studies have reported the abnormal expression of UPSGs in renal cancer and breast cancer [[131]32, [132]33]. Analyses performed at both mRNA and protein levels can improve the reliability of the results, whereas the limited sample size of these analyses may lead to a bias in the results, because only abnormal gene expression is analyzed, and verification of clinical significance or potential functional analysis is lacking. In this risk model, high expression levels of PSMD2, OSTM1, PCGF2, and UCHL1 were risk factors, while high expression levels of BRCA1, SOCS1, UHRF1, and USP54 acted as protective factors. Overexpression of PSMD2 has been previously shown to be involved in the development of lung adenocarcinoma, breast cancer, and hepatocellular carcinoma [[133]24–[134]26]. UCHL1 has been reported a potential oncoprotein in colorectal cancer, breast cancer, and uterine serous cancer. It promotes proliferation and metastasis of cancer cells and leads to a radioresistant phenotype via regulation of β-catenin/TCF and HIF-1α pathway [[135]34–[136]38]. In line with those previous studies, our results also suggested that PSMD2 and UCHL1 may have therapeutic potential as targets against cancer. Accumulating evidence has shown that BRCA1 mutation is a crucial risk factor in multiple cancers, including breast cancer, skin cancer, ovarian cancer, and colorectal cancer [[137]39–[138]43]. SOCS1 acts as an antioncogene in various tumors, arresting the cell cycle, inhibiting cancer cell migration and invasion, and attenuating tumor growth [[139]44–[140]47]. Nevertheless, non-coding RNA (ncRNA) regulating SOCS1 promotes the occurrence and development of cancers [[141]48–[142]50]. Similarly, UHRF1 also acts as an antioncogene in multiple cancers and inhibits tumor development and progression [[143]51–[144]54]. Therefore, the abovementioned studies suggest that BRCA1, SOCS1, and UHRF1 serve as antioncogenes, consistent with our results. Interestingly, our GSEA results showed that the TCR signaling pathway was highly enriched in the low-risk group, which suggested the impaired T cell activation presented in the high-risk group. TCR engagement initiates a central signaling pathway that is crucial for T cell proliferation, survival, and differentiation into killer cells for adaptive immunity. Abnormalities of TCR signaling could result in immunodeficiency [[145]55–[146]58]. Consequently, according to the relationship between the risk score and the distribution of T cell subpopulations, we found that the high-risk score has a decreased immune score, which may suggest that the high-risk score is critical for evaluating whether tumor-infiltrated T cells have an immunosuppressive status and dysregulated antitumor response. Moreover, the high expression of OSTM1, PCGF2, PSMD2 combined with the low expression of UCHL1, SOCS1, and USP54 were engaged in the suppression of T cell proliferation and activation in the HNSCC microenvironment. It has been previously shown that differential expression of UCHL1 was majorly involved in the proliferation and differentiation of T cells, including CD8 memory CTLp, circulating TCR-gamma/delta+ lymphocytes, and mature T cells [[147]59]. SOCS1 has been identified as an inhibitor against cytokine release and a growth factor receptor, it plays a crucial role in regulating T cell homeostasis, development, and homeostasis activation [[148]60–[149]64]. However, the function of OSTM1, PCGF2, PSMD2, and USP54 have not been investigated in the antitumoral immune response. The present studies suggest that the high expression of OSTM1, PCGF2, and PSMD2 and low expression of UCHL1 were associated with immunosuppressive phenotypes. Generally, we identified an eight -UPSGs based risk model according to the TCGA-HNSCC database and analyzed its biological functions, demonstrating that the risk score obtained from this model was significantly correlated with immunosuppressive status. However, some limitations in the context of the study should be acknowledged. First, this study is performed by bioinformatics analysis alone, the predictive results could be insufficient. Second, this is a retrospective study rather than a prospective one, further validations with large clinical cohort and actual experiments are needed. CONCLUSIONS In summary, we present a risk model constructed by DEUPSGs that can be considered as potential prognostic biomarkers and associated with an immunosuppressive status and impaired antitumor response of T cells in the HNSCC microenvironment. This systematic analysis on the interaction of UPSGs based risk model and immune profile provides a novel understanding of the precise immunotherapy for HNSCC patients. MATERIALS AND METHODS Data downloading and processing The most recent transcriptional data and clinical features on 500 HNSCC samples and 44 adjacent samples were obtained from the TCGA database ([150]https://portal.gdc.cancer.gov/) and the cBio Cancer Genomics portal ([151]https://www.cbioportal.org/) ([152]Table 3). We chose 498 HNSCC samples with follow-up data and randomly divided them into two groups: a training set (n=298) and a test set (n=298), as shown in [153]Supplementary Tables 4, [154]5. In addition, we obtained the [155]GSE65858 data setwith 270 HNSCC samples as an external set from the Gene Expression Omnibus (GEO) database ([156]https://www.ncbi.nlm.nih.gov/geo/). Table 3. Clinical characteristics of HNSCC patients in the TCGA and GEO databases. Clinical characteristics TCGA GEO ([157]GSE65858) n=500 % n=270 % Age < 60 220 44.0 153 56.7 ≥ 60 280 56.0 117 43.3 Gender Female 133 26.6 47 17.4 Male 367 73.4 223 82.6 Histologic grade G1 61 12.2 G2 299 59.8 G3 119 23.8 G4 2 0.4 Gx 16 3.2 NA 3 0.6 Stage I 19 3.8 18 6.7 II 95 19.0 37 13.7 III 102 20.4 37 13.7 IV 270 54.0 178 65.9 NA 14 2.8 T classification T1 33 6.6 35 13 T2 143 28.6 80 29.6 T3 130 26.0 58 21.5 T4 179 35.8 97 35.9 Tx 11 2.2 NA 4 0.8 N classification N0 239 47.8 94 34.8 N+ 239 47.8 176 65.2 Nx 18 3.6 NA 4 0.8 M classification M0 470 94.0 263 97.4 M1 5 1.0 7 2.6 Mx 20 4.0 NA 5 1.0 Vital status Deceased 218 43.6 94 34.8 Living 282 56.4 176 65.2 [158]Open in a new tab Identification of differentially expressed genes The differentially expressed UPS-related genes (DEUPSGs) between head and neck tumor tissues and adjacent tissues were screened out through the Wilcoxon signed-rank test using the limma R package, according to the following cut-off values: FDR < 0.05 and |Fold change| > 1.5. Subsequently, a heat map was generated using the “pheatmap” package, and volcano dot plots were created to present the DEUPSGs. We used the OmicCircos R package to show the distribution of the DEUPSGs [[159]65]. Functional enrichment analyses The enrichplot R package was built for the GO and KEGG pathway enrichment analyses to identify the functions of the abovementioned DEUPSGs [[160]66], containing BP, CC, and MF terms and pathways. Statistical significance was defined as both P-value and FDR < 0.05. Prognostic risk model establishment We used univariate Cox regression analysis to screen out DEUPSGs significantly associated with OS. P < 0.05 was chosen as the threshold. We further used Lasso regression analysis to establish a multi-gene prognostic model. The risk score of each patient was calculated, employing the regression coefficient of each selected gene based on the following equation: [MATH: Riskscore=i=1< /mn>nCoe f(genei)Expi, :MATH] where n is the number of prognostic genes, gene[i] is the [i]th prognostic gene, Coef is the regression coefficient of genes, and Exp[i]is the expression value of the prognostic genes. Then, the HNSCC patients were divided into high-risk and low-risk groups according to the median risk score. Moreover, we used rms package to generate a nomogram. Gene set enrichment analysis Gene set enrichment analysis (GSEA, [161]https://www.gsea-msigdb.org/) is a powerful computational tool used to determine statistical differences of specific functional gene sets between two biological states [[162]67]. Here, we performed the GSEA analysis using GSEA v4.0.3 to obtain the differentially expressed genes from the high- and low-risk groups. After 1000 repeats for each analysis, gene sets with p-value < 0.05 and FDR < 0.25 were identified as enriched sets. Identification of immune scores and tumor-infiltrating immune cells To identify infiltrated immune cell numbers and stromal cell numbers, the R software package “ESTIMATE algorithm” was used for calculating immune and stromal scores of HNSCC samples in the TCGA database [[163]68]. Additionally, CIBERSORT was applied to acquire the composition of the infiltrated immune cells in HNSCC samples [[164]69]. Statistical analysis R package (v4.0.1) was subjected to all analyses. Kaplan-Meier curve was drawn with the log-rank test. Univariate and multivariate Cox regression was used to determine the independence of gene markers. Spearman’s rank correlation test was utilized to identify the variables. Differences in the distributions of the variables were analyzed by the Chi-square test or Fisher’s exact test. P < 0.05 was considered statistically significant. Data availability statement Expression profiles of data sets in this study were downloaded from The Cancer Genome Atlas (TCGA) ([165]https://portal.gdc.cancer.gov/) and Gene Expression Omnibus (GEO) database ([166]https://www.ncbi.nlm.nih.gov/geo/). Supplementary Material Supplementary Figures [167]aging-13-203411-s001.pdf^ (2.2MB, pdf) Supplementary Table 1 [168]aging-13-203411-s002.xlsx^ (41.4KB, xlsx) Supplementary Table 2 [169]aging-13-203411-s003.xlsx^ (19.8KB, xlsx) Supplementary Table 3 [170]aging-13-203411-s004.xlsx^ (12.7KB, xlsx) Supplementary Table 4 [171]aging-13-203411-s005.xlsx^ (22.9KB, xlsx) Supplementary Table 5 [172]aging-13-203411-s006.xlsx^ (22.9KB, xlsx) Footnotes AUTHOR CONTRIBUTIONS: YT and XZ designed this project and approved the manuscript. JW and ZL contributed to the data analysis and carried out the manuscript. JL, LZ, YQ, FZ, RH, HC and YT co-contributed to revising and polishing the manuscript and figures. All authors have read and approved the final submitted manuscript. CONFLICTS OF INTEREST: The authors declare that they have no conflicts of interest. FUNDING: This work was supported by grants from the Hunan provincial Health and Family Planning Commission (20201947, B20180186) and the Hunan Provincial Natural Science Foundation of China (2019JJ50547). REFERENCES