Abstract Head and Neck Squamous Cell Carcinoma (HNSCC) is a widespread malignancy originating from the mucous epithelium of the oral cavity, pharynx, and larynx. Despite advances in diagnostic and therapeutic modalities, the prognosis of HNSCC remains challenging. This study investigates the intricate relationship among COPS5, immune infiltration patterns, and prognostic implications in HNSCC. Through comprehensive analyses of 519 HNSCC cases from TCGA and single-cell data from the GEO database, we utilize the CIBERSORT algorithm to discern immune cell dynamics influenced by COPS5 expression. Notably, Treg cells emerge as a central point in the interplay between COPS5 and immune modulation. Further analyses, encompassing differential gene expression, immune-related gene set enrichment, and protein-protein interaction networks, elucidate the molecular landscape associated with COPS5 in HNSCC. A prognostic risk model, incorporating CD27, TNFRSF4, FADD, and PSMD14, is formulated and validated across diverse datasets. The model demonstrates robust predictive power, underscoring its potential as a valuable prognostic tool. These genes, essential for immune regulation and cell cycle control, provide insights into the intricate mechanisms influencing HNSCC progression. In conclusion, this study not only reveals the impact of COPS5 on immune dynamics in HNSCC but also introduces a concise and effective prognostic model. 1. Introduction Head and neck squamous cell carcinoma (HNSCC) originates from the mucosal epithelium of the oral cavity, pharynx, and larynx. It is the most prevalent malignant tumor in the head and neck region and ranks as the sixth most common cancer globally. The incidence of HNSCC varies among countries [[31]1]. Despite rapid advancements in diagnostic and treatment methods for HNSCC, the prognosis of patients remains poor. Therefore, identifying new prognostic biomarkers is crucial for improving the survival outcomes of HNSCC patients [[32]2]. COP9 signaling subunit 5 (COPS5), also known as C-Jun activated domain binding protein-1 (Jab1), is a multifunctional protein involved in various biological processes, including cell proliferation, signal transduction, apoptosis, cell cycle regulation, and DNA damage repair [[33][3], [34][4], [35][5]]. Previous studies have established the potential of COPS subunits as biomarkers for HNSCC patients, with COPS5 demonstrating significant relevance to the prognosis, survival, and migration of cancer cells [[36]6]. Additionally, research indicates that COPS5 inhibits the ubiquitination and degradation of PD-L1. Inhibition of COPS5 by curcumin reduces the expression of PD-L1 in cancer cells, rendering them sensitive to CTLA4 therapy [[37]7]. However, limited research exists on the relationship between COPS5 and immune infiltration in HNSCC. The tumor microenvironment (TME) of HNSCC comprises a complex and heterogeneous mixture of tumor cells and immune cells [[38]1]. Numerous studies have highlighted the high immunosuppressive properties of the TME in many HNSCC tumors [[39]8,[40]9]. Anti-tumor immunity in the TME is primarily mediated by Teff cells and NK cells, while immune suppression and tumor cell growth are mainly orchestrated by Treg cells, MDSCs, and M2 macrophages. Infiltrating immune cell levels within the tumor microenvironment correlate with tumor growth, cancer progression, and patient prognosis [[41]10]. Notably, infiltrating Treg cells directly contribute to establishing an immunosuppressive tumor microenvironment and facilitating immune evasion [[42]11]. In recent decades, immunotherapy has rapidly advanced, emerging as a prominent treatment modality for numerous cancers. In 2016, both nivolumab and pembrolizumab, immune checkpoint inhibitors targeting programmed death 1 (PD-1), received approval for treating recurrent or metastatic HNSCC patients following platinum-containing chemotherapy. Subsequently, in 2019, pembrolizumab received approval for first-line treatment [[43]12]. While these findings highlight the significance of immunology in HNSCC, the molecular mechanisms, particularly regarding immune-related genomic effects, remain unclear [[44]13]. Several studies have made significant progress in establishing prognostic models for HNSCC to predict patient survival. These models have been constructed using various biological factors and gene sets, including immune-related long non-coding RNAs [[45]14], immune infiltration-related genes [[46]15], ferroptosis-related genes [[47]16], N6 methyladenosine (m6A) regulator-related genes [[48]17], and ion channel related genes [[49]18]. These models, constructed using Lasso regression, have proven effectiveness in predicting patient survival rates, as well as the response to of immunotherapy. Lasso regression is a method that selectively emphasizes important predictive variables and effectively eliminates less critical predictive variables by reducing their coefficients to zero [[50]19]. Due to its excellent accuracy and versatility across various fields, we chose the Lasso regression algorithm for model construction in our study. This approach aligns with the current trend in the field towards more precise prognostic models, reflecting the multifactorial nature of HNSCC. Given the limited research on the relationship between COPS5 and immune infiltration and the significant impact of Treg cells and COPS5 on treatment and prognosis in HNSCC, this study aims to establish a prognosis model associated with Treg immune genes linked to COPS5. 2. Methods 2.1. Data collection The flowchart of the study, depicted in [51]Fig. 1, delineates the data analysis process. RNA sequencing and somatic mutation data, along with clinical information for 519 cases of HNSCC, were obtained from the TCGA database (https//[52]www.cancer.gov/tcga). Single-cell data (scRNA seq) were sourced from the GEO database ([53]GSE139324). The data comprised live CD45^+ single cells from diverse sources, including peripheral blood mononuclear cells and paired tumor-infiltrating immune cells of HNSCC patients, as well as samples from healthy donors. Furthermore, immune-related gene sets were downloaded from the InnateDB website (https//[54]www.innatedb.ca/). Fig. 1. [55]Fig. 1 [56]Open in a new tab Flow Chart. This figure illustrates the workflow of the study, providing a comprehensive overview of the data collection, processing, and analysis procedures. 2.2. Digital spatial profiling (DSP) data Paraffin-embedded tumor tissues from nasopharyngeal carcinoma patients, collected before treatment for DSP RNA detection [[57]20], were included in the analysis, focusing exclusively on RNA sequencing and corresponding clinical information from the patient's tumor area. 2.3. Screening for differentially expressed genes The count data of 519 HNSCC patients from TCGA were stratified into high and low groups based on COPS5 expression levels. To identify differentially expressed genes between these groups, we employed the “edgeR” R package. 2.4. Immune infiltration analysis We utilized the CIBERSORT algorithm [[58]21], a widely-used method for estimating immune infiltration, to analyze the expression matrix of immune cell subtypes derived from TCGA's TPM data. This analysis involved categorizing the samples based on high and low COPS5 expression levels in tumor samples, aiming to identify differences in immune cell infiltration. 2.5. Screening for differentially expressed genes in Treg cells We analyzed 26 tumor-infiltrating immune cell samples from the [59]GSE139324 dataset utilizing the “seurat” package in R. Initial screening criteria, including gene type, gene number, and mitochondrial genes, were applied. Transcriptome data was standardized using the NormalizeData() function, and data normalization was conducted using the ScaleData() function. Dimensionality reduction was performed with RunPCA(), followed by further reduction using the t-SNE algorithm with dim = 28. The FindClusters() function was employed for clustering with a resolution of 0.5. Cell types were annotated using SingleR, and Treg cells were distinguished from the T cell population using cell markers such as FOXP3. Differentially expressed genes in Treg cells were identified using the FindMarkers() function. 2.6. Construction of protein-protein interaction network (PPI) We constructed a differential gene protein interaction network utilizing the String database ([60]http://string-db.org/) with a minimum required interaction score of 0.15. The Barplot() function in R was used to display the number of neighboring nodes of the core genes. 2.7. Constructing and validating a prognostic model for Treg related genes A prognostic model was constructed by identifying intersecting genes through univariate COX regression, LASSO regression, and multivariate COX regression. The risk score was calculated using a predefined formula. Samples were stratified into high-risk and low-risk groups according to the median risk score, and Kaplan-Meier (K-M) survival curves were drawn. The model's predictive performance was assessed using ROC curves. Univariate and multivariate COX regression analyses were performed to confirm whether the risk score serves as an independent prognostic factor. 2.8. Pathway enrichment analyses KEGG pathway enrichment analysis was conducted on the high-risk and low-risk groups identified by the prognostic risk model. Gene set enrichment analysis (GSEA) was carried out using R software along with the “clusterProfiler,” “org.Hs.eg.db,” and “enrichplot” packages. 2.9. GSVA gene set variation analysis GSVA is a non-parametric unsupervised analysis method. In this study, we utilized an immune feature gene set [[61]22] and assessed the enrichment of immune infiltration and function in TCGA data using the gsva() function. 2.10. Tumor mutation burden (TMB) analysis TMB was defined as the number of somatic, coding, base replacement, and insert-deletion mutations per megabase of the genome, detected using non-synonymous mutations and code-shifting indels, with a detection limit set at 5 %. The “maftools” package [[62]23] in R was utilized to generate a waterfall plot illustrating the TCGA-HNSCC somatic non-synonymous point mutations, revealing the mutation profiles of high-risk and low-risk patients in this model. 2.11. Correlation analysis of immune checkpoint genes The correlation coefficient between high-risk and low-risk groups of the model and the expression of immune checkpoint gene was calculated using “Spearman” correlation analysis. P-values were determined by either “t-test” or “wilcox.test." 2.12. Statistical analysis Data analysis, statistics, and visualization were performed using R (version 4.3.1). Differential analysis utilized the “t-test,” “Wilcoxon test,” or “chi-square test,” while correlation analysis employed the “Spearman” method. The Kaplan-Meier survival curve was assessed using the “Log-rank test” to evaluate statistical differences. A significance level of p < 0.05 was applied, with significance indicated in figures as follows: *, p < 0.05; **, p < 0.01; ***, p < 0.001. 3. Results 3.1. Analysis of COPS5 expression and immune infiltrating cells The research process is shown in [63]Fig. 1. Utilizing the CIBERSORT algorithm, we evaluated the immune cell composition in TCGA data samples. Differences in the proportion of immune cell types were observed between high and low COPS5 expression groups, categorized based on the median of COPS5 expression ([64]Fig. 2A). Notably, Treg cells exhibited the most significant difference with a P value of less than 0.001. A negative correlation between COPS5 expression and Treg cell content was identified ([65]Fig. 2B). KEGG pathway enrichment analysis, based on the median expression of COPS5, revealed distinct pathways enriched in high and low expression groups ([66]Fig. 2C). High COPS5 expression correlated with poorer prognosis, while increased Treg cell content indicated a better prognosis ([67]Fig. 2D), aligning with the observed negative correlation between COPS5 and Treg cells. Fig. 2. [68]Fig. 2 [69]Open in a new tab COPS5 Related Immune Infiltration. (A) The figure illustrates the identification of Treg cells, which show the most significant difference in immune infiltration in TCGA-HNSCC, based on the grouping of COPS5 expression levels. (B) It shows the relationship between COPS5 expression levels and the abundance of Treg cells. (C) KEGG pathway enrichment analysis of the high and low COPS5 expression groups is presented. (D) Survival analysis of COPS5 expression groups and Treg cell abundance groups is visually represented. 3.2. Analysis of single-cell data and identification of key immune genes in Treg cells The single-cell data were categorized into six cell types, with a particular focus on Treg cells ([70]Fig. 3A). T cells account for over half of all samples (58.45 %), followed by Monocyte (13.66 %), B cells (12.36 %), and Treg cells (10.72 %) ([71]Fig. 3B and C). Differential expression analysis identified 1971 genes in Treg cells. The intersection of Treg differentially expressed genes in Treg cells, COPS5 differentially expressed genes, and immune-related genes yielded 12 COPS5-related Treg immune genes ([72]Fig. 3D). A protein interaction network highlighted CD27, IL1R2, and LTB as core genes ([73]Fig. 3E). Single-factor COX regression analysis identified 8 genes affecting HNSCC patient prognosis ([74]Fig. 3F). A prognostic risk model, incorporating TNFRSF4, CD27, FADD, and PSMD14, was established through LASSO regression. Fig. 3. [75]Fig. 3 [76]Open in a new tab Screening of Prognostic Related Genes. (A) UMAP map of CD45^+ positive immune cells. (B) The proportion of different immune cells in each sample. (C) The overall proportion of immune cells. (D) The intersection of the immune-related gene set, Treg cell differential gene set, and COPS5 expression differential gene set, resulting in 12 genes. (E) A protein interaction network of the 12 intersecting genes. (F) Univariate COX regression showing that 8 out of 12 intersecting genes are associated with prognosis. 3.3. Analysis of risk score calculation and its impact on survival outcomes The calculation of the risk score, when applied to both TCGA and DSP data, revealed distinct survival outcomes between high-risk and low-risk groups ([77]Fig. 4A and B). Although the DSP data did not achieve significance, the trend was reflective of TCGA results. The survival curves for individual model genes showed consistent trends between DSP and TCGA datasets. The univariate COX regression analysis identified age, stage, and risk score as factors significantly correlated with prognosis ([78]Fig. 5A, Left). The multivariate COX regression confirmed age, stage, and risk score as independent prognostic factors, with the risk score having the greatest impact ([79]Fig. 5A, Right). The ROC curve indicated that the model, with the highest AUC value for the 3-year overall survival rate, has a superior predictive ability compared to other clinical features ([80]Fig. 5B). Chi-square tests revealed significant differences in grade and N stage between high and low-risk groups ([81]Fig. 5C). Fig. 4. [82]Fig. 4 [83]Open in a new tab Survival Analysis of the Prognostic Model. (A) The survival curve of HNSCC data from TCGA, which is used as the training set. (B) The survival curve of the DSP data, derived from our research group's NPC data, which is used as the validation set. Fig. 5. [84]Fig. 5 [85]Open in a new tab Verification of the Predictive Ability of the Treg Related Immune Gene Model. (A) Univariate (left) and multivariate (right) COX regression analyses of clinical features and risk scores. (B) The figure on the left displays the ROC curve of the risk model in predicting 1-year, 3-year, and 9-year OS; while the figure on the right shows the ROC curve of clinical features (age, gender, Grade, staging) in predicting 3-year OS. (C) Relationship between risk score and clinical traits. (D) Analysis of GSVA immune cell infiltration and function. (E) Analysis of KEGG pathway enrichment in the high-risk group (left) and low-risk group (right). 3.4. Comparative analysis of immune cell types and function scores in high-risk and low-risk groups We conducted a comparison of 16 immune cell types and 13 immune function scores between the high-risk and low-risk groups. The results indicated that there were significant differences across all measures, with the low-risk group scoring significantly higher than the high-risk group ([86]Fig. 5D). The GSEA analysis underscored pathways that were enriched in both high-risk and low-risk groups, thereby providing insights into their biological implications ([87]Fig. 5E). The high-risk group exhibited enrichment in cell cycle, glutathione metabolism, and pathways in cancer. Conversely, the low-risk group showed enrichment in immune-related pathways, such as cell adhesion molecule (CAMS) and cytokine-cytokine receptor interaction. 3.5. Genetic mutation characteristics and variant classifications in high-risk and low-risk populations We generated two waterfall plots to delve into the detailed genetic mutation characteristics between high-risk and low-risk populations. The mutation profile depicted a prevalent mistranslation mutation in HNSCC patients, with TP53, TTN, CSMD3, FAT1, CDKN2A, NOTCH1, MUC16, and SYNE1 emerging as common high-frequency mutation genes ([88]Fig. 6). Among these, TP53 exhibited the highest mutation frequency. The frequency of TP53 mutation notably decreased from 79 % in the high-risk group to 60 % in the low-risk group ([89]Fig. 6A and B). Furthermore, the main variant classification of TP53 was identified as missense mutation. The primary variant classification of HNSCC patients in both high and low-risk groups was identified as missing mutation, with SNP being the main variant type and C > T emerging as the most frequent SNV class ([90]Fig. 6C and D). Fig. 6. [91]Fig. 6 [92]Open in a new tab Tumor Mutation Landscape. Gene mutation map for the high-risk group of HNSCC (A, C)and for the low-risk group of HNSCC (B, D). 3.6. Correlation analysis of COPS5, risk score, and immune checkpoints We confirmed a significant positive correlation between COPS5 and risk score in both TCGA ([93]Fig. 7A, Left) and DSP ([94]Fig. 7A, Right) data. The correlation analysis unveiled a statistically significant relationship between gene expression levels and model risk score across 60 immune checkpoints. CD276, PVR, and TNFSF9 showed positive correlations, while other immune checkpoints displayed negative correlations with the model risk score ([95]Fig. 7B–[96]Supplementary Table 1). FADD and PSMD14 showed a significantly positive correlation with COPS5. There was a negative correlation between CD27 and COPS5. Differences existed in TNFRSF4 expression between high and low COPS5 expression groups, with the low COPS5 group exhibiting higher expression levels. ([97]Fig. 7C). Fig. 7. [98]Fig. 7 [99]Open in a new tab Relationship between COPS5 and Immune Checkpoint Genes/Model Genes. (A) Relationship between risk score and COPS5. (B) The relationship between immune checkpoint genes, COPS5, and risk score. (C) The relationship between COPS5 and model genes. 4. Discussion The high heterogeneity of HNSCC presents significant challenges, as patients are at a considerable risk of recurrence and mortality, even following complete surgical resection. Therefore, the development of reliable prognostic models is of paramount importance. Our study has successfully established a novel risk model that incorporates the COPS5-related genes CD27, TNFRSF4, FADD, and PSMD14, offering a promising tool for predicting patient outcomes. COPS5 has emerged as a critical regulator within the tumor microenvironment, significantly influencing the dynamics of immune infiltration. Its overexpression correlates with disrupted immune surveillance [[100]7], potentially leading to an inflammatory milieu that promotes tumor growth. COPS5 specifically modulates the ubiquitination and degradation of essential immune regulatory proteins [[101]24], affecting the recruitment and activation of a variety of immune cells. This modulation plays a vital role in defining the immune landscape of tumors, with profound implications for clinical outcomes. Moreover, COPS5's influence extends to tumor behavior, particularly through the degradation of the tumor suppressor protein p53 [[102]25]. The stability of p53 is crucial for its ability to suppress tumors, and its activation promotes a pro-inflammatory tumor microenvironment [[103]26], marked by a significant increase in the infiltration of CD8^+ and CD4^+ T cells, as well as B cells—all key players in the adaptive immune response [[104]27]. The prognostic significance of COPS5 is further reinforced by its association with survival rates. Elevated levels of COPS5 expression are linked to a worse prognosis in various cancers, highlighting its potential as a valuable prognostic biomarker. Furthermore, COPS5 expression levels may predict the effectiveness of immunotherapeutic interventions, providing insights that could guide personalized treatment strategies for cancer patients. Our findings, supported by the literature [[105]28], demonstrate a clear correlation between high COPS5 expression and a poorer overall survival rate in HNSCC patients. Experimental evidence also indicates that silencing COPS5 impedes the growth and migration of HNSCC cells, confirming its integral role in tumor progression [[106]6]. CD27, a member of the TNFRSF7, plays a pivotal role in T cell activation, proliferation, and survival [[107]29]. Our study aligns with existing literature, revealing a positive correlation between high CD27 expression in the tumor microenvironment and increased survival rates. TNFRSF4, also known as OX40, is predominantly expressed on activated T cells. Upregulation of TNFRSF4 enhances effector T cell differentiation, proliferation, and survival by inhibiting activation-induced cell death [[108]30]. Our study concurs with previous reports, associating high TNFRSF4 expression with better prognosis in HNSCC patients. Notably, we found a similar correlation with higher survival rates in Treg cells expressing TNFRSF4 [[109]31]. FADD is often amplified in solid tumors, including HNSCC, and is linked to poor prognosis [[110]32]. FADD's critical role in regulating cell proliferation and signaling complexes [[111]32] aligns with our results, indicating that high FADD expression is associated with poorer outcomes in HNSCC patients. PSMD14, a component of the 26S proteasome, is implicated in protein degradation and various cellular processes [[112]33]. Our study supports prior findings, associating abnormal PSMD14 expression with HNSCC occurrence and malignant progression, indicating a poor prognosis. Additionally, the PSMD14 inhibitor Thiolutin shows promising anti-tumor effects on HNSCC [[113]34]. TP53 is the most commonly mutated gene in HNSCC patients and is associated with a shorter recurrence-free survival [[114]35]. Our study found that the TP53 mutation frequency was higher in high-risk patients with poor survival than that in low-risk patients, which is consistent with previous research results. Compared to previous HNSCC prognostic models [[115][36], [116][37], [117][38]], our model demonstrates competitive predictive ability with only four screened prognostic biomarkers, thereby enhancing its clinical applicability. Pathway analysis revealed enrichment in cell cycle [[118]39], glutathione metabolism [[119]40], and cytochrome P450 metabolism of exogenous substances [[120]41] in the high-risk group. These pathways play pivotal roles in cancer development, detoxification, and drug metabolism. Upregulation immune checkpoint genes (ICGs) correlates positively with high lymphocyte infiltration and a favorable prognosis [[121]42], which is consistent with our findings. In the low-risk group, higher expression of ICGs was observed, accompanied by increased immune cell infiltration and enhanced anti-tumor immune activity, contributing to their improved survival prognosis. 5. Conclusion In conclusion, the COPS5-related Treg immune gene risk model, which includes CD27, TNFRSF4, FADD, and PSMD14, demonstrates its valuable in evaluating the prognosis of HNSCC patients. However, due to the retrospective nature of this study, inherent limitations exist, and further research is needed to elucidate the biological mechanisms. Ethics approval and consent to participate TCGA belong to public databases. Our study is based on open source data, so there are no ethical issues and other conflicts of interest. Consent for publication Not applicable. Funding This work was supported by the National Natural Science Foundation of China (81872200, 31900558), the Hubei Provincial Youth Talents Program for Public Health (WSJKRC2022013), Wuhan Young and middle-aged medical backbone talents Training Project (WHQG201904), the Wuhan Yellow Crane Talent Program (HHYC2019002), the Natural Science Foundation of Hubei Province (2020CFB298), the Zhongnan Hospital of Wuhan University Science, Technology and Innovation Seed Fund (ZNPY2018090, ZNPY2019002). Availability of data and material The original contributions presented in the study are included in the article material. Further inquiries can be directed to the corresponding authors. CRediT authorship contribution statement Yuhang Wan: Writing – original draft, Methodology, Data curation. Dujuan Wang: Data curation. Gui Yang: Methodology, Data curation. Guohong Liu: Writing – review & editing, Supervision, Funding acquisition, Data curation, Conceptualization. Yunbao Pan: Writing – review & editing, Supervision, Funding acquisition, Data curation, Conceptualization. Declaration of competing interest The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper. Acknowledgements