Abstract

   Clear cell renal cell carcinoma (ccRCC) stands as the pivotal
   pathological subtype of renal cell carcinoma. However, there exists a
   dearth of pertinent biological targets crucial for advancing the
   clinical application of ccRCC. In our investigation, we employed the
   weighted gene co-expression network analysis (WGCNA) to discern 13
   distinct gene co-expression modules, with the yellow module exhibiting
   a pronounced association with tumorigenesis. Concurrently, we
   scrutinized 6147 differentially expressed genes through rigorous
   differential expression analysis. Through an intersecting approach with
   the genes within the yellow module, we pinpointed 265 cancer-related
   genes displaying notable differential expression. Subsequent
   Cox-regression analysis unveiled that among the 265 genes, four were
   notably linked to ccRCC prognosis. Furthermore, we executed
   single-sample gene set enrichment analysis (ssGSEA) on the signature
   comprising these four genes, subsequently deriving normalized
   enrichment scores (NESs). This investigation substantiated that the
   said signature bears significant implications for prognosis, holding
   the potential to forecast the efficacy of immunotherapy.

Supplementary Information

   The online version contains supplementary material available at
   10.1007/s12672-025-02991-5.

   Keywords: Kidney renal clear cell carcinoma, Weighted gene
   co-expression network analysis, Signature, Immunotherapy, Sensitivity

Introduction

   Renal cell carcinoma (RCC) stands as the most prevalent renal tumor in
   adults and ranks among the top 10 causes of cancer-related mortality
   worldwide [[36]1]. It encompasses various pathological subtypes,
   including clear cell renal cell carcinoma (ccRCC), papillary renal cell
   carcinoma, renal collecting duct carcinoma, chromophobe renal cell
   carcinoma, and unclassified renal cell carcinoma [[37]2]. Among these,
   ccRCC prevails as the most frequently encountered subtype, originating
   from the proximal renal tubules and typically exhibiting an aggressive
   phenotype [[38]3]. Surgical treatment can potentially cure RCC when all
   tumor lesions are completely resectable; however, in cases of
   metastasis, surgery alone is generally not curative. Nearly one-third
   of ccRCC patients present with metastases at the time of diagnosis
   [[39]4]. Given its limited responsiveness to chemotherapy and
   radiotherapy, the approach to treating ccRCC has undergone substantial
   transformations [[40]5]. Over recent decades, therapeutic strategies
   for ccRCC have transitioned from non-specific immune pathways, such as
   cytokines, to targeted therapies employing vascular endothelial growth
   factor (VEGF) inhibitors, and more recently, to innovative
   immunotherapeutic agents [[41]6]. Despite the heightened sensitivity of
   ccRCC patients to immunotherapy, a subset remains resistant to its
   effects. Therefore, we urgently need to find potential biological
   targets to identify patients who can poise to reap the benefits of
   immunotherapy.

   Currently, gene expression profiles of ccRCC (such as microarray data
   and RNA-seq data) have been widely used to identify biomarkers related
   to the occurrence of renal cell carcinoma [[42]7, [43]8]. The gene
   expression profile databases include TCGA, GEO, and cBioportal
   databases. However, many prior investigations primarily concentrated on
   sifting through differentially expressed genes, often overlooking the
   substantial interplay between genes [[44]9]. Within this context, the
   weighted gene co-expression network analysis (WGCNA) illuminates
   intricate correlation patterns among genes and their clinical
   phenotypes in expression profiles. The WGCNA algorithm has proven
   invaluable in uncovering underlying biological processes, therapeutic
   targets for various cancers, and specific biomarkers associated with
   complex maladies like hyperlipidemia [[45]10], Alzheimer’s disease
   [[46]11], and osteoporosis [[47]12]. Similarly, the WGCNA is used to
   identify key genes that are significantly associated with cancer
   phenotypes, including tumor stage, grade, and metastasis of different
   tumor types [[48]13, [49]14].

   In our study, our primary objective was to uncover promising prognostic
   biomarkers linked to immune response, with the aim of identifying
   immunotherapy-responsive patients. We employed WGCNA to delineate
   modules associated with ccRCC. Subsequently, we conducted functional
   enrichment and pathway analyses to shed light on the potential
   biological roles of these genes. To ascertain potential prognostic
   markers for ccRCC, we constructed both univariate and multivariate Cox
   proportional hazard regression models.

Materials and methods

Patients

   We obtained tumor and adjacent tissue samples from patients diagnosed
   with ccRCC at the First Affiliated Hospital of Harbin Medical
   University. The diagnosis was confirmed by both pathologists and
   clinicians. Following surgical resection of the tumor, paired samples
   of tumor and adjacent tissues were collected from each patient.
   Ultimately, tissue specimens and clinical data were obtained from five
   patients. This study was conducted with the approval of the Clinical
   Research Ethics Committee at the First Affiliated Hospital of Harbin
   Medical University.

Data downloading and processing

   All data in this study were collected from publicly available online
   databases, and the gene expression data of ccRCC were downloaded from
   the GEO database. The inclusion criteria were as follows: (1). There
   are at least 20 tumor and adjacent samples in the dataset; (2). The
   data included tumor and adjacent samples; (3). The data has not been
   processed. The final datasets are [50]GSE167573 and [51]GSE126964.
   Furthermore, both gene expression and clinical data for ccRCC were
   retrieved from the TCGA database. This yielded a total of 613 RNA-seq
   data, 537 clinical information data, and 19,895 mRNA for analysis.

   The data sourced from GEO databases underwent correction and
   normalization using the R package “limma” (version 3.50.1), followed by
   transformation via log2(data + 1). Data retrieved from the TCGA
   database were annotated with symbols utilizing the GENCODE database. In
   cases where multiple genes shared identical symbols, only the gene with
   the highest expression level was retained for further analysis.

Analysis of gene differential expression

   We employed the R package “DESeq2” (version 1.34.0) to conduct gene
   differential expression analysis on count data sourced from the TCGA
   database. To ensure robust statistical significance, all p-values
   underwent correction using the false discovery rate (FDR) for multiple
   tests. In this study, we considered genes as significantly
   differentially expressed when meeting the criteria of |log2FC| > 1 and
   FDR < 0.05.

Weighted gene co-expression network analysis

   The WGCNA is a systems biology approach that leverages the associations
   between sets of genes and various traits. It identifies highly
   interconnected gene modules, pinpointing potential biomarker candidates
   and therapeutic targets. In our study, we utilized the gene expression
   profile of ccRCC from the TCGA database to construct a WGCNA network.
   This involved meticulous filtering of samples and genes, followed by
   calculating Pearson’s correlation for all pairs of genes within the
   selected samples. The R package “WGCNA” (version 1.71) facilitated the
   construction of the adjacency matrix. To ensure a scale-free network,
   we selected β = 10 (R^2 = 0.90) as the soft threshold. Subsequently, to
   identify functional modules within the network, we employed the
   adjacency matrix to compute the topological overlap measure (TOM).
   Employing the TOM values, we employed dynamic tree building to
   establish gene modules and isolate module eigengenes (MEs). These MEs
   serve as representative profiles for the genes within the modules.

Enrichment analysis and single sample gene set enrichment analysis

   We conducted GO functional enrichment analysis and KEGG [[52]15,
   [53]16] pathway enrichment analysis using the R package
   “clusterProfiler” (version 4.2.2) to elucidate the biological functions
   associated with the genes. Additionally, we employed single-sample gene
   set enrichment analysis (ssGSEA) utilizing the “gsva” function from the
   R package “GSVA” (version 1.44.3) to evaluate the enrichment levels of
   gene sets in individual patients.

Deconvolution to calculate the proportion of immune cells

   We employed the CIBERSORT algorithm to quantify the composition of 22
   immune cell types within the samples [[54]17]. CIBERSORT utilizes gene
   expression profiles to estimate the relative proportions of these
   immune cells. To execute CIBERSORT, we downloaded the “LM22.txt” file
   from the website ([55]https://cibersort.stanford.edu/download.php).
   This “LM22.txt” file serves as a “signature matrix,” encompassing 547
   genes associated with 22 distinct immune cell types.

Cell lines and culture conditions

   Human RCC cell lines, including A-498, ACHN, HEK TE, SK-RC-20, UO-31,
   and KMRC-3, along with a normal renal cell line (HEK-293), were
   procured from the ATCC (Manassas, VA, USA). The RCC cell lines were
   cultured in Ham’s F-12 K medium (Jiangsu Kaiji Biotechnology Co., Ltd)
   or RPMI-1640 medium (Gibco Laboratories, Grand Island, NY), while
   HEK-293 was maintained in DMEM medium (Gibco Laboratories, Grand
   Island, NY).

Total RNA extraction and quantitative Real-Time PCR

   Total RNA was extracted from five pairs of ccRCC and adjacent tissues
   using RNA-easy Isolation Reagent (No. RC105-12, Vazyme, China).
   Subsequently, the HiScript III First Strand cDNA Synthesis Kit was
   utilized for quantitative reverse transcription polymerase chain
   reaction (qRT-PCR), following the manufacturer’s instructions. ChamQTM
   Universal SYBR^® qPCR Master Mix (No. Q635-04, Vazyme, China) was
   employed for qPCR amplification. The primer sequences are detailed in
   the Supplementary Table.

Western blotting assay and cell counting kit-8 (CCK-8) assay

   For gene expression analysis, Western blotting was employed. Blots were
   incubated with specific antibodies overnight at 4 °C. The following
   antibodies were used: C1orf54 (Cell Signaling Technology #4574S), P2RX7
   (Cell Signaling Technology #5786S), and SCN1B (Cell Signaling
   Technology #4903S). Cell proliferation was assessed using the CCK-8
   assay (Beyotime, Shanghai, China) in accordance with the provided
   protocol.

Statistical analysis

   We determined the optimal cut-off value and stratified patients using
   the “surv_cutpoint” function from the R package “survminer” (version
   0.4.9). Subsequently, we generated Kaplan-Meier survival curves for
   distinct patient groups utilizing both the “survminer” and “survival” R
   packages (version 3.3-1). To assess the significance of differences, we
   conducted log-rank tests. For ROC curve analysis, we employed the R
   package “pROC” (version 1.18.0). All visualizations were created using
   the R package “ggplot2” (version 3.3.5).

   Throughout this study, all statistical analyses were performed using
   the R programming language (version 4.2.0). Unless otherwise specified,
   all statistical tests were two-sided, and significance was established
   at p < 0.05.

Results

Construction of the weighted gene co-expression network

   The schematic diagram of this study is shown in Fig. [56]1. To detect
   the functional modules of renal clear cell carcinoma, we performed a
   weighted gene co-expression network analysis using the gene expression
   profile of ccRCC in TCGA. The data were filtered through median
   absolute deviation (MAD), and 4,974 genes whose expression level was in
   the top 25% of MAD were selected to construct the WGCNA network.
   Firstly, an appropriate soft threshold was selected according to
   R^2 = 0.9, and the final soft threshold was chosen as β = 10 to
   establish the relationship matrix (Fig. [57]2a). And the relation
   matrix is transformed into the adjacency matrix, and the power
   exponential weighting is introduced to construct the scale-free
   network. Then, based on the adjacency matrix, the TOM matrix is
   established to calculate the dissimilarity TOM between genes (disTOM),
   and the gene feature module is established. Finally, a total of 13 gene
   signature modules were obtained (Fig. [58]2b-d).

Fig. 1.

   [59]Fig. 1
   [60]Open in a new tab

   Schematic diagram of this study

Fig. 2.

   [61]Fig. 2
   [62]Open in a new tab

   Screening of tumor related genes. (A) The scale-free fit index and mean
   connectivity for various soft-thresholding powers; (B) Hierarchical
   clustering analysis is used to detect co-expression clusters with
   corresponding color assignments, and each color represents a module in
   the gene co-expression network constructed by WGCNA; (C) The heatmap of
   some genes in the network is used to describe the TOM between genes in
   the analysis. On a linear scale, the depth of red is positively
   correlated with the correlation strength between modules; (D) The
   connectivity of characteristic genes: red indicates positive
   correlation and blue indicates negative correlation; (E) Correlation
   between genes in different modules and tumor samples or normal samples;
   (F) Analysis of gene significance and module membership in yellow
   module. Scatter plot showing the correlation between gene significance
   for the tumor trait and module membership within the yellow module.
   Each dot represents a gene. The x-axis denotes the Pearson correlation
   between a gene and the yellow module eigengene (Module Membership),
   while the y-axis shows the correlation between gene expression and
   tumor status (Gene Significance). All values are Pearson correlation
   coefficients, not percentages. The observed correlation (r = 0.62,
   p < 3.3e–34) suggests a strong relationship between gene connectivity
   and relevance to the tumor phenotype

   Correlation analysis of gene modules and clinical phenotypes was
   performed, and the correlation coefficient between each module and
   phenotype was calculated (Fig. [63]2e). We found that the yellow module
   containing 309 genes had the largest correlation coefficient with
   cancer phenotypes (R = 0.62, Fig. [64]2f). For this reason, we choose
   the yellow module for further analysis.

Screening significant differentially expressed genes

   The R package “DESeq2” was used to perform the differential expression
   analysis of data from 72 normal tissue samples and 541 renal clear cell
   carcinoma samples in TCGA. According to the threshold criteria, a total
   of 2015 significantly down-regulated genes and 4132 significantly
   up-regulated genes were screened (Fig. [65]3a). We made a heatmap of
   some differentially expressed genes, and found that the expression
   levels of genes were significantly different between normal and tumor
   samples (Fig. [66]3b).

Fig. 3.

   [67]Fig. 3
   [68]Open in a new tab

   Identify differentially expressed tumor related genes. (A) The volcano
   of differentially expressed genes in TCGA: red represents significantly
   up-regulated genes, blue represents significantly down-regulated genes;
   (B) The heatmap of 100 differentially expressed genes; (C) The Venn
   shows the intersection of significantly up-regulated genes and
   significantly down-regulated genes with tumor related genes in the
   yellow module. It is found that one tumor related gene is significantly
   down regulated and 264 tumor related genes are significantly up
   regulated; (D) The upset plot shows the intersection of significantly
   up-regulated genes and significantly down-regulated genes with tumor
   related genes in the yellow module

   We intersected 147 significantly differentially expressed genes with
   309 tumor related genes in the yellow module, and obtained 1
   significantly down-regulated gene and 264 significantly up-regulated
   genes (Fig. [69]3c-d).

GO functional enrichment analysis and KEGG pathway enrichment analysis

   To explore the biological functions of tumor related genes, we
   performed GO functional enrichment analysis and KEGG pathway enrichment
   analysis for 265 significantly differentially expressed tumor related
   genes. In biological process (BP), we found that these genes are
   related to immune cell activity (Fig. [70]4a), such as T cell
   promotion, B cell activation, and T cell receiver signaling pathway;
   The cellular components (CC) involved in the genes include secret
   granule membrane, cytoplasmic vascular lumen, ficolin-1-rich granule
   and endothelial vascular (Fig. [71]4b); And the results show the
   enrichment analysis results related to molecular function (MF,
   Fig. [72]4c), such as T cell receptor binding, MHC class I protein
   binding, MHC protein binding and immune receptor activity. In the
   analysis of the KEGG pathway, we found the pathway related to some
   immune cells (Fig. [73]4d), such as Natural Killer cell mediated
   cytotoxicity, T cell receptor signaling pathway, Neutrophil
   extracellular trap formation, Th17 cell differentiation, Th1, and Th2
   cell differentiation.

Fig. 4.

   [74]Fig. 4
   [75]Open in a new tab

   GO enrichment analysis and KEGG pathway enrichment analysis of tumor
   related genes. A-D, the histogram of the enrichment analysis results of
   265 tumor related genes at BP, CC, MF and KEGG levels respectively;
   E-H, cluster diagrams of the enrichment analysis results of 265 tumor
   related genes at BP, CC, MF and KEGG levels respectively. The inner
   circle is a gene, the color represents the size of log2FC, and the
   inner sector represents a gene; An internal sector corresponds to an
   external color sector, indicating that the gene only exists in the term
   corresponding to this color; An internal sector corresponds to multiple
   external sectors, indicating that the gene exists in multiple terms

   Moreover, we also analyzed the relationship between these terms and
   differentially expressed genes. The results show that most BP and CC
   related terms share the same gene (Fig. [76]4e-f); However, MF and KEGG
   mainly show that there is no same gene between terms, and only some
   terms have the same gene (Fig. [77]4g-h).

Four prognostic related genes based on Cox proportional hazard regression
model

   Among tumor related differentially expressed genes, we constructed
   univariate and multivariate Cox proportional hazard regression models
   to further identify genes related to the outcomes of the patient. In
   the univariate Cox proportional hazard regression analysis, 114 genes
   have a potential predictive effect on the prognosis of patients. The
   analysis results of some genes are shown in the forest (Fig. [78]5a).
   Considering that the occurrence and development of the tumor is the
   result of multiple factors, we further performed multivariate Cox
   proportional hazard regression analysis on 114 genes. Meanwhile, a
   total of 27 genes were found to have a statistically significant hazard
   ratio (HR), and HR was consistent in univariate and multivariate Cox
   regression analyses (Fig. [79]5b). Ultimately, among the 27 genes,
   survival analysis found that only 4 genes (C1orf54, P2RX7, SCN1B, and
   TMIGD3) had statistically significant differences in their prognostic
   impact on patients (Fig. [80]5c). ROC curve analysis confirms that all
   4 genes have good predictive power for patient prognosis (Fig. [81]5d).

Fig. 5.

   [82]Fig. 5
   [83]Open in a new tab

   Four genes were associated with prognosis. (A) The forest of some genes
   related to prognosis in univariate Cox proportional risk model; (B) The
   forest of 27 genes related to prognosis in multivariate Cox
   proportional risk model; (C) The survival analysis of 4 genes in TCGA
   datasets; (D) The ROC curve analysis of 4 genes in TCGA datasets; (E)
   The boxplot shows the difference in the expression of 4 genes in normal
   and tumor tissues of the datasets TCGA, [84]GSE126964, and
   [85]GSE167573

   We found that most genes were highly expressed in tumor samples. To
   further verify the expression of 4 genes in tumors, we analyzed several
   data sets. As we expected, 4 genes were highly expressed in tumor
   tissues from multiple datasets (Fig. [86]5e).

Constructing the prognostic related signature

   To explore the comprehensive effect of 4 genes on the prognosis of
   patients, 4 genes were considered as a signature for further analysis.
   The single sample gene set enrichment analysis was performed on the
   signature based on the R package “GSVA”, and the normalized enrichment
   scores (NESs) of the signature in each patient was calculated.

   The optimal cut-off value was calculated according to the NESs, and the
   patients were divided into high score and low score groups.
   Kaplan-Meier survival curve analysis showed that patients in the high
   score group had poor overall survival time, suggesting that the NESs
   may be a potential risk factor for patients (Fig. [87]6a). ROC curve
   analysis revealed that the NESs of the signature had a potential
   predictive effect on patient prognosis (AUC = 0.967, Fig. [88]6a). The
   validation cohort [89]GSE167573 further validated our analysis results
   (AUC = 0.806, Fig. [90]6b). We also analyzed the relationship between
   signature and clinicopathological features. The patients were grouped
   according to gender or location of tumor occurrence, and it was found
   that there was no statistically significant difference in the NESs
   among different patients, but there were significant differences in the
   NESs among patients with different pathological stages (Fig. [91]6c-d).
   It is not difficult to see that the NESs of patients with stages III
   and IV is significantly higher than that of patients with stage I and
   II, which seems to demonstrate that patients with high NESs have poor
   overall survival time (Fig. [92]6e-g). Interestingly, in the M stage,
   we observed that the NESs of patients with metastasis was significantly
   higher than that of patients without metastasis, which confirmed that
   the prognosis of patients with tumor metastasis was worse
   (Fig. [93]6h).

Fig. 6.

   [94]Fig. 6
   [95]Open in a new tab

   Establishment of prognostic related signature. (A) NESs was calculated,
   patients were divided into high score group and low score group
   according to the best cut-off value, and survival analysis and ROC
   curve analysis were conducted in TCGA datasets; (B) NESs was
   calculated, patients were divided into high score group and low score
   group according to the best cut-off value, and survival analysis and
   ROC curve analysis were conducted in [96]GSE167573 datasets; (C) The
   boxplot shows the difference of NESs between different genders; (D) The
   boxplot shows the difference of NESs between different tumor locations;
   E-H showed the differences of NESs among different pathological stages,
   T stages, N stages and M stages respectively; I-L, we compared the
   heterogeneity of 22 immune cells abundance between patients with high
   score and patients with low score in different ways

   Since GO functional enrichment analysis and KEGG pathway enrichment
   analysis suggest that this signature is closely related to immune cell
   function, we compared the abundance of immune cell infiltration between
   patients with high and low scores. The high score group showed a higher
   abundance of tumor immunosuppressive cells such as mast cells,
   monocytes, and regulatory T cells; In the low score group, antitumor
   immune cells, such as NK cells, CR8 + T cells and CD4 + T cells, were
   enriched (Fig. [97]6i-l). The difference of overall survival time
   between the two groups may be related to the heterogeneity of immune
   cell abundance.

Verification of the expression levels of the signature-related genes in
cancer samples and cell lines

   In addition, to verify the accuracy of the signature, we performed the
   in vivo assay. We examined the expression of 4 signature genes in
   multiple kidney renal clear cell carcinoma cell lines and normal kidney
   epithelial cell (Fig. [98]7a-d). The results showed that the expression
   of 3 genes was higher in kidney renal clear cell carcinoma cell lines,
   that is C1orf54, P2RX7, and SCN1B. The expression of 4 genes has been
   further verified in clinical samples of patients with kidney renal
   clear cell carcinoma by the qRT-PCR (Fig. [99]7e).

Fig. 7.

   [100]Fig. 7
   [101]Open in a new tab

   The expression of 4 genes in human tissue specimens and cell lines.
   A-D. Western blotting was employed to test the expression level of 4
   genes in multiple kidney renal clear cell carcinoma cell lines. E,
   qRT-PCR detection of mRNA expression of 4 genes in kidney renal clear
   cell carcinoma patients. F-H. C1orf54, P2RX7, and SCN1B was
   downregulated in cell lines using siRNAs. I-K. the proliferation of
   cells transfected with siRNA against C1orf54, P2RX7, and SCN1B were
   measured using CCK8 assays

   Next, we further explored the functional role of C1orf54, P2RX7, and
   SCN1B using the HEK TE, SK-RC-20 and the OS-RC-2 cell lines. As shown
   in Fig. [102]7f-h, the expression of C1orf54, P2RX7, and SCN1B was
   significantly reduced after transfection of si-C1orf54, si-P2RX7, and
   si-SCN1B in HEK TE, SK-RC-20 and the OS-RC-2 cell lines. CCK-8 assay
   showed that compared to the control group, the growth rate of the HEK
   TE, SK-RC-20 and the OS-RC-2 cell lines transfected with si-C1orf54,
   si-P2RX7, and si-SCN1B significantly decreased (Fig. 7i-k).

Prediction of immunotherapy response of ccRCC by the signature

   As a novel modality to remedy cancer, immunotherapy has been widely
   concerned because of the high response rates of cancer patients to
   immunotherapy. In this research, we wanted to investigate whether the
   signature model can predict the benefit of immunotherapy in patients.
   We first calculated the TIDE scores of TCGA cohort and [103]GSE167573
   cohort to distinguish the response patients and non-response patients
   for immunotherapy. We found that the low score group was more effective
   in responding to immunotherapy, with higher response rates
   (Fig. [104]8a-d). We also used the signature to analyze the
   immunotherapy cohort (CheckMate009, CheckMate010, and CheckMate025)
   [[105]18, [106]19] to verify the suitability of the signature for
   immunotherapy. Firstly, the NESs of each patient in the cohort was
   calculated for survival analysis. Interestingly, in the immunotherapy
   cohort, the high score group still showed a shorter overall survival
   time (Fig. [107]8e). Then, we used ROC curve to analyze the predictive
   efficacy of the NESs on patient prognosis. The results showed that the
   NESs can better predict the sensitivity of patients to immunotherapy
   (AUC = 0.734, Fig. [108]8f). The violin further indicates that the NESs
   significantly reduces the sensitivity of patients to immunotherapy
   (Fig. [109]8g). Finally, we compared the difference in the response to
   immunotherapy between the two groups of patients. The patients in the
   high score group had a lower response rate to immunotherapy, while the
   patients in the low score group were more sensitive to immunotherapy
   (Fig. [110]8h).

Fig. 8.

   [111]Fig. 8
   [112]Open in a new tab

   Analysis of prognosis related signature in immunotherapy cohort. (A)
   The waterfall diagram shows the distribution of patients with different
   immunotherapy response in the TCGA cohort; (B) The histogram shows the
   number of immunotherapy responses in the high and low score groups of
   the TCGA cohort. Chi-square test p-value differences are shown; (C) The
   waterfall diagram shows the distribution of patients with different
   immunotherapy response in the [113]GSE167573 cohort; (D) The histogram
   shows the number of immunotherapy responses in the high and low score
   groups of the [114]GSE167573 cohort. Chi-square test p-value
   differences are shown; (E) Calculating the NESs of each patient in the
   immunotherapy cohort, divide the patients into high score group and low
   score group according to the optimal cut-off value, and conduct
   survival analysis; (F) ROC curve analysis of NESs in immunotherapy is
   used to judge the predictive efficacy of this signature on
   immunotherapy response; (G) The violin shows the difference of NESs
   between the responders and non-responders; (H) The histogram shows the
   number of patients with high score group and low score group responding
   to immunotherapy

Discussion

   The biological process of ccRCC is heterogeneous and its clinical
   course is variable [[115]20]. Therefore, to better diagnose and remedy
   ccRCC, we need to understand its molecular mechanism. In this study, we
   analyzed the gene expression profile of ccRCC in TCGA, which included
   537 renal clear cell carcinoma samples and 76 normal kidney tissue
   samples, to explore the molecular mechanism of ccRCC and to search for
   effective biomarkers or potential therapeutic targets through
   bioinformatics analysis.

   We used WGCNA [[116]21] combined with a large number of clinical
   phenotypic data from ccRCC patients to screen for biomarkers that could
   well distinguish between cancerous and normal tissue. Based on the
   co-expression genes screened in the ccRCC dataset, a gene weighted
   co-expression network was constructed, and 13 modules were identified
   by the dynamic tree cutting method. Correlation analysis showed that
   among the 13 modules, the yellow module had the highest positive
   correlation with tumor tissue. The yellow module contains 309 genes.
   The differential expression analysis of genes expression profile in
   TCGA identified 2015 downregulated genes and 4132 upregulated genes.
   Further analysis revealed that 265 tumor related genes were
   significantly differentially expressed in the expression profile of
   ccRCC.

   Functional and pathway enrichment analysis revealed that 265 genes were
   significantly associated with immune related functions and pathways of
   ccRCC. For example, T cell proliferation, B cell activation, T cell
   receptor signaling pathway, T cell receptor binding, MHC class I
   protein binding, MHC protein binding, immune receptor activity, Natural
   killer cell mediated cytotoxicity, and T cell receptor signaling
   pathway.

   Subsequently, Cox proportional hazard regression model of 265 genes
   with univariate and multivariate identified four genes that were
   significantly associated with outcomes of ccRCC. We found that most of
   these four genes were highly expressed in tumor tissues. To analyze the
   comprehensive effect of four genes on ccRCC, we regard four genes as a
   signature, analyze the variation of the signature, and calculate the
   NESs. Kaplan-Meier survival curve indicates that this signature is a
   risk factor for ccRCC, which is not conducive to the prognosis of
   patients. We used an immunotherapy cohort for validation. The results
   suggest that this gene set is a reliable risk predictor, and patients
   with high NESs may be resistant to immunotherapy, while patients with
   low NESs may be highly sensitive to immunotherapy.

   In this study, we identified four key genes—C1orf54, P2RX7, SCN1B, and
   TMIGD3—that are closely associated with the prognosis of ccRCC. These
   genes were generally highly expressed in tumor tissues and RCC cell
   lines, and their elevated expression was significantly correlated with
   poor patient survival. C1orf54, although its biological function
   remains poorly characterized, may contribute to tumor progression
   through enhanced cell proliferation or immune evasion. P2RX7, a
   purinergic receptor activated by ATP, plays a key role in inflammation,
   immune cell activation, and cell death. In the tumor microenvironment,
   it may facilitate immune escape and resistance to immunotherapy by
   modulating cytokine release and immune cell recruitment. SCN1B,
   typically involved in neuronal excitability, may affect cancer cell
   migration and indirectly impact immune interactions by altering tumor
   immunogenicity. TMIGD3, a member of the immunoglobulin superfamily,
   could influence cell adhesion and immune cell trafficking, though its
   exact mechanisms remain unclear. These genes are involved in various
   immune-related functions such as T cell activation, MHC molecule
   binding, and immune receptor activity, which shape the tumor immune
   microenvironment. Our analysis showed that patients with high
   expression of these genes exhibited increased infiltration of
   immunosuppressive cells (e.g., regulatory T cells and mast cells),
   while antitumor immune cells (e.g., CD8 + T cells and NK cells) were
   relatively less abundant. This indicates a tumor microenvironment more
   prone to immune evasion. Therefore, we speculate that such expression
   patterns may reduce the responsiveness of patients to immunotherapy and
   worsen survival outcomes. Overall, this four-gene signature may have
   potential as an indicator for prognosis assessment and immunotherapy
   response prediction in ccRCC patients.

Conclusion

   In summary, our rigorous bioinformatics analyses led us to uncover four
   key genes, C1orf54, P2RX7, SCN1B, and TMIGD3, significantly associated
   with ccRCC prognosis. This four-gene signature holds promise as a
   potential biomarker for risk assessment in ccRCC prognosis. Notably, it
   also demonstrates the ability to predict patient responsiveness to
   immunotherapy. Specifically, individuals with lower NESs exhibit
   heightened sensitivity to immunotherapy, while those with higher NESs
   display reduced responsiveness to immunotherapy.

Electronic supplementary material

   Below is the link to the electronic supplementary material.
   [117]Supplementary Material 1^ (8.9KB, xlsx)
   [118]Supplementary Material 2^ (1.7MB, pdf)

Acknowledgements