Abstract Abnormal autophagic levels have been implicated in the pathogenesis of multiple cancers, however, its role in tumors is complex and has not yet been explored clearly. Hence, we aimed to explore the prognostic values of autophagy‐related genes (ARGs) for kidney renal clear cell carcinoma (KIRC). Differentially expressed ARGs and transcription factors (TFs) were identified in KIRC patients obtaining from the The Cancer Genome Atlas (TCGA) database. Then, networks between TFs and ARGs, gene ontology functional annotations and Kyoto Encyclopedia of Genes and Genomes pathway enrichment analysis were conducted. Next, we performed consensus clustering, COX regression analysis and Lasso regression analysis to identify the prognostic ARGs. Finally, an individual prognostic index (PI, riskScore) was established. Based on TCGA cohort and ArrayExpress cohort, Survival analysis, ROC curve, independent prognostic analysis, and clinical correlation analysis were also performed to evaluate this PI. Based on differentially expressed ARGs, KIRC patients were successfully divided into two clusters (P = 5.916e‐04). AS for PI, it was constructed based on 11 ARGs and significantly classified KIRC patients into high‐risk group and low‐risk group in terms of OS (P = 4.885e‐15 for TCGA cohort, P = 6.366e‐03 for ArrayExpress cohort). AUC of its ROC curve reached 0.747 for TCGA cohort and 0.779 for ArrayExpress cohort. What's more, this PI was proven to be a valuable independent prognostic factor in both univariate and multivariate COX regression analysis (P < .001). Prognostic nomograms were also performed to visualize the relationship between individual predictors and survival rates in patients with KIRC. By means of connectivity map database, emetine, cephaeline and co‐dergocrine mesilate related to ARGs were found to be negatively correlated with KIRC. This study provided an effective PI for KIRC and also displayed networks between TFs and ARGs. KIRC patients were successfully divided into two clusters based on differentially expressed ARGs. Besides, small molecule drugs related to ARGs were also identified for KIRC. Keywords: autophagy‐related genes, kidney renal clear cell carcinoma, prognostic index __________________________________________________________________ The strength of this article was that we performed a systematic analysis of the roles of autophagy in kidney renal clear cell carcinoma (KIRC) with a robust statistical approach. A novel prognostic index (PI) was successfully established and carefully evaluated. Moreover, networks between autophagy‐related genes (ARGs) and transcription factors were constructed for future basic research. Tumor clustering based on ARGs was effective, indicating ARGs played a vital role in KIRC. Last but not least, small molecule drugs related to ARGs were also identified for KIRC. graphic file with name CAM4-9-7034-g009.jpg 1. INTRODUCTION Autophagy as a conservative metabolic way to maintain the homeostasis of the cell environment in the body, it can be expressed in all cells and selectively recover necrotic, injured and genetically deficient cells or tissues, providing energy for the body or regulating the stability of organ function.[32] ^1 However, the definite role of autophagy in tumors is complex and has not yet been explored clearly. Autophagy has different mechanisms in different types of cancer, tumor microenvironments, tumor stages and so on.[33] ^2 , [34]^3 Jan Karlseder of the Salk Institute in the United States in a recent study further explained the mechanism by which autophagy inhibits cancer in the early stages of cancer that it was closely related to the ‘replication crisis’ caused by telomere fusion during cell division.[35] ^4 With the development of cancer, the role of autophagy in tumor would change from inhibiting cancer to promote cancer survival.[36] ^5 , [37]^6 Even in established tumors, autophagy could also help tumor cells resist a variety of drug treatments.[38] ^7 , [39]^8 Globally, there were 404,300 new cases of renal cell cancer and 175,100 new deaths of this tumor in 2018, ranking 16th in morbidity and mortality among all cancers, making it the third most common tumor in the urinary system.[40] ^9 Although most renal cell carcinoma (RCC) developed slowly, many patients would already have metastasis at the time of detection due to the lack of typical symptoms.[41] ^10 Moreover, renal cancer cells lacked sensitivity to radiotherapy and chemotherapy. Surgery remained the mainstay of therapy, however there were still 1/4 of these patients having a recurrence or metastasis after surgical treatment.[42] ^11 , [43]^12 Due to the lack of accurate predictive markers for the prognosis of patients with RCC, the establishment of an effective prognosis prediction model is of great significance for the management of patients in the whole course of the disease. Thanks to the availability of high‐throughput expression data nowadays, it has become feasible for us to use public database data for analyzing the associations between autophagy‐related genes (ARGs) and the clinical outcomes of kidney renal clear cell carcinoma (KIRC) patients. Here, we explored the associations between ARGs and transcription factors (TFs) and constructed prognosis prediction indexes for KIRC, based on related transcriptome profiling data and clinical information downloaded from the The Cancer Genome Atlas (TCGA) database. Our study was anticipated to provide new insights of autophagy for future work. 2. MATERIALS AND METHODS 2.1. Acquisition and preparation of data Transcriptome profiling data and related clinical information of KIRC were downloaded from TCGA Data Portal ([44]https://tcga-data.nci.nih.gov/tcga/; accessed August 2019) and ArrayExpress ([45]https://www.ebi.ac.uk/arrayexpress/; accessed March 2020). The Human Autophagy Database (HADb, [46]http://autophagy.lu/clustering/index.html) is a dedicated database reserving human ARGs. We did an overlap by comparing the obtained RNA‐seq data with the HADb database. Then, the RNA‐seq data were background corrected and standardized by the R programming language. 2.2. Identification and enrichment analysis of differently expressed ARGs Differently expressed ARGs were carried out by using “Lima” package in R statistical software between KIRC and solid tissue normal samples. The threshold for identification of ARGs was set as adjusted P‐value (FDR) < .05 and |log2fold changes (FC)| > 1. Gene functional enrichment analyses, including gene ontology (GO) functional annotations and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis, were conducted to analyze the biological functions, cellular localization, and signaling pathways of targeted genes. In this study, we performed GO and KEGG enrichment analysis on differentially expressed ARGs by using the “clusterProfiler” R package. 2.3. Identification of differently expressed TFs and construction of a network between TFs and ARGs The Cistrome database ([47]http://www.cistrome.org/) is a comprehensive resource for predicted TF targets and enhancer profiles in cancers. The prediction was from integrative analysis of TCGA expression profiles and public ChIP‐seq profiles. Differently expressed TFs were carried out by using “Lima” package in R statistical software between KIRC and solid tissue normal samples. Correlation test between differently expressed TFs and ARGs was performed by R programming language. Moreover, correlation coefficient at least 0.4 corresponding to a P < .01 were selected as the significantly correlated. 2.4. Cluster analysis In order to show whether autophagy has an important impact on the overall prognosis of patients with KIRC, consensus clustering was performed to divide patients into clusters based on the differently expressed ARGs. “ConsensusClusterPlus” package in R statistical software was adopted to perform the Consensus clustering. 2.5. Establishment of an independent prognostic index (PI, riskScore) based on ARGs In order to identify the key ARGs, a univariate COX regression analysis was firstly performed by us to exclude some ARGs with little prognostic value. Subsequently, a multivariate Cox regression analysis was utilized to remove the genes that might not be an independent indicator in prognosis monitoring. In addition, in order to prevent the occurrence of overfitting, we also used Lasso regression to remove key ARGs highly correlated with one other. According to the weight of each gene in Lasso regression analysis, we finally obtained the correlation coefficient in the model formula for predicting the prognosis of patients. Combined with the expression of various prognosis‐related genes, we established an independent prognostic model. The PI (riskScore) was calculated using the following formula β1 × gene1 expression + β2 × gene2 expression + ⋯ + βn × genen expression, where β corresponded to the correlation coefficient. 2.6. Evaluation of the prognostic index in TCGA cohort and ArrayExpress cohort According to our prognostic model, each patient in TCGA cohort and ArrayExpress cohort will get a risk score. In each cohort, we set the median risk score as the cutoff value for dividing KIRC patients into a high‐risk group and a low‐risk group, respectively. Kaplan‐Meier (K‐M) method was utilized to plot the survival curves, and the log‐rank test was performed to assess differences in the survival rates between high‐risk group and low‐risk group. The receiver operating characteristic curves (ROC) were created by the “survivalROC” package, and the area under the curve (AUC) values was calculated to evaluate the specificity and sensitivity of the model. The riskScore distribution of patients in different risk groups, the number of censored patients, and the heatmap of prognosis‐related ARGs were also displayed. A prognostic nomogram was also performed to visualize the relationship between individual predictors and survival rates in patients with KIRC based on the Cox proportional hazard regression model by means of “rms” package of R software. C‐index and the Calibration curves were used to evaluated the performance of the prognostic nomogram. To further evaluate whether our model can be used as an independent prognostic factor, we included age, gender, stage, race, grade, T, M, N, and PI as independent variables. And then we did univariate cox regression analysis and multivariate cox regression analysis on the changes of survival time and survival outcome. Multivariate ROC curves were also made to evaluate the prognostic value of each variable. Finally, we combined various clinical variables and riskScore to make a new nomogram to predict the survival outcome of patients in different cohorts. In addition, we also made a clinical correlation analysis to analyze the correlation between PI and clinical features such as age, gender, stage, race, grade, T, M, N. Besides, the correlation between each prognosis‐related ARGs and clinical features such as age, gender, stage, race, grade, T, M, N were also analyzed. 2.7. Identification of candidate small molecule drugs Connectivity map (cMap), as a gene expression profiles database led by Todd Golub and Eric Lander, it facilitated researchers to quickly identify molecule drugs highly correlated with diseases and discover its possible mechanism.[48] ^13 Up‐regulated and down‐regulated ARGs related to KIRC were uploaded and then functional connection between genes and bioactive chemicals was explored. Connectivity scores ranging from −1 to 1 were utilized to estimate how closely a compound is connected to the query signature. Positive score indicated that the query signature could be promoted by a drug, while a negative score could be repressed by a drug in cMap. 2.8. Statistical analysis Statistical analyses of all data utilized in this article were completed by R software (version 3.4.1, [49]https://www.r-project.org/). When the difference met a joint satisfaction of FDR < 0.05 and |log2FC| > 1, it was regarded to be statistically significant. “ConsensusClusterPlus” package was adopted to perform the Consensus clustering. The univariate and multivariate COX regression analysis were used to evaluate the relationship between ARGs expression and survival data to establish a prognostic model. “rms” package of R software was used to create the nomogram. The receiver operating characteristic curves were created by the “survivalROC” package of R and AUC values were also calculated by this package too. All statistical tests were two‐sided and P < .05 was considered to be statistically significant. 3. RESULTS 3.1. Differentially expressed ARGs The flow diagram for this study was displayed in Figure [50]S1. Through the online TCGA database, we obtained the RNA sequences and clinical information of 539 KIRC samples and matched 72 solid tissue normal samples. By comparing autophagy‐related genes from HADb, we finally obtained the expression of 232 relevant genes. In order to further screen out valuable differentially expressed ARGs, we set the joint satisfaction of FDR < 0.05 and |log2FC| > 1 to the filtration condition. Heatmap of differently expressed ARGs was presented in Figure [51]1A. Figure [52]1B was a volcano map showing 9 down‐regulated and 36 up‐regulated differentially expressed ARGs. Boxplot of these ARGs was detailed in Figure [53]1C. Figure 1. Figure 1 [54]Open in a new tab Differentially expressed autophagy‐related genes (ARGs); A, Heatmap of differentially expressed ARGs; B, Volcano map of differentially expressed ARGs; C, Boxplot of differentially expressed ARGs 3.2. Functional annotation of differentially expressed ARGs In order to better understand the functions and mechanisms of these ARGs, we analyzed the enrichment of GO terms function and KEGG pathway. The results of the functional enrichment analysis are summarized in Figure [55]2. Table [56]1 lists the top 10 main GO entries and the KEGG pathways. In terms of biological processes, these differential genes are mainly concentrated in autophagy, regulation of peptidase, and endopeptidase activity and so on. Separately, autophagosome is the highest enrichment level in GO terms for cellular components, protein heterodimerization activity, and peptidase regulator activity were most enriched GO terms for molecular function (Figure [57]2A; Table [58]1). In addition, the results of KEGG pathway enrichment analysis were shown in Figure [59]2B, which shows that these differentially expressed ARGs were closely related to Human cytomegalovirus infection, Autophagy animal, HIF‐1 signaling pathway, and other functional pathways. We also show the correlation between these differentially expressed ARGs and the related pathways in the form of heatmap (Figure [60]2C). Figure 2. Figure 2 [61]Open in a new tab Functional annotation of differentially expressed autophagy‐related genes (ARGs); A, The bubble plot of enriched gene ontology (GO) terms. Greed circles correspond to the biological process, red indicates the cellular component, and blue shows the molecular function category. B, Circle diagram of Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. Red circles display up‐regulation and blue ones down‐regulation; C, Heatmap of KEGG pathways; The color of each block depends on the logFC values Table 1. Gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis of differentially expressed autophagy‐related genes (ARGs) Category ID Term P‐value Genes BiologicalProcess GO:0043 281 Regulation of cysteine‐type endopeptidase activity involved in apoptotic process .0000000 BAX/FAS/CASP4/NLRC4/MYC/VEGFA/TP63/CASP1/BID/RACK1/BIRC5 BiologicalProcess GO:0052 548 Regulation of endopeptidase activity .0000000 BAX/FAS/CASP4/NLRC4/MYC/VEGFA/TP63/CASP1/BID/GAPDH/RACK1/BIRC5/SERPINA1 BiologicalProcess GO:2000 116 Regulation of cysteine‐type endopeptidase activity .0000000 BAX/FAS/CASP4/NLRC4/MYC/VEGFA/TP63/CASP1/BID/RACK1/BIRC5 BiologicalProcess GO:0052 547 Regulation of peptidase activity .0000000 BAX/FAS/CASP4/NLRC4/MYC/VEGFA/TP63/CASP1/BID/GAPDH/RACK1/BIRC5/SERPINA1 BiologicalProcess GO:0006 914 Autophagy .0000000 RAB24/IFNG/ATG12/BNIP3/CASP1/RGS19/HIF1A/VMP1/GAPDH/ATG9B/ATG16L2/MTOR/ GABARAPL1 BiologicalProcess GO:0061 919 Process utilizing autophagic mechanism .0000000 RAB24/IFNG/ATG12/BNIP3/CASP1/RGS19/HIF1A/VMP1/GAPDH/ATG9B/ATG16L2/MTOR/ GABARAPL1 BiologicalProcess GO:0097 193 Intrinsic apoptotic signaling pathway .0000000 BAX/CASP4/BNIP3/TP73/P4HB/ERO1A/TP63/HIF1A/BID/RACK1 BiologicalProcess GO:0070 482 Response to oxygen levels .0000000 FAS/MYC/BNIP3/P4HB/VEGFA/ERO1A/CXCR4/CASP1/HIF1A/MTOR BiologicalProcess GO:2001 233 Regulation of apoptotic signaling pathway .0000000 BAX/FAS/BNIP3/TP73/P4HB/TP63/HIF1A/BID/CX3CL1/RACK1 BiologicalProcess GO:1904 951 Positive regulation of establishment of protein localization .0000002 IFNG/TP73/ERBB2/TP63/CASP1/HIF1A/BID/GAPDH/EGFR/RACK1 CellularComponent GO:0005 776 Autophagosome .0000001 RAB24/ATG12/VMP1/ATG9B/ATG16L2/GABARAPL1 CellularComponent GO:0000 421 Autophagosomemembrane .0000010 VMP1/ATG9B/ATG16L2/GABARAPL1 CellularComponent GO:0061 702 Inflammasomecomplex .0000058 CASP4/NLRC4/CASP1 CellularComponent GO:0000 407 Phagophore assembly site .0000740 ATG12/VMP1/ATG9B CellularComponent GO:0005 741 Mitochondrialouter membrane .0008724 BAX/BNIP3/BID/MTOR CellularComponent GO:0031 968 Organelle outer membrane .0013680 BAX/BNIP3/BID/MTOR CellularComponent GO:0019 867 Outermembrane .0014187 BAX/BNIP3/BID/MTOR CellularComponent GO:0005 774 Vacuolarmembrane .0030639 VMP1/ATG9B/ATG16L2/MTOR/GABARAPL1 CellularComponent GO:0005 793 Endoplasmicreticulum‐Golgiintermediatecompartment .0032226 P4HB/VMP1/SERPINA1 CellularComponent GO:0044 445 Cytosolicpart .0032461 CASP4/NLRC4/CASP1/RACK1 MolecularFunction GO:0046 982 Proteinheterodimerizationactivity .0000248 BAX/BNIP3/P4HB/VEGFA/ERBB2/HIF1A/BID/EGFR MolecularFunction GO:0061 134 Peptidase regulatoractivity .0000143 NLRC4/CASP1/GAPDH/RACK1/BIRC5/SERPINA1 MolecularFunction GO:0004 857 Enzyme inhibitor activity .0003877 NLRC4/CDKN2A/GAPDH/RACK1/BIRC5/SERPINA1 MolecularFunction GO:0005 126 Cytokine receptor binding .0005663 IFNG/VEGFA/BID/CX3CL1/CCR2 MolecularFunction GO0031 625 Ubiquitin protein ligase binding .0009118 CXCR4/HIF1A/BID/EGFR/GABARAPL1 MolecularFunction GO0044 389 Ubiquitin‐like protein ligase binding .0011111 CXCR4/HIF1A/BID/EGFR/GABARAPL1 MolecularFunction GO0048 018 Receptor ligand activity .0057178 IFNG/IL24/VEGFA/CX3CL1/NRG3 MolecularFunction GO0019 903 Protein phosphatase binding .0002023 SPHK1/ERBB2/EGFR/RACK1 MolecularFunction GO0019 902 Phosphatasebinding .0007477 SPHK1/ERBB2/EGFR/RACK1 MolecularFunction GO0004 866 Endopeptidaseinhibitoractivity .0008342 NLRC4/GAPDH/BIRC5/SERPINA1 KEGG PATHWAY hsa05163 Humancytomegalovirusinfection .0000000 BAX/FAS/CDKN2A/MYC/VEGFA/CXCR4/BID/EIF4EBP1/CX3CL1/EGFR/MTOR KEGG PATHWAY hsa04140 Autophagy ‐ animal .0000000 ATG12/BNIP3/HIF1A/VMP1/ATG9B/PRKCQ/ATG16L2/MTOR/GABARAPL1 KEGG PATHWAY hsa04066 HIF‐1 signaling pathway .0000000 IFNG/VEGFA/ERBB2/HIF1A/EIF4EBP1/GAPDH/EGFR/MTOR KEGG PATHWAY hsa01524 Platinum drug resistance .0000013 BAX/FAS/CDKN2A/ERBB2/BID/BIRC5 KEGG PATHWAY hsa05219 Bladdercancer .0000015 CDKN2A/MYC/VEGFA/ERBB2/EGFR KEGG PATHWAY hsa05212 Pancreaticcancer .0000015 BAX/CDKN2A/VEGFA/ERBB2/EGFR/MTOR KEGG PATHWAY hsa01521 EGFR tyrosine kinase inhibitor resistance .0000021 BAX/VEGFA/ERBB2/EIF4EBP1/EGFR/MTOR KEGG PATHWAY hsa05167 Kaposisarcoma‐associatedherpesvirusinfection .0000026 BAX/FAS/MYC/VEGFA/HIF1A/BID/MTOR/GABARAPL1 KEGG PATHWAY hsa04012 ErbB signaling pathway .0000032 MYC/ERBB2/EIF4EBP1/EGFR/MTOR/NRG3 KEGG PATHWAY hsa04136 Autophagy ‐ other .0000164 ATG12/ATG9B/MTOR/GABARAPL1 [62]Open in a new tab 3.3. Differentially expressed TFs By comparing genetic sequences data with TFs from Cistrome, we finally obtained the expression of 317 relevant TFs. In order to further screen out valuable differentially expressed TFs, we set the joint satisfaction of FDR < 0.05 and |log2FC| > 1 to the filtration condition. Heatmap of differently expressed ARGs was presented in Figure [63]3A. Figure [64]3B was a volcano map showing 19 down‐regulated and 41 up‐regulated differentially expressed TFs. Networks between TFs and ARGs were detailed in Figure [65]3C. Figure 3. Figure 3 [66]Open in a new tab Differentially expressed transcription factors (TFs); A, Heatmap of differentially expressed TFs; B, Volcano map of differentially expressed TFs; C, A network shows the relationship between TFs and ARGs 3.4. Identification of clusters for KIRC based on ARGs Furthermore, we did consensus clustering for patients with KIRC based on ARGs. Figure [67]4A‐C suggested that satisfactory clustering effect could be obtained when k = 3. However, Figure [68]4D‐E suggested that k = 2 is the best option. Finally, patients with KIRC were divided into two groups (cluster 1 and cluster 2). Then, the clinical characteristics and survival curves of these two groups were analyzed. From Figure [69]4G, we found that there was a significant correlation between tumor stage, grade, age, fustat, and clustering. According to Kaplan‐Meier analysis, we noticed that the survival of patients in cluster 2 was worse than that in cluster 1 (Figure [70]4H). Considering the distribution of clinical features and survival curve, this clustering method had a certain significance. Figure 4. Figure 4 [71]Open in a new tab Identification of two clusters of kidney renal clear cell carcinoma (KIRC) patients that exhibited distinct ARG features and clinical outcomes using consensus clustering; A, Cumulative distribution function for k = 2 to 9; B, Relative change in the area under the CDF curve for k = 2 to 9. C, Tracking plot for k = 2 to 9. D–F, Consensus clustering matrix for k = 2, 3, and 4. G, Heatmap of the consensus matrix. *P < .05; ***P < .001; H, Kaplan‐Meier OS curves for the KIRC patients stratified by two clusters 3.5. Construction of a prognostic model index (PI, riskScore) based on ARGs Based on the obtained differentially expressed ARGs, we carried out univariate and multivariate COX regression analysis, respectively, to evaluate the prognostic value of these ARGs (Figure [72]S2A,B). According to the results of multivariate cox regression analysis, we obtained 11 risk ARGs. To avoid overfitting the model, we further took Lasso regression (Figure [73]2C‐D). Finally, 11 risk ARGs were obtained. According to the coefficient of each differentially expressed ARGs in Lasso regression, we then constructed a PI to predict the prognosis of patients with KIRC. The 11 prognostic ARGs related PI formula was as follows: riskScore = CASP4 expression × 0.409410245939865 + IFNG expression × 0.247091026343113 + BAG1 expression × (−0.31339800616801) + BNIP3 expression × (−0.312754657270375) + ERBB2 expression × 0.230285057472967 + RGS19 expression × (−0.336769784907294) + BID expression × 0.553711988544078 + EIF4EBP1 expression × 0.23902996965133 + CX3CL1 expression × (−0.26126419480746) + PRKCQ expression × (−0.409509859853768) + ATG16L2 expression × 0.241519437514572. After obtaining the PI (riskScore) based on ARGs for predicting the prognosis of KIRC patients, we got the riskScore of each patient in TCGA cohort. Then, we divided patients into two groups (high‐risk group and low‐risk group) according the median riskScore. Next, we evaluated this model in TCGA cohort from the following aspects: clinical characteristics, survival curve, ROC curve, and prognostic nomogram (Figures [74]5 and [75]6). Figure [76]5E showed that the higher the risk scores, the higher the patients in high‐risk group, and the higher the numbers of dead persons. The heatmap of these 11 key genes expression profiles in the TCGA dataset was also detailed in this figure. Kaplan‐Meier plot represents that patients in the high‐risk group had significantly shorter overall survival time than those in the low‐risk group (P = 4.885e‐15, Figure [77]5A). From the ROC curve of Figure [78]5C, AUC of this model for predicting prognosis reached 0.747, having a moderate prediction accuracy. In addition, a prognostic nomogram was created to quantify the relationship between these risk genes and survival. From this nomogram, we could obtain the total points and estimate the 1‐year, 2‐year, and 3‐year survival rate of each patient (Figure [79]6A). Table [80]2 showed the evaluation results for this nomogram (the C‐index and the AUC). The Calibration curves (Figure [81]6C‐D) further clarified the accuracy of this nomogram. Figure 5. Figure 5 [82]Open in a new tab Evaluation of prognostic index (riskScore) based on autophagy‐related genes (ARGs) for kidney renal clear cell carcinoma (KIRC) patients; A, Kaplan‐Meier plot based on TCGA cohort; B, Kaplan‐Meier plot based on ArrayExpress cohort; C, ROC curve based on TCGA cohort; D, ROC curve based on ArrayExpress cohort; E, Clinical characteristics in TCGA database (in order from top to bottom): The risk score distribution of KIRC patients in high and low risk groups; The overall survival status distribution of KIRC patients with increasing risk score; The heatmap of the 11 key genes expression profiles in the TCGA dataset; F, Clinical characteristics in ArrayExpress database (in order from top to bottom): The risk score distribution of KIRC patients in high and low risk groups; The overall survival status distribution of KIRC patients with increasing risk score; The heatmap of the 11 key genes expression profiles in the ArrayExpress dataset Figure 6. Figure 6 [83]Open in a new tab Diagnostic nomograms to clarify the relationship between risk genes and overall survival; A, A nomogram for TCGA cohort; B, A nomogram for ArrayExpress cohort; C, the Calibration curve of nomogram‐predicted probability of 3‐Year survival based on TCGA cohort; D, the Calibration curve of nomogram‐predicted probability of 5‐year survival based on TCGA cohort; E, the Calibration curve of nomogram‐predicted probability of 3‐year survival based on ArrayExpress cohort; F, the Calibration curve of nomogram‐predicted probability of 5‐year survival based on ArrayExpress cohort Table 2. Evaluation results of nomograms Cohort Nomogram composed of risk genes Nomogram composed of clinical characteristics and riskScore C‐index AUC of 1‐y ROC AUC of 3‐y ROC AUC of 5‐y ROC C‐index AUC of 1‐y ROC AUC of 3‐y ROC AUC of 5‐y ROC TCGA cohort 0.7149080 0.744 0.729 0.760 0.8033716 0.861 0.806 0.800 ArrayExpress cohort 0.8278069 0.800 0.850 0.834 0.8726003 0.895 0.897 0.861 [84]Open in a new tab 3.6. Verification of the model in external cohort In order to verify whether our model was reliable, we used it to analyze the external cohort from ArrayExpress database (E‐MTAB‐1980). The external cohort contained 101 KIRC patients. Similarly, we calculated the riskScore of each patient based on PI, and divided the patients into high‐risk group and low‐risk group according to the cut‐off value we obtained in the TCGA cohort. A Kaplan‐Meier curve based on the log‐rank test and the ROC curve were created to visualize the prognostic value of our established prognostic model in external cohort (Figure [85]5B,D). The areas under the ROC (AUC) values of PI was 0.779. Figure [86]5F showed that the higher the risk scores, the higher the patients in high‐risk group, and the higher the numbers of dead persons. The heatmap of these 11 key genes expression profiles in the external cohort was also detailed in this figure. In addition, a prognostic nomogram was also created to quantify the relationship between these risk genes and survival in external cohort (Figure [87]6B). The corresponding evaluation results of this nomogram are shown in Table [88]2 and Figure [89]6E, F. 3.7. Independent prognostic factor evaluation and correlation with clinical characteristics To further evaluate whether our model could be used as an independent prognostic factor, we included age, gender, stage, race, grade, T, M, N and riskScore as independent variables. By means of univariate and multivariate cox regression analysis, our established PI (riskScore) remained significant (both P < .001, Figure [90]7A,B, Table [91]3). Figure [92]7C presented the multiple ROC curves according to riskScore, age, gender, race, grade, stage, T, N, M. The AUC of the ROC curve made by riskScore and stage was among the largest two (0.747 and 0.800 respectively). We next made a prognostic nomogram to quantify the relationship between clinical traits and survival in TCGA cohort and in external cohort, respectively, and its evaluation (Figure [93]8; Table [94]2). Figure 7. Figure 7 [95]Open in a new tab Independent prognostic factor evaluation based on TCGA dataset; A, Univariate cox regression analysis; B, Multivariate cox regression analysis; C, Multiple ROC curves according to risk score, age, gender, race, grade, stage, T, N, M Table 3. Univariate and multivariate analyses of OS for kidney renal clear cell carcinoma patients based on TCGA Characteristics Univariate Cox Multivariate Cox HR HR.95L HR.95H P‐value HR HR.95L HR.95H P‐value Age 1.03091297 1.0173015 1.04470657 7.14E‐06 1.03111693 1.01631139 1.04613815 3.29E‐05 Gender 0.93989482 0.68296808 1.2934752 .70359364 1.05777612 0.75494048 1.48209077 .74412445 Race 1.17571796 0.70579892 1.95850782 .53411201 1.06556943 0.63023642 1.8016068 .81264081 Grade 1.97156097 1.64097145 2.36875092 4.20E‐13 1.20198028 0.95240081 1.51696279 .12131815 STAGE 1.88044574 1.66452924 2.12437012 3.38E‐24 1.88332306 1.33728269 2.65232308 .00029047 T 2.04314639 1.7242097 2.42107858 1.57E‐16 0.95256084 0.72346258 1.25420746 .7291504 M 2.1356788 1.68860772 2.70111518 2.42E‐10 0.67454412 0.36606895 1.24296195 .20676067 N 0.86208791 0.73740524 1.00785231 .06262676 0.86765667 0.73651849 1.02214418 .08951048 riskScore 2.83466442 2.34467305 3.42705452 5.28E‐27 2.06352682 1.66755103 2.55353082 2.67E‐11 [96]Open in a new tab Bold fonts represents that P value is <.05. Figure 8. Figure 8 [97]Open in a new tab Diagnostic nomograms to clarify the relationship between clinical characters, riskScore and prognosis; A, A nomogram for TCGA cohort; B, A nomogram for ArrayExpress cohort; C, the Calibration curve of nomogram‐predicted probability of 3‐year survival based on TCGA cohort; D, the Calibration curve of nomogram‐predicted probability of 5‐year survival based on TCGA cohort; E, the Calibration curve of nomogram‐predicted probability of 3‐year survival based on ArrayExpress cohort; F, the calibration curve of nomogram‐predicted probability of 5‐year survival based on ArrayExpress cohort In order to further evaluate the relationship between 11 prognostic ARGs, riskScore, and clinical characteristics, we further made the independent t tests. Table [98]4 detailed the riskScore is significantly related to stage, T, M, and grade (P < .05). Besides, the correlation between 11 prognosis‐related ARGs and clinical features such as age, gender, stage, race, grade, T, M, N was also analyzed. Bold fonts represented that P value was <.05. Table 4. Correlation analysis between 11 prognostic ARGs, riskScore and clinical characteristics ID Ag e Gender Race Grade Stage T M N Coef P‐value Coef P‐value Coef P‐value Coef P‐value Coef P‐value Coef P‐value Coef P‐value Coef P‐value CASP4 0.429 .668 −1.627 .105 3.819 .148 −4.145 .000 −5.615 .000 −4.951 .000 −2.845 .005 −0.669 .504 IFNG −0.081 .936 −0.900 .369 9.066 .011 −4.992 .000 −5.112 .000 −4.495 .000 −2.832 .005 0.548 .584 BAG1 −0.451 .652 1.487 .138 29.483 .000 5.117 .000 6.740 .000 6.478 .000 1.478 .141 −1.062 .289 BNIP3 −0.744 .458 2.353 .019 5.198 .074 3.224 .001 1.967 .050 2.223 .027 0.217 .828 −0.660 .510 ERBB2 0.532 .595 2.390 .017 17.158 .000 4.826 .000 5.770 .000 5.870 .000 1.040 .300 −1.176 .240 RGS19 −0.359 .720 −1.122 .262 0.500 .779 −5.965 .000 −5.249 .000 −4.434 .000 −2.717 .007 −0.579 .563 BID −1.152 .250 −1.605 .110 0.327 .849 −5.110 .000 −6.542 .000 −5.611 .000 −4.407 .000 −0.627 .531 EIF4EBP1 −2.547 .011 −0.279 .781 3.076 .215 −5.309 .000 −6.417 .000 −5.680 .000 −4.810 .000 −1.745 .082 CX3CL1 0.706 .481 3.987 .000 5.456 .065 4.753 .000 4.006 .000 4.420 .000 0.304 .762 −1.011 .312 PRKCQ 0.786 .432 −2.983 .003 2.306 .316 1.829 .068 2.197 .029 2.382 .018 0.239 .812 −1.254 .210 ATG16L2 −0.842 .400 2.084 .038 9.989 .007 0.413 .680 −0.877 .381 −0.953 .341 −2.596 .010 −0.168 .867 riskScore −1.540 .125 −0.501 .617 0.226 .893 −4.995 .000 −4.537 .000 −4.158 .000 −2.747 .007 −0.293 .770 [99]Open in a new tab Bold fonts represents that P value is <.05. Abbreviations: ARGs, autophagy‐related genes; Coef, correlation coefficient. 3.8. Identification of relevant small molecule drugs cMap database was utilized to screen out candidate small molecule drugs related to ARGs of KIRC. Based on differently expressed ARGs of KIRC, the most significantly related small molecule drugs were identified. P < .05, |mean| > 0.5 and n ≥ 4 were set as the threshold. As detailed in Table [100]5, 10 small molecule drugs were negatively correlated with KIRC containing emetine, cephaeline, co‐dergocrine mesilate, tobramycin, fluvastatin, piribedil, pivampicillin, saquinavir, methylprednisolone, and ifenprodil, indicating the potential to repress this disease. Four small molecule drugs were positively correlated with KIRC containing thioproperazine, copper sulfate, carbachol, and bambuterol, indicating the potential to promote this disease. Table 5. Results of connectivity map (cMap) analysis Rank cMap name Mean n Enrichment P Specificity Percent on‐null 1 Emetine −0.654 4 −0.788 .0041 0.0824 100 2 Cephaeline −0.628 5 −0.78 .0009 0.1145 100 3 Co‐dergocrine mesilate −0.569 4 −0.762 .00656 0.0226 100 4 Tobramycin −0.549 4 −0.813 .00229 0 100 5 Fluvastatin −0.549 4 −0.788 .00408 0 100 6 Piribedil −0.545 4 −0.781 .00475 0.01 100 7 Pivampicillin −0.535 4 −0.767 .00593 0 100 8 Saquinavir −0.527 4 −0.744 .00851 0.0114 100 9 Methylprednisolone −0.522 4 −0.733 .01026 0.0223 100 10 Ifenprodil −0.502 4 −0.717 .01313 0.0402 100 11 Thioproperazine 0.547 5 0.826 .00032 0 100 12 Copper sulfate 0.612 4 0.877 .0003 0.0057 100 13 Carbachol 0.613 4 0.897 .0001 0 100 14 Bambuterol 0.728 4 0.872 .00036 0 100 [101]Open in a new tab 4. DISCUSSION The concept of transformational medicine was first put forward in < lancet>, emphasizing clinical application as the center, to transform the results of basic scientific research into valuable clinical applications.[102] ^14 Currently, surgical resection remained the main method for the treatment of renal cell carcinoma, however these postoperative patients still had a high possibility of recurrence and their survival status varied differently.[103] ^15 , [104]^16 Hence, an effective way to predict the prognosis of renal cell carcinoma was of great significance to guide the whole process managing patients with renal cell carcinoma. At present, TMN staging, UISS risk grading system and SSIGN scoring system provided a certain reference value for evaluating the prognosis of renal cell carcinoma patients [[105] ^17 , [106]^18 , [107]^19 ]. Moreover, some prognostic molecular markers such as p53 and PTEN had also been further explored by researchers,[108] ^20 but their efficiency was not so satisfactory. Due to the development of high‐throughput sequencing, it had become feasible for us to use public database (TCGA, GEO or other databases) data for analyzing the associations between different key genes and the clinical outcomes of KIRC. Not long ago, a study by Wang et al have successfully established an autophagy‐clinical prognostic index in bladder cancer patients.[109] ^21 In this article, we not only constructed a prognosis prediction index for KIRC in both TCGA and ArrayExpress databases, but also explored associations between ARGs and TFs. Moreover, we successfully divided KIRC patients into two clusters based on differentially expressed ARGs. Our study was anticipated to provide new insights of autophagy for future work. We made full use of the RNAseq data in the TCGA database to find autophagy related genes with high correlation with KIRC survival. Finally, we obtained 45 differentially expressed ARGs, and analyzed their functions including GO analysis and KEEG pathways. Preliminary analysis showed that the expression of these ARGs is mostly up‐regulated in some of the most important pathways, which provided us with a reference that autophagy might play a role in promoting tumor development. However, the expressions of some ARGs were down‐regulated, which might be related to the complex mechanism of autophagy in tumors.[110] ^22 Interestingly, we also performed a consensus clustering analysis of existing renal cancer patients based on these ARGs, and KIRC patients were successfully divided into two clusters with significant differences in overall survival, indicating that ARGs might play an important role in the prognosis of patients with KIRC. By means of cMap database, 10 small molecule drugs were negatively correlated with KIRC containing emetine, cephaeline, co‐dergocrine mesilate, tobramycin, fluvastatin, piribedil, pivampicillin, saquinavir, methylprednisolone, and ifenprodil, indicating the potential to repress this disease. Four small molecule drugs were positively correlated with KIRC containing thioproperazine, copper sulfate, carbachol, and bambuterol, indicating the potential to promote this disease. By means of univariate COX regression analysis, multivariate COX regression analysis and Lasso regression analysis, we ultimately obtained 11 key ARGs (CASP4, IFNG, BAG1, BNIP3, ERBB2, RGS19, BID, EIF4EBP1, CX3CL1, PRKCQ, and ATG16L2). Most of these ARGs had been reported and were consistent with the role in our study. Therein, BNIP3 was an interacting protein of BCL2, which was considered to be a surface receptor of mitochondria, regulating cell death and promoting survival in some diseases.[111] ^23 , [112]^24 BID played a similar role by activating BAX/BAK [113]^25 and ERBB2, also known as HER2, was a member of the human epidermal growth factor receptor family, promoting cell proliferation, survival, and playing an important role in the occurrence and development of tumor. Clinically, ERBB2 mutation had become an important target for cancer therapy. CX3CL1 was the ligand of chemokine CX3CR1, which could promote tumor infiltrating cells into tumor microenvironment and play the role of immunotherapy.[114] ^26 As for IFNG, it was believed that IFNG could affect the blocking of immune checkpoints.[115] ^27 Eukaryotic initiation factor 4e binding protein (EIF4EBP1), as an important gene regulating autophagy, had been found to be highly expressed in many cancers with poor prognosis of tumor.[116] ^28 PRKCQ was an important kinase in the activation of T cells. Its phosphorylation induced the activation of Fra‐1 and played an important role in tumor recurrence and invasion.[117] ^29 , [118]^30 CASP4, as a kind of human apoptotic protease, was considered to be related to inflammation, immune activity and apoptosis.[119] ^31 , [120]^32 It has also been found to be related to the poor prognosis of esophageal cancer, colorectal cancer and breast cancer.[121] ^33 , [122]^34 , [123]^35 Contrary with the reported results, the HR value of BAG1 was <1 suggesting that it was associated with a better prognosis,.[124] ^36 , [125]^37 ATG16L2 had been found to be associated with positive prognosis.[126] ^38 , [127]^39 , [128]^40 However, the results of our multivariate COX regression analysis suggested that ATG16L2 played an opposite role in KIRC patients. This also brought us new thinking about the roles of BAG1 and ATG16L2 in kidney cancer. Further, based on these 11 prognostic ARGs and clinical characteristics in both TCGA and ArrayExpress databases, an individualized KIRC PI (riskScore) was established. The strength of this article was that we performed a systematic analysis of the roles of autophagy in KIRC with a robust statistical approach. The KIRC PI was successfully established and carefully evaluated in TCGA cohort and ArrayExpress cohort. Moreover, networks between ARGs and TFs were constructed for future basic research. Last but not least, tumor clustering based on ARGs was effective, indicating that ARGs played a vital role in KIRC. However, the limitations of the present article should not be ignored. On the one hand, we only discussed the relationship between ARGs and the prognosis of KIRC patients, without further clarifying the specific mechanism. On the other hand, the results of our study were only validated in the KIRC patient data in the TCGA database and ArrayExpress database. Retrospective data analysis made our prediction model valuable in the training set. Whether it had real application value or not, required more data support from clinical patients. 5. CONCLUSIONS Taken together, an individualized KIRC PI (riskScore) was successfully established in both TCGA and ArrayExpress databases. Based on clinical characteristics and 11 key ARGs (CASP4, IFNG, BAG1, BNIP3, ERBB2, RGS19, BID, EIF4EBP1, CX3CL1, PRKCQ, and ATG16L2), our study realized the transformation of a large number of sequencing data and clinical features to the clinical diagnosis and treatment methods. Besides, networks between TFs and ARGs were also displayed and KIRC patients were successfully divided into two clusters based on differentially expressed ARGs. Last but not least, small molecule drugs related to ARGs were also identified for KIRC. Our findings were anticipated to provide new insights of autophagy for future work. CONFLICT OF INTEREST None declared. AUTHORS CONTRIBUTIONS Y.W: Protocol/project development; R.C: Data collection or management; BY.Z: Data analysis; QW.X, CJ.J: Manuscript writing/editing. ETHICS APPROVAL Not applicable. CONSENT FOR PUBLICATION Not applicable. Supporting information Fig S1 [129]Click here for additional data file.^ (13.2MB, tif) Fig S2 [130]Click here for additional data file.^ (2.1MB, tif) Supplementary Material [131]Click here for additional data file.^ (2.2MB, txt) Supplementary Material [132]Click here for additional data file.^ (370.6KB, txt) Supplementary Material [133]Click here for additional data file.^ (7.6KB, txt) ACKNOWLEDGMENTS