Abstract

   Human clear cell renal cell carcinoma (ccRCC) is the most common solid
   lesion within kidney, and its prognostic is influenced by the
   progression covering a complex network of gene interactions. In our
   study, we screened differential expressed genes, and constructed
   protein-protein interaction (PPI) network and a weighted gene
   co-expression network to identify key genes and pathways associated
   with the progression of ccRCC (n = 56). Functional and pathway
   enrichment analysis demonstrated that upregulated differentially
   expressed genes (DEGs) were significantly enriched in response to
   wounding, positive regulation of immune system process, leukocyte
   activation, immune response and cell activation. Downregulated DEGs
   were significantly enriched in oxidation reduction, monovalent
   inorganic cation transport, ion transport, excretion and anion
   transport. In the PPI network, top 10 hub genes were identified (TOP2A,
   MYC, ALB, CDK1, VEGFA, MMP9, PTPRC, CASR, EGFR and PTGS2). In
   co-expression network, 6 ccRCC-related modules were identified. They
   were associated with immune response, metabolic process, cell cycle
   regulation, angiogenesis and ion transport. In conclusion, our study
   illustrated the hub genes and pathways involved in the progress of
   ccRCC, and further molecular biological experiments are needed to
   confirm the function of the candidate biomarkers in human ccRCC.

   Keywords: clear cell renal cell carcinoma (ccRCC), differentially
   expressed genes (DEGs), biomarker, weighted gene co-expression network
   analysis (WGCNA), protein-protein interaction (PPI)

Introduction

   Renal cell carcinoma (RCC), which makes up approximately 3% of all
   adult malignancies and 90-95% of adult kidney neoplasm, is being
   diagnosed with increasing incidence and mortality rates worldwide
   [45]^1. Clear cell renal cell carcinoma (ccRCC) is the most common
   tumor in kidney. However, its prognostic is influenced by the
   progression covering a complex network of gene interactions. As the
   most common pathological type of renal cancer, apart from surgery, it
   is both radiotherapy and chemotherapy resistant. At present, biomarkers
   for early detection and follow-up of the disease are not available.
   Therefore, targeted therapies are the best choices of non-surgical
   treatment for ccRCC [46]^2. Many studies revealed that targeted
   therapies, such as multipotase inhibitors, anti-VEGF antibodies and
   mTOR, had been approved for clinical use [47]^3. However,
   identification of novel therapeutic targets or biomarkers for
   prognostic, diagnostic or predictive use remains a priority.

   Currently, gene expression profiles have been used to identify genes
   associated with progression of renal cancer [48]^4^-[49]^7. Meanwhile,
   some researchers also used integrated approach to screen changes in
   renal carcinogenesis [50]^8^, [51]^9. However, most studies just
   focused on the screening of differentially expressed genes and ignored
   the high degree of interconnection between genes, although genes with
   similar expression patterns may be functionally related. Weighted gene
   expression network analysis (WGCNA) is a systems biology method for
   describing the correlation patterns among genes across microarray or
   RNA Sequence data, and it is an algorithm used for finding clusters
   (modules) of highly correlated genes as well as identifying phenotype
   related modules or genes clusters [52]^10. Nowadays, there are various
   studies revealing the phenotypes-related genes via WGCNA method,
   especially in cancers [53]^11^, [54]^12. For example, Zhou et al. found
   that TOP2A could be used as the potential prognostic and progression
   biomarker for pancreatic ductal adenocarcinoma [55]^12. Wang et al.
   revealed that ASPM could cause cirrhosis and eventually led to
   hepatocellular carcinoma [56]^13. In this study, we attempt to firstly
   screen differential expressed genes, construct protein-protein
   interaction networks and a co-expression network of relationships
   between genes through a systematic biology method based on WGCNA and to
   identify key genes and pathways participating in the carcinogenesis of
   ccRCC [57]^13^, [58]^14.

Materials and Methods

Data collection

   Gene expression profile was downloaded from Gene Expression Omnibus
   (GEO) database ([59]http://www.ncbi.nlm.nih.gov/geo/). Dataset
   [60]GSE53000 performed on Affymetrix Human Gene 1.0 ST Array
   [transcript (gene) version] (Affymetrix, Santa Clara, CA, USA) was used
   to screen differential expressed genes, construct protein-protein
   interaction networks and co-expression networks and identify hub genes
   and pathways in this study. This dataset included 56 clear cell renal
   cell carcinoma samples (including 2 lymph node metastasis samples and 1
   venous thrombus samples) and 6 normal kidney samples.

Study design and data preprocessing

   The study design was performed in a flow diagram (Fig. [61]1). Raw
   expression data were calculated following the pre-processing
   procedures: RMA background correction, log2transformation, quantile
   normalization and median polish algorithm summarization using the
   “affy” R package. Probes were annotated by the Affymetrix annotation
   files. Microarray quality was assessed by sample clustering according
   to the distance between different samples in Pearson's correlation
   matrices. No samples were removed from subsequent analysis in the test
   dataset (Fig. [62]2A).

Figure 1.

   [63]Figure 1
   [64]Open in a new tab

   Flow diagram of study. Data preparing, processing and analysis was
   shown in the flow diagram.

Figure 2.

   [65]Figure 2
   [66]Open in a new tab

   Samples clustering and identification of differentially expressed genes
   (DEGs) in ccRCC tissues. (A) Samples clustering of [67]GSE53000 to
   detect outliers. (B) The volcano plot of all DEGs.

Differentially expressed genes (DEGs) screening

   We use the “limma” R package to screen the DEGs between ccRCC samples
   and normal kidney samples. The FDR (false discovery rate) < 0.05 and
   |log[2]fold change (FC)| > 1 were chosen as the cut-off criteria (Fig.
   [68]2B). Furthermore, we chose the same condition and screened
   differentially expressed genes between ccRCC samples and metastasis
   samples.

Functional and pathway enrichment analysis

   The Database for Annotation, Visualization and Integrated Discovery
   (DAVID) ([69]http://david.abcc.ncifcrf.gov/) is an online program
   providing a comprehensive set of functional annotation tools for
   investigators to understand biological meaning behind large list of
   genes [70]^15. Enriched biological themes of DEGs, particularly GO
   terms and visualization of those on KEGG pathway maps were performed
   using DAVID database, p < 0.05 was set as the cut-off criterion.

PPI network construction

   We used Search Tool for the Retrieval of Interacting Genes (STRING)
   Database (STRING) ([71]http://www.string-db.org/) to assess
   protein-protein interaction (PPI) information [72]^16. In addition, to
   evaluate the interrelationships of DEGs, we used STRING database for
   analysis and Cytoscape software for visualization [73]^17. Confidence
   score > 0.4 was set as significant.

Hub module selection and functional analysis

   We used plug-in Molecular Complex Detection (MCODE) to select hub
   modules of PPI network in Cytoscape [74]^18. Meanwhile, degree = 5,
   node score = 0.2, k-core = 2, and max. depth = 100 were used as cut-off
   criteria. Then we uploaded the genes in hub module to DAVID database to
   perform functional analysis.

Co-expression network construction and module functional analysis

   Firstly, expression data profile of DEGs was tested to check if they
   were the good samples and good genes. Then, we used the “WGCNA” package
   in R to construct co-expression network for the DEGs [75]^19^, [76]^20.
   At first, the Pearson's correlation matrices were both performed for
   all pair-wise genes. Then, a weighted adjacency matrix was constructed
   using a power function a[mn] = |c[mn]|^β (c[mn] = Pearson's correlation
   between gene m and gene n; a[mn] = adjacency between gene m and gene
   n). β was a soft-thresholding parameter that could emphasize strong
   correlations between genes and penalize weak correlations. Next, the
   adjacency was transformed into topological overlap matrix (TOM), which
   could measure the network connectivity of a gene defined as the sum of
   its adjacency with all other genes for network generation. To classify
   genes with similar expression profiles into gene modules, average
   linkage hierarchical clustering was conducted according to the
   TOM-based dissimilarity measure with a minimum size (gene group) of 50
   for the genes dendrogram. To further analyze the module, we calculated
   the dissimilarity of module eigengenes, chose a cut line for module
   dendrogram and merged some modules, after which we performed GO
   enrichment analysis and KEGG pathway enrichment analysis on gene
   modules to characterize modules related to ccRCC.

Hub genes and pathway validation

   Here, we chose genes with the most connectivities as hub genes and used
   TCGA KIRC data to perform validation using GEPIA database [77]^21.
   Common pathways in all modules were chosen to perform validation on
   both TCGA data and patients tissues.

Preparation for human ccRCC samples

   The ccRCC and paracancerous tissues samples were collected from
   patients after surgery at Zhongnan Hospital of Wuhan University. The
   histology diagnosis was confirmed by two pathologists independently.
   The ccRCC and paracancerous tissues were immediately frozen and stored
   in liquid nitrogen or fixed in 4 % PFA after collection. The study
   using ccRCC and paracancerous tissue samples for total RNA isolation
   and qRT-PCR analysis was approved by the Ethics Committee at Zhongnan
   Hospital of Wuhan University (approval number: 2015029). Informed
   consent was obtained from all subjects.

Total RNA isolation

   Total RNA from ccRCC tissues were isolated using RNeasy Mini Kit (Cat.
   #74101, Qiagen, Germany) according to the manufacturer's instruction.
   DNase I digestion (Cat. #79254, Qiagen, Germany) was used in each RNA
   preparation to remove genomic DNA. After that, total RNA quantity was
   measured using NanoPhotometer (Cat. #N60, Implen, Germany).

Quantitative real time PCR (qRT-PCR)

   The cDNA was synthesized using 1 µg of total RNA isolated from patients
   tissues by ReverTra Ace qPCR RT Kit (Toyobo, China) and qRT-PCR was
   performed using 400 ng cDNA per 25 μl reaction. Each reaction was
   conducted with iQTM SYBR^® Green Supermix (Bio-Rad, China) using 400 or
   500 ng of cDNA in a final volume of 25 µl. Primers sequences and
   annealing temperature were summarized in Supplementary Tab. S1.

Statistical analyses

   All analyses were performed three times and represent data from three
   individual experiments. Two-tailed Student's t-test was used for
   significance of differences between subgroups. Statistical analyses
   were performed with SPSS 16.0. Statistical significance was set at
   probability values of p < 0.05.

Results

Identification of DEGs in ccRCC tissues

   The gene expression profiling of [78]GSE53000 including 56 ccRCC
   tissues and 6 normal kidney tissues were analysed. Using “limma”
   package of R software, selecting p < 0.05 and |log[2]fold change (FC)|
   > 1 as the cut-off criteria, 1175 DEGs were identified, of which, 533
   were upregulated and 642 were downregulated (Supplementary Tab. S2).
   The volcano plot of all DEGs is shown in Fig. [79]2B. Meanwhile, we
   screened 11 genes differentially expressed in metastasis samples
   compared with primary ccRCC (Supplementary Tab. S3).

Functional enrichment analysis of DEGs

   To obtain further insight into the function of DEGs of ccRCC, the up-
   and downregulated DEGs were respectively uploaded to the DAVID
   database. GO analysis results showed as to biological process (BP), the
   upregulated DEGs significantly enriched in response to wounding,
   positive regulation of immune system process, leukocyte activation,
   immune response and cell activation, and the downregulated DEGs
   significantly enriched in oxidation reduction, monovalent inorganic
   cation transport, ion transport, excretion and anion transport. For
   molecular function (MF), the upregulated DEGs significantly enriched in
   protein homodimerization activity, protein dimerizetion activity,
   phospholipid binding, kinase binding and identical protein binding, and
   downregulated DEGs significantly enriched in sodium ion binding,
   cofactor binding, alkali metal ion binding, coenzyme binding and anion
   transmembrane transporter activity. About cellular component, the
   upregulated DEGs significantly enriched in intrinsic to plasma
   membrane, integral to plasma membrane, cell surface, plasma membrane
   and external side of plasma membrane, and downregulated DEGs
   significantly enriched in plasma membrane part, apical plasma membrane,
   apical part of cell, plasma membrane and basolateral plasma membrane
   (Fig. [80]3A-B). These significantly enriched GO terms could help us a
   lot to further understand the role of DEGs played in ccRCC development
   and progression.

Figure 3.

   [81]Figure 3
   [82]Open in a new tab

   GO analysis of DEGs. (A) upregulated DEGs with fold change > 2. (B)
   downregulated DEGs with fold change < -2. BP: biological process, CC:
   cellular component, MF: molecular function.

KEGG pathway enrichment analysis of DEGs

   KEGG pathway analysis was performed for all DEGs, and we found that
   complement and coagulation cascades, primary immunodeficiency, cell
   adhesion molecules (CAMs), glycolysis / gluconeogenesis and glycine,
   serine and threonine metabolism were most significantly enriched (Fig.
   [83]4).

Figure 4.

   [84]Figure 4
   [85]Open in a new tab

   KEGG enrichment analysis of all DEGs with |fold change| > 2. All
   differentially expressed genes (DEGs) were analysed by KEGG enrichment.
   Fold change > 2 was set as cut-off value.

PPI network construction

   Based on the string profile obtained from STRING analysis tool
   (Supplementary Tab. S3), the PPI network of DEGs consisted of 949 nodes
   and 4774 edges, including 367 upregulated genes and 414 downregulated
   genes. We considered the top 10 DEGs with high degree of connectivity
   as the hub genes of ccRCC (TOP2A, MYC, ALB, CDK1, VEGFA, MMP9, PTPRC,
   CASR, EGFR and PTGS2), which may play a critical role in tumorogenesis
   and proliferation.

Hub module selection and validation

   Degree cut-off = 5, node score cut-off = 0.2, k-core = 2, and max.
   depth = 100 as the criterion, top 3 significant modules were selected
   by using plug-in MCODE. Gene Ontology analysis of each module was
   performed in DAVID database (Fig. [86]5).

Figure 5.

   [87]Figure 5
   [88]Open in a new tab

   Module analysis of PPI network. (A) Module rank 1. (B) GO enrichment
   analysis of module rank 1. (C) Module rank 2. (D) GO enrichment
   analysis of module rank 2. (E) Module rank 3. (F) GO enrichment
   analysis of module rank 3.

Weighted co-expression network construction and analysis

   We used “WGCNA” package in R, after the quality assessment for
   expression data matrix of [89]GSE53000, select the power of β = 6
   (scale free R^2 = 0.92) to ensure a scalefree network (Fig. [90]6A-D).
   After putting the DEGs with similar expression patterns into modules by
   average linkage clustering, a total of 7 modules were identified (Fig.
   [91]6E). Except grey module, Gene Ontology was performed for other 6
   modules so as to explore the underlying biological process correlated
   to ccRCC. In Fig. [92]7, we could find that DEGs in blue module
   significantly enriched in hexose biosynthetic process, oxidation
   reduction and monosaccharide biosynthetic process. In brown module,
   DEGs were enriched in immune response, defense response and
   inflammatory response. In green module, DEGs were enriched in M phase,
   nuclear division and mitosis. In red module, DEGs were enriched in
   blood vessel morphogenesis, blood vessel development and vasculature
   development. In turquoise module, DEGs were enriched in ion transport,
   monovalent inorganic cation transport and cation transport. In yellow
   module, DEGs were enriched in immune response, T cell activation and
   lymphocyte activation. In Fig. [93]8, we found the common pathways
   related to renal carcinoma: PI3K/AKT pathway, Rap1 pathway, NF-kappa B
   pathway and insulin associated pathways.

Figure 6.

   [94]Figure 6
   [95]Open in a new tab

   Determination of soft-thresholding power in the weighted gene
   co-expression network analysis (WGCNA). (A) Analysis of the scale-free
   fit index for various soft-thresholding powers (β). (B) Analysis of the
   mean connectivity for various soft-thresholding powers. (C) Histogram
   of connectivity distribution when β = 6. (D) Checking the scale free
   topology when β = 6. (E) Dendrogram of all differentially expressed
   genes clustered based on a dissimilarity measure (1-TOM).

Figure 7.

   [96]Figure 7
   [97]Open in a new tab

   GO enrichment analysis of 6 genes modules. (A) blue module, (B) brown
   module, (C) green module, (D) red module, (E) turquoise module and (F)
   yellow module.

Figure 8.

   [98]Figure 8
   [99]Open in a new tab

   KEGG pathway enrichment analysis of 6 genes modules. (A) blue module,
   (B) brown module, (C) green module, (D) red module, (E) turquoise
   module and (F) yellow module.

Hub genes and pathway validation

   In Supplementary Fig. [100]S1-2, we performed validation of hub genes
   using TCGA KIRC data and microarray data ([101]GSE53000). We found that
   CASR and PTGS2 were tumor suppressors in both microarray and RNA-Seq
   data; CDK1, EGFR, MMP9, MYC, PTPRC, TOP2A and VEGFA were oncogenes in
   the common expression data. To perform the pathway validation, we chose
   the common pathway related to renal carcinogenesis and validated the
   key molecules using qRT-PCR. We found PIK3CD, PIK3CG and NFKB2 were
   upregulated in tumor samples and PIK3CB, AKT1, AKT2, AKT3, RAP1A and
   RAP1B were downregulated in tumor samples (Fig. [102]9). To further
   validate the pathway, we use TCGA KIRC data to perform the validation
   as well (Fig. [103]10). And we found that there were 7 genes
   deregulated in common validation sets (PIK3CB, PIK3CD, PIK3CG, AKT2,
   AKT3, NFKB2, RAP1A, Fig. [104]11).

Figure 9.

   [105]Figure 9
   [106]Open in a new tab

   Pathway validation using qRT-PCR analysis. (A) PIK3CB, (B) PIK3CD, (C)
   PIK3CG, (D) AKT1, (E) AKT2, (F) AKT3, (G) NFKB2, (H) RAP1A and (I)
   RAP1B.

Figure 10.

   [107]Figure 10
   [108]Open in a new tab

   Pathway validation using TCGA KIRC data. (A) PIK3CB, (B) PIK3CD, (C)
   PIK3CG, (D) AKT1, (E) AKT2, (F) AKT3, (G) NFKB2, (H) RAP1A and (I)
   RAP1B (* p < 0.05; ** p < 0.01; *** p <0.001).

Figure 11.

   [109]Figure 11
   [110]Open in a new tab

   Venn plot of common deregulated genes. Blue: downregulated genes; red:
   upregulated genes.

Discussion

   Clear cell renal cell carcinoma is biologically heterogeneous and has
   variable clinical courses, therefore, it is essential to understand the
   molecular mechanism for better diagnosis and treatment of ccRCC. In
   this study, we investigated the gene expression profile of
   [111]GSE53000, including 56 clear cell renal cell carcinoma samples
   (including 2 lymph node metastasis samples and 1 venous thrombus
   samples) and 6 normal kidney samples to explore the molecular mechanism
   of ccRCC and find some biomarkers, which might be helpful therapeutic
   targets by using bioinformatics analysis.

   In this study, results showed that expressions of total 1175 genes were
   altered between normal kidney tissues and ccRCC tissues at FDR < 0.05.
   Among the 1175 DEGs, 533 were upregulated and 642 were downregulated.
   PPI network analysis and WGCNA analysis were performed to identify
   protein-protein interactions and gene co-expression modules related
   with the clinical features of ccRCC. In addition, functional and
   pathway analysis were also performed to find ccRCC-related biological
   process and pathways.

   According to the GO analysis of DEGs, we found that upregulated DEGs
   were significantly enriched in immune response, cell activation,
   leukocyte activation and positive regulation of immune system process;
   downregulated DEGs were significantly enriched in ion transport,
   monovalent inorganic cation transport, cation transport and anion
   transport. Giraldo NA et al. investigated that the infiltration and the
   localization of DC, and the expression of immune checkpoints (PD-1,
   LAG-3, PD-L1, and PD-L2) in relation with prognosis in the tumor
   microenvironment modulated the clinical impact of CD8(+) T cells in
   ccRCC [112]^22. In addition, Balan M et al. also reported that c-Met
   can promote increased survival of renal cancer cells through the
   regulation of HO-1 and PD-L1 [113]^23. Ciarimboli G et al. found that
   OCT2 played a decisive role in the renal secretion of creatinine and
   the process could be inhibited by OCT2 substrates [114]^24. Pochini L
   et al. also discovered that OCTN cation transporters were associated
   with several pathologies [115]^25. Meanwhile, KEGG pathway analysis
   revealed those glycolysis/gluconeogenesis and glycine, serine and
   threonine metabolisms were significantly enriched. Many studies
   illustrated that carcinogenesis could have very closely correlation
   with metabolism [116]^26^-[117]^28. As to renal carcinoma, significant
   progress had been made to understand the metabolic derangements
   present, which had been derived through translational, in vitro, and in
   vivo studies. So far, von Hippel-Lindau (VHL) loss was the
   well-characterized metabolic features linked to renal cancer [118]^29.
   And several metabolic pathways were altered, including glycolysis and
   oxidative phosphorylation due to VHL loss and the influence caused by
   increasing expression of hypoxia-inducible factor [119]^30^-[120]^35.

   Furthermore, protein-protein interaction network analysis demonstrated
   that TOP2A, MYC, ALB, CDK1, VEGFA, MMP9, PTPRC, CASR, EGFR and PTGS2
   had the highest degree of connectivity among DEGs. TOP2A and CDK1
   played an important role in regulation of cell cycle, which might
   modulate the tumor proliferation [121]^36^-[122]^38. MYC, as an
   oncogene, encoded a transcription factor, triggered selective gene
   expression amplification to promote cell growth and proliferation
   [123]^39. Shroff EH et al. demonstrated that MYC overexpression causes
   RCC and pointed to the inhibition of glutamine metabolism [124]^40. ALB
   was reported to be an independent prognostic factor for patients with
   mRCC treated with angiogenesis-targeted therapy [125]^41. VEGFA, as a
   member of PDGF/VEGF growth factor family, induced proliferation and
   migration of vascular endothelial cells, and was essential for both
   physiological and pathological angiogenesis, which played a vital role
   in renal carcinogenesis [126]^42^, [127]^43. MMP9 (matrix
   metalloproteinase-9) was reported to have a strong correlation with
   tumor invasion and migration [128]^44^-[129]^46. PTPRC, as an essential
   regulator of T- and B-cell antigen receptor signaling, might influence
   tumor proliferation via immune regulation [130]^47. CASR was a robust
   promoter of differentiation in colonic epithelial cells and functions
   as a tumor suppressor in colon cancer [131]^48. EGFR, upregulated in
   ccRCC, was a receptor tyrosine kinase involved in many important
   aspects of cell biology that are related to tumorigenesis
   [132]^49^-[133]^51. PTGS2 was reported to be an prognostic biomarker
   for poor clinical outcome of upper tract urothelial cancer [134]^52.

   WGCNA analysis found six modules with highly relevant expression
   pattern. Then Gene Ontology enrichment analysis was performed to
   explore the biological process of each module. Blue module including
   244 DEGs was involved in glycometabolic process such as hexose
   biosynthetic process, monosaccharide biosynthetic process and hexose
   metabolic process. Many studies had mentioned the close relationship
   between metabolism and renal carcinoma [135]^53^, [136]^54. Brown
   module including 199 DEGs was associated with immune response.
   Meanwhile, yellow module including 182 DEGs also participated in immune
   response, which revealed the correlation between renal cancer and
   immune regulation. 125 DEGs in green module were enriched in cell cycle
   regulation, playing a critical role in tumor proliferation. Red module
   including 71 DEGs was closely related to angiogenesis and vascular
   repair, relating to the prognosis of ccRCC. Turquoise module including
   326 DEGs has a strong correlation to ion transport, which might
   participated in the tumor migration and invasion.

   In conclusion, by using a series of bioinformatics analysis, we have
   illustrated the hub genes and pathways which may be involved in the
   progress of ccRCC, based on differentially expressed genes between
   ccRCC samples and normal kidneys. However, further molecular biological
   experiments are needed to confirm the function of the candidate
   biomarkers in ccRCC.

Supplementary Material

   Supplementary figures and tables.
   [137]Click here for additional data file.^ (9.8MB, pdf)

Acknowledgments