Abstract
Clear cell renal cell carcinoma (ccRCC) is the most prevalent subtype
of kidney cancer. Although there is increasing evidence linking ccRCC
to genetic alterations, the exact molecular mechanism behind this
relationship is not yet completely known to the researchers. Though
drug therapies are the best choice after the metastasis, unfortunately,
the majority of the patients progressively develop resistance against
the therapeutic drugs after receiving it for almost 2 years. In this
case, multi-targeted different variants of therapeutic drugs are
essential for effective treatment against ccRCC. To understand
molecular mechanisms of ccRCC development and progression, and explore
multi-targeted different variants of therapeutic drugs, it is essential
to identify ccRCC-causing key genes (KGs). In order to obtain
ccRCC-causing KGs, at first, we detected 133 common differentially
expressed genes (cDEGs) between ccRCC and control samples based on nine
(9) microarray gene-expression datasets with NCBI accession IDs
[38]GSE16441, [39]GSE53757, [40]GSE66270, [41]GSE66272, [42]GSE16449,
[43]GSE76351, [44]GSE66271, [45]GSE71963, and [46]GSE36895. Then, we
filtered these cDEGs through survival analysis with the independent
TCGA and GTEx database and obtained 54 scDEGs having significant
prognostic power. Next, we used protein-protein interaction (PPI)
network analysis with the reduced set of 54 scDEGs to identify
ccRCC-causing top-ranked eight KGs (PLG, ENO2, ALDOB, UMOD, ALDH6A1,
SLC12A3, SLC12A1, SERPINA5). The pan-cancer analysis with KGs based on
TCGA database showed the significant association with different
subtypes of kidney cancers including ccRCC. The gene regulatory network
(GRN) analysis revealed some crucial transcriptional and
post-transcriptional regulators of KGs. The scDEGs-set enrichment
analysis significantly identified some crucial ccRCC-causing molecular
functions, biological processes, cellular components, and pathways that
are linked to the KGs. The results of DNA methylation study indicated
the hypomethylation and hyper-methylation of KGs, which may lead the
development of ccRCC. The immune infiltrating cell types (CD8+ T and
CD4+ T cell, B cell, neutrophil, dendritic cell and macrophage)
analysis with KGs indicated their significant association in ccRCC,
where KGs are positively correlated with CD4+ T cells, but negatively
correlated with the majority of other immune cells, which is supported
by the literature review also. Then we detected 10 repurposable drug
molecules (Irinotecan, Imatinib, Telaglenastat, Olaparib, RG-4733,
Sorafenib, Sitravatinib, Cabozantinib, Abemaciclib, and Dovitinib.) by
molecular docking with KGs-mediated receptor proteins. Their ADME/T
analysis and cross-validation with the independent receptors, also
supported their potent against ccRCC. Therefore, these outputs might be
useful inputs/resources to the wet-lab researchers and clinicians for
considering an effective treatment strategy against ccRCC.
1. Introduction
Renal carcinoma, a common urinary tract tumor, accounts for 2% to 3% of
adult cancers [[47]1]. Clear cell renal cell carcinomas (ccRCC) account
for 70–85% of all cases of renal cell carcinomas (RCC), which make up
90% of cases overall [[48]2]. There are four pathological phases of
ccRCC, ranging from grade I to grade IV, where the survival rate of 20%
grade IV patients is more than 5-years [[49]3]. With RCC patients
responding poorly to traditional radiation and chemotherapy, surgical
excision remains the most effective therapeutic option [[50]4].
However, drug therapies would be the best choice after the metastasis
[[51]5]. Unfortunately, the majority of the patients progressively
develop resistance against the therapeutic drugs after receiving it for
almost 2 years [[52]6]. Moreover, the combination of ICIs (immune
checkpoint inhibitors) with anti-VEGF (vascular endothelial growth
factor) drugs for the treatment of renal cancer may increase the risk
of cardiovascular events, according to a meta-analysis. This emphasizes
the need for more research in this therapeutic setting [[53]7]. In this
case, multi-targeted different variants of therapeutic drugs may be
essential for effective treatment against ccRCC to increase the
survival length of patients [[54]8,[55]9]. The therapy of RCC can be
improved by integrating Artificial Intelligence (AI), sophisticated
imaging, and genomics [[56]10,[57]11]. To understand molecular
mechanisms of ccRCC development and progression, and explore
multi-targeted different variants of therapeutic drugs, it is essential
to identify ccRCC-causing key genes (molecular signatures).
Nevertheless, it is very difficult to explore top-ranked KGs and
candidate therapeutic agents from huge number of alternatives through
the wet-lab experiments only, since wet-lab experiments are time
consuming, laborious and costly. To overcome this issues,
bioinformatics and system biology approaches playing the significant
roles [[58]12–[59]17]. There are several studies that explored
ccRCC-causing key genes (KGs) [[60]18–[61]36]. However, their KGs-sets
based on single transcriptomics dataset were not so consistent
[[62]18,[63]19,[64]29]. It may have occurred as a result of
environmental and geographical differences in the datasets. To identify
KGs-set that is more reliable and efficient, some researchers analyzed
multiple transcriptomics datasets [[65]20,[66]21,[67]30–[68]36]. As for
example, some individual studies considered two datasets
[[69]30,[70]34,[71]35], three datasets [[72]20–[73]24,[74]36] and at
most four datasets [[75]25–[76]28] for exploring KGs, respectively. Few
of them, nonetheless, investigated the KGs-set guided drug compounds
they had suggested for the therapy of ccRCC [[77]25,[78]28,[79]30].
None of them have yet examined the efficacy of their recommended
therapeutic compounds against the separate KGs that are proposed by
other studies to cause ccRCC. Therefore, in this study, we considered
nine transcriptomics datasets to explore ccRCC causing KGs more
accurately for diagnosis, prognosis and therapies. To find
ccRCC-causing KGs based on nine datasets, first, for each dataset, we
must determine which genes are differentially expressed (DEGs) between
the ccRCC and control samples. Then we have to find the common DEGs
(cDEGs)-set from the nine DEGs-sets. To make the cDEGs-set more
reliable and effective, we have to screening cDEGs-set through the
survival analysis based on independent TCGA database [[80]37]. Then the
reduced cDEGs-set will be used to find the ccRCC-causing key genes
(KGs) through their protein-protein interaction (PPI) network [[81]38]
analysis. To verify these KGs as the ccRCC-causing genes through other
independent database, we have to performed their pan-cancer [[82]39]
analysis. To investigate the pathogenetic processes of KGs, we have to
performed GO functional [[83]40] and KEGG [[84]41] pathway enrichment
analysis, DNA methylation analysis[[85]42] and gene regulatory network
analysis (GRN) [[86]43]. In order to explore, KGs-guided candidate drug
molecules, we have to consider molecular docking analysis. We must take
into consideration molecular docking studies once again in order to
confirm the effectiveness of our suggested medication compounds against
the separate ccRCC causing KGs that have previously been proposed by
other researchers. [87]Fig 1 displays the study’s procedure.
Fig 1. The study’s workflow.
[88]Fig 1
[89]Open in a new tab
2. Methodology
2.1. Data source and descriptions
We took into account both the raw data and the meta-data related to
ccRCC in order to achieve the goal of this research.
2.1.1. Gene expression profile collection
Nine (9) gene expression profile datasets that are publicly available
and have accession numbers [90]GSE16441 [[91]44], [92]GSE53757
[[93]45], [94]GSE66270 [[95]46], [96]GSE66272 [[97]46], [98]GSE16449
[[99]47], [100]GSE76351 [[101]48], [102]GSE66271 [[103]46],
[104]GSE71963 [[105]49] and [106]GSE36895 [[107]50] were downloaded
from the Gene Expression Omnibus (GEO) databae plugin in the National
Center for Biotechnology Information (NCBI) database
([108]https://www.ncbi.nlm.nih.gov/geo/). The details of these 9 gene
expression profile datasets are given in S1 Table in [109]S1 File.
2.1.2. Meta-data collection
A total of 110 ccRCC-associated meta-drug agents (S2 Table in [110]S1
File) were collected from the published articles
[[111]30,[112]51–[113]59], online database [[114]60] and other sources
[[115]61,[116]62] to select the potential drug agents. Additionally, we
chose the top 30 publicly accessible KGs that cause ccRCC (S3 Table in
[117]S1 File) by the literature review
[[118]8,[119]24,[120]30,[121]33,[122]36,[123]63–[124]105] to verify the
performance of the proposed candidate drug agents by molecular docking
analysis against the other independent receptors.
2.2. Identification of DEGs from microarray gene expression profiles
The statistical linear models for microarray (LIMMA) data analysis
[[125]106] is a popular approach for identification of differentially
expressed genes (DEGs) between case and control samples
[[126]107,[127]108].
The LIMMA model is written as,
[MATH:
zg=Xαg
+ϵg
:MATH]
(1)
where z[g] = (z[g1], z[g2],…, z[gn])^/ is the responses (expressions)
m-vector of gth gene with n = n[1]+n[2] samples (g = 1, 2,…, G), X is
an n×2 design matrix, α[g] = (α[g1], α[g2]) ^/ is a 2-vector (2 γ[g] =
(α[g1]−α[g2]) = 0 is true. Otherwise, it is said to be a differentially
expressed gene (DEG). To test the significance of H[0], the test
statistic for LIMMA approach is defined as
[MATH: t˜g=γ^g+γg
S˜gδ
g :MATH]
(2)
which follows t-distribution under H[0]. Adjusted P-values based on the
moderated t-statistic and the average of log[2] fold-change (aLog[2]FC)
values of treatment group with respect to the control group were used
to select DEGs or EEGs as follows,
[MATH: DEGg=
mo>{DEG(Up),ifadj.P.value<0.05andaLog2FCg>
mo>+1.0DEG(Down),ifadj.P.value<0.05andaLog2FCg<
mo>−1.0 :MATH]
Where,
[MATH: aLog2FCg=
mo>{1n1∑i
n1log2(zgiT)−1n2∑j
n2log2(zgjC),ifn1
mrow>≠n21n∑i
nl
og2(zgi
mrow>Tz
gjC),ifn1
mrow>=n2=n :MATH]
Here
[MATH:
zgi
T :MATH]
and
[MATH:
zgj
C :MATH]
are the responses/expressions for the gth gene with the ith case and
jth control samples, respectively. We applied the LIMMA R-package
[[128]109] for computing the P-values and aLog2FC values to separate
the DEGs/EEGs from each of nine datasets, significantly. Then we
selected the common DEGs (cDEGs) from 9 DEGs-sets as ccRCC causing
genes.
2.3. Screening of cDEGs through survival analysis
To select the cDEGs having significant prognostic power, we performed
survival analysis by using the GEPIA2 [[129]110] web-tool linked with
TCGA and GTEx databases. The GEPIA web-tool performed survival analysis
by using Next-Generation Sequencing (NGS) based RNA-seq profile
datasets from both TCGA and GTEx databases. We constructed two
Kaplan-Meier survival probability curves for each of cDEGs based on its
low and high expression groups. The significant difference between two
curves was assessed by using log-rank test with P-value <0.1. The
significant difference between two survival curves indicates the strong
prognostic performance of the gene. The reduced set of cDEGs-set based
on survival analysis was denoted as scDEGs for convenience of
presentation.
2.4. Selection of key genes (KGs) from scDEGs
Proteins interact with one another within cells to carry out their
functions, and the protein-protein interaction (PPI) network’s
information helps us understand how proteins operate better [[130]111].
A PPI network of DEGs was created using the web database and analysis
tool "Search Tool for the Retrieval of Interacting Genes" (STRING v11)
[[131]38]. The STRING web tool produces information based on
anticipated and experimental interactions as well as 3D structures,
additional data, and a confidence score [[132]112]. To improve the
network’s visual representation, the acquired PPIs are analyzed using
Cytoscape [[133]113]. Eleven topological measures rendered available by
the CytoHubba plugin for Cytoscape can be utilized to rank the PPI
network nodes utilized for KG identification [[134]114]. Degree, MNC,
Maximal Clique Centrality (MCC), Bottleneck, Closeness, Betweenness,
and Edge Percolated Component (EPC) were the seven topological measures
that were taken into account in this research. Based on several
topological evaluations of the PPI-network, the greatest percentage of
interactions was used to pick the highest-ranked KGs.
2.5. Functional enrichment analysis of scDEGs highlighting KGs
Functional enrichment analysis of a gene-set identifies biological
functions or pathways that are overrepresented among a set of genes
compared to a background. This process reveals significant associations
between genes and biological terms to understand the roles of those
genes in a specific biological term (function/processes/pathways). The
scDEGs-set enrichment analysis with three (3) Gene Ontology (GO) terms
Biological Process (BP), Molecular Function (MF), and Cellular
Component (CC), were conducted in order to comprehend cellular
function, the molecular activity, and location inside the cell where
scDEGs including KGs perform their tasks [[135]40]. A typical tool for
understanding metabolic pathways is the "Kyoto Encyclopedia of Genes
and Genomes" (KEGG) pathway, which utilizes extensively gene annotation
[[136]41]. The DAVID web server was used to analyze GO terms and KEGG
pathways to understand the biological importance of scDEGs, with a
particular focus on key genes [[137]115]. In this instance, the
threshold was set to the adjusted P-value < 0.05 for selecting the
significantly enriched functions by the Fisher exact test.
2.6. Pan-cancer analysis of KGs
Pan-cancer analysis was performed to verify the association of KGs with
ccRCC including different other cancers based on independent database.
Using the TCGA database [[138]37] and the TIMER2.0 (Tumor Immune
Estimation Resource) web-tool [[139]39], we conducted a pan-cancer
study for each of KGs to look into its potential function in
pan-cancer. The significant difference between two groups for each
cancer was examined by Wilcoxon signed rank test with the cutoff at
P-value <0.1.
2.7. Association of KGs with different immune infiltration levels in ccRCC
Immune infiltration analysis with KGs investigates how their expression
levels are statistically associated with different immune cell types
and their abundance within a tissue or tumor. This analysis helps to
understand how KGs and immune cell infiltration stimulate each other,
providing insights into the interactions between gene expression and
immune responses in the disease context. The Tumor Immune Estimation
Resource (TIMER 2.0) [[140]39] is a comprehensive tool that estimates
the quantity of tumor-infiltrating immune cell types from TCGA data. We
utilized TIMER’s online tools to investigate the immune infiltration
levels of CD8+ T cells, CD4+ T cells, neutrophils, B cells,
macrophages, and dendritic cells with KGs in ccRCC.
2.8. DNA methylation of KGs
DNA methylation of KGs stimulates their expressions and regulations by
adding methyl groups to their DNA. It helps to identify significant
changes in gene activity associated with diseases, providing insights
into gene regulation mechanisms and contributing to understanding
disease processes [[141]116]. MethSurv web-tools with TCGA-KIRC
methylation data were used to investigate DNA methylation, a complex
epigenetic process that controls gene expression in both normal and
malignant cells [[142]42]. DNA methylation values (ranging from 0 to 1)
were represented by β values, which were computed as M/(M + U + 100)
for every CpG site. The intensities of methylation and unmethylation
are represented by M and U, respectively. We classified the methylation
levels into two groups based on the difference in methylation β value
between the cut-off point and higher (methylation β value above) in
order to assess the impact on patient survival. One can use data
quantiles or the means to get the grouping cut-off point.
2.9. Detection of top regulators of KGs
Transcription factors (TFs) and microRNAs (miRNAs) are considered as
the transcriptional and post-transcriptional regulators of protein
coding genes. The regulatory network analysis was conducted to explore
TFs-proteins and miRNAs as the transcriptional and post-transcriptional
regulators of KGs. The network of TFs and KGs links were examined by
using the JASPAR database [[143]117] in order to pinpoint the primary
TFs associated with KGs. By using the TarBase database [[144]118] to
examine miRNA-KG connections, we identified key miRNAs that impact KGs
at the post-transcriptional level. NetworkAnalyst [[145]119] was used
to replicate these relationships. The post-transcriptional regulators
of KGs were selected from the top-ranked miRNAs. The Cytoscape tool
[[146]113] was used to show the networks of their interactions.
2.10. KGs-guided drug repurposing
There are two main in-silico methods for drug discovery: de-novo
design, which is time-consuming and costly, and drug repurposing (DR),
which utilizes existing, approved drugs for new diseases. Both methods
use molecular docking to evaluate potential drug candidates, by
computing their binding affinities and interactions with target
proteins. Molecular docking analysis is a popular approach for
exploring potential drug-molecules from huge number of alternatives to
inhibit disease-causing genes [[147]120,[148]121]. Therefore, in this
study, we also used this analysis to investigate KGs-guided potential
therapeutic molecules/ligands for the treatment of ccRCC. For the
purposes of docking study, we considered KGs-mediated receptor proteins
and the regulatory TFs proteins that accompany them. The Protein Data
Bank (PDB) [[149]122], AlphaFold Protein Structure Database [[150]123],
and SWISS-MODEL [[151]124] were the sources from which the
three-dimensional (3D) receptor structures were obtained. The 3D
structures of possible medicinal agents were obtained by searching the
PubChem database [[152]125]. The 3D architectures of protein
interactions were visualized using the Discovery Studio Visualizer
[[153]126]. AutoDock tools were used to pre-process the receptor
proteins, which involved removing water molecules and adding charges
[[154]127]. The Avogadro [[155]128] was used to reduce the energy of
the drugs agents, and AutoDock tools [[156]127] were used to preprocess
them. After that, AutoDock Vina was used to perform molecular docking
between receptors and ligands in order to determine their binding
affinity scores (kcal/mol) [[157]129]. Let B[mn] denote the binding
affinity score between m^th receptors (m = 1, 2, …, q) and n^th
ligands/agents (n = 1, 2,…, p). Receptors were sorted based on the
descending order of their average scores
[MATH:
(1q
mfrac>∑n=1
pBmn,m=1,2…q
) :MATH]
, while ligands/agents were ranked by the descending order of their
average scores
[MATH:
(1p
mfrac>∑m=1
qBmn,n=1,2<
mo>,…,p) :MATH]
. This process was used to select the top-ranked ligands/agents as
potential drug candidates.
2.11. ADME/T analysis of top-ranked drug molecules
ADME/T analysis assesses a drug candidate’s absorption, distribution,
metabolism, excretion, and toxicity to predict its safety and
effectiveness. In drug repurposing, a compound may show strong binding
in docking studies with a new target, suggesting potential for a new
use. However, if it fails ADMET criteria—like poor absorption, rapid
metabolism, or high toxicity—it is unlikely to succeed in this new
role. ADMET analysis helps filter out such candidates, ensuring that
only those with favorable pharmacokinetics and safety profiles move
forward in the repurposing process. We analyzed the drug-like
properties and ADME/T (absorption, distribution, metabolism, excretion,
and toxicity) profiles of the top ten ranked drug compounds to better
understand their structural features and chemical descriptors. The
SCFBio web application [[158]130] was used to evaluate compliance with
Lipinski’s rule. ADME/T parameters were then predicted using the online
databases SwissADME [[159]131] and pkCSM [[160]132], utilizing the
optimal structures of the drug compounds in SMILES format for the
calculations.
3. Results
3.1. Identification of DEGs by LIMMA approach
We have identified the genes that are differentially expressed (DEGs)
in each of the nine datasets comparing ccRCC and control samples by
applying the statistical LIMMA technique. Adj.P.value ≤ 0.05 and
log[2]FC > 1 for upregulated DEGs and adj.P.value ≤ 0.05 and logFC < -1
for downregulated DEGs were the cutoff values, as stated in section
2.2. From nine datasets with NCBI accession IDs [161]GSE16441,
[162]GSE53757, [163]GSE66270, [164]GSE66272, [165]GSE8449,
[166]GSE76351, [167]GSE66271, [168]GSE71963 and [169]GSE36895, we
identified 413, 1602, 2430, 2929, 1448, 899, 2748, 314, 1160
upregulated DEGs and 1168, 352, 1294, 1584, 1378, 694, 2473, 496, 1980
downregulated DEGs, respectively. The total number of cDEGs was found
133; of these, 26 were downregulated and the remaining 107 were
upregulated (S4 Table in [170]S1 File).
3.2 Screening cDEGs through survival analysis
We selected 54 cDEGs out of 133 based on the significant difference
(P-value ≤0.1) between two survival probability curves with the low and
high expression groups in the independent TCGA and GTEx databases (S5
Table in [171]S1 File). The reduced cDEGs-set based on survival
analysis was denoted as scDEGs for convenience of presentation.
3.3. Identification of Key Genes (KGs) from scDEGs
A PPI network with 113 edges and 54 nodes for 54 scDEGs was built.
Using seven topological measures (Degree, MNC, MCC, EPC, Closeness,
Betweenness and BottleNeck) with the PPI network, we detected the top
eight KGs that are PLG, ENO2, ALDOB, UMOD, ALDH6A1, SLC12A3, SLC12A1,
and SERPINA5 ([172]Fig 2 and S6 Table in [173]S1 File).
Fig 2.
[174]Fig 2
[175]Open in a new tab
(a) Identification of KGs by the PPI network analysis of scDEGs, Where
Hexagonal nodes in blue colour represent the KGs, (b) Survival
probability curves with the low and high expression groups with KGs.
3.4. Functional analysis of scDEGs highlighting KGs
To investigate the KG-involved pathogenetic processes of ccRCC, we
conducted GO and KEGG pathway enrichment analysis on 133 scDEGs
([176]Table 1). In order to highlight the KGs, the top five keywords
from each of the KEGG pathways, cellular components, molecular
functions and biological processes (CC, MF and BP) were considered in
this work. potassium ion homeostasis, glycolytic process, chloride ion
homeostasis, response to xenobiotic stimulus, chloride transmembrane
transport were the top 5 highly enriched BPs; CCs were apical plasma
membrane, extracellular exosome, mitochondrial matrix, extracellular
region, plasma membrane; MFs were heparin binding
sodium:potassium:chloride symporter activity, cation: chloride
symporter activity, potassium: chloride symporter activity, identical
protein binding, and KEGG pathways were Metabolic pathways, Complement
and coagulation cascades, Propanoate metabolism and Carbon metabolism.
Table 1. Significantly enhanced KEGG pathways and GO function with cDEGs
through the involvement of KGs linked to the pathogenetic processes of ccRCC.
Biological processes (BPs)
GO ID GO function No of scDEGs P.Value Associated KGs
GO:0055075 potassium ion homeostasis 10 0.001 SLC12A 3, SLC12A1, UMOD
GO:0006096 glycolytic process 9 0.010 ALDOB, ENO2, ALDH6A1
GO:0055064 chloride ion homeostasis 5 0.028 SLC12A3, ALDH6A1
GO:0009410 response to xenobiotic stimulus 9 0.029 UMOD, ENO2
GO:1902476 chloride transmembrane transport 13 0.030 SLC12A1, SERPINA5
Cellular Components (CC)
GO ID GO function No of scDEGs P.Value Associated KGs
GO:0016324 apical plasma membrane 8 4.37E-05 SLC12A3, SLC12A1, UMOD
GO:0070062 extracellular exosome 17 7.36E-05 SLC12A3, SLC12A1, UMOD,
PLG, ENO2, SERPINA5, ALDOB
GO:0005759 mitochondrial matrix 6 0.003 ALDH6A1, PLG, ENO2
GO:0005576 extracellular region 12 0.015 SERPINA5, SERPINA5, ALDOB,
GO:0005886 plasma membrane 21 0.031 SLC12A3, SLC12A1, PLG,ENO2
Molecular Functions (MFs)
GO ID GO function No of scDEGs P.Value Associated KGs
GO:0008201 heparin binding 5 0.001 SERPINA5, UMOD, PLG, ENO2
GO:0008511 sodium:potassium:chloride symporter activity 8 0.006
SLC12A3, SLC12A1, PLG, ENO2
GO:0015377 cation: chloride symporter activity 4 0.009 SLC12A3,
SLC12A1, PLG, ENO2
GO:0015379 potassium: chloride symporter activity 4 0.020 SLC12A3,
SLC12A1, ALDH6A1, ENO2
GO:0042802 identical protein binding 9 0.055 ALDOB, ENO2
KEGG Pathways
KEGG ID KEGG function No of scDEGs P.Value Associated KGs
hsa01100 Metabolic pathways 13 0.002 ENO2, ALDH6A1
hsa04610 Complement and coagulation cascades 6 0.003 PLG, SERPINA5
hsa00640 Propanoate metabolism 4 0.005 ALDH6A1, SLC12A3, SLC12A1, PLG
hsa01200 Carbon metabolism 9 0.006 ALDH6A1, ALDOB, ENO2
[177]Open in a new tab
3.5. Pan-cancer analysis of KGs
To examine the potential function of KGs in pan-cancer, we also
conducted a pan-cancer study using the TCGA database, focusing on the
top 20 malignancies, as seen in [178]Fig 3. The kidney chromophobe,
kidney renal papillary cell carcinoma, and kidney renal cear cell
carcinoma are the three tumors that we demonstrated in this figure to
be substantially correlated with KGs (S8 Table in [179]S1 File).
Fig 3. Using the TCGA database and the TIMER2 online tool, pan-cancer
analysis was used to identify the top 20 cancers associated with KGs.
[180]Fig 3
[181]Open in a new tab
Orange and green colors, respectively, indicate significant (P-value
<0.01) and insignificant (P-value ≥0.01) pairwise gene-disease
relationships.
3.6. Immune infiltration level analysis of KGs in ccRCC
The tumor microenvironment (TME) is a complex environment composed of
different stromal components including immune cell along with the tumor
cells [[182]133–[183]135]. Immune infiltration analysis involves
assessing the types, densities, and functional states of immune cells
within the tumor microenvironment to understand their impact on cancer
progression and therapeutic responses, particularly in the context of
immunotherapy efficacy [[184]136–[185]139]. RCC is one of the tumors
with the highest immune infiltration rate [[186]140–[187]142]. The
Tumor Immune Estimation Resource (TIMER 2.0) [[188]39] is a
comprehensive tool that estimates the quantity of tumor-infiltrating
immune cell types from TCGA data. To predict the infiltration of immune
cells in ccRCC by the TIMER algorithm, we assessed the correlations
between the expression levels of the KGs and the levels of infiltration
of six immune cells (CD8+ T cell, B cell, CD4+ T cell, dendritic cell,
neutrophil, and macrophage) (S1 Fig in [189]S1 File). These findings
indicate that the majority of tumor-infiltrating immune cells in ccRCC,
with the exception of CD4+ T(0.11≥ Rho ≥ 0.28) cells, have a negative
correlation with KGs (CD8+ T cell (-0.20 ≤ Rho ≤ 0.32), B cell (-0.11≤
Rho ≤ 0.24), Macrophage (-0.23≤ Rho ≤ 0.21), Neutrophil (-0.14 ≤ Rho ≤
0.29)), indicating a special relationship between tumor metabolism and
immunological infiltration. This result could help to discover
potential immunotherapy for ccRCC. These findings strongly suggest that
KGs plays an important role in immune infiltration in ccRCC.
3.7. DNA methylation of KGs
The development of cancer may be significantly influenced by DNA
methylation at several CpG (CG) sites in KGs. Thus, we used the
TCGA-KIRC methylation data and the MethSurv online tool to analyze DNA
methylation at CpG sites for KGs (PLG, ALDH6A1, ENO2, ALDOB, SERPINA5,
SLC12A1, SLC12A3, and UMOD). [190]Table 2 shows that all KGs were
notably (P-value of <0.01) within 5 KGs (ENO2, SERPINA5, SLC12A1,
SLC12A3, UMOD) were hypomethylated and 3 KGs (PLG, ALDH6A1, ALDOB) were
hypermethylated at various CpG locations.
Table 2. The significant prognostic value of CpG in KGs.
KGs Genomic Region Relation to CpG island CpG site HR LR test (P-Value)
PLG TSS1500 Open_Sea cg04181478 2.109 0.003596
PLG 5’UTR Open_Sea cg08531365 2.834 3.22E-07
ALDH6A1 Body N_Shelf cg14849468 3.099 4.07E-05
ENO2 TSS200 Island cg02531908 0.424 0.000725
ENO2 TSS200 Island cg22977254 0.488 0.001793
ALDOB TSS1500 Open_Sea cg13585586 2.742 0.000193
ALDOB 5’UTR Open_Sea cg14316227 2.742 0.000312
SERPINA5 5’UTR Open_Sea cg16937611 0.462 0.003249
SERPINA5 Body Open_Sea cg18731460 1.934 0.001429
SLC12A1 TSS1500 Open_Sea cg07915221 0.442 0.000104
SLC12A1 Body Open_Sea cg14930674 0.422 1.18E-05
SLC12A1 Body Open_Sea cg23530596 3.148 3.19E-05
SLC12A3 TSS200 Open_Sea cg01191774 0.295 9.22E-06
SLC12A3 1stExon Open_Sea cg03455024 0.432 0.001318
SLC12A3 TSS200 Open_Sea cg04303879 0.46 0.001954
UMOD Body S_Shore cg03084308 0.292 7.18E-06
UMOD Body Island cg06861044 3.327 4.78E-05
UMOD 5’UTR S_Shelf cg07456201 0.393 0.000291
UMOD 3’UTR Open_Sea cg07817806 2.062 0.006039
UMOD Body N_Shelf cg27605307 0.431 0.001249874
[191]Open in a new tab
3.8. Identification of key regulators for KGs
We investigated the networks of TFs and miRNAs with KGs to identify the
transcriptional and post-transcriptional regulators of the KGs.
Initially, we chose the top four transcription factors (TFs), namely
GATA2, FOXC1, NFIC, and STAT3, to serve as the transcriptional
regulators of KGs. This selection was based on two key topological
metrics: betweenness and degree, where the cutoff values were set at
≥75.88 and ≥3, respectively ([192]Fig 4A). Then, applying the same
topological metrics with cutoff degree ≥2 and betweenness ≥120,
respectively, we chose the top five miRNAs (hsa-mir-124-3p,
hsa-mir-34a-5p, hsa-mir-27a-3p, hsa-mir-335-5p, and hsa-mir-200b-3p) as
the post-transcriptional regulators KGs ([193]Fig 4B).
Fig 4.
[194]Fig 4
[195]Open in a new tab
(A) miRNA-KGs interaction network with. Here, KGs were denoted by
hexagon-shaped markings in the color red. The top miRNAs were denoted
by circles with a blue color. (B) TF-KGs interaction network. Here, KGs
were denoted with yellow hexagon shapes and the top -TFs denoted with
pink circles.
3.9. Exploring KGs-guided candidate drugs by molecular docking
As the m = 12 drug target receptors, we employed 8 KGs and the 4 TFs
that govern them. The three-dimensional (3D) structures of eight of
them (ALDOB, ENO2, OCLN, PLG, SERPINA5, STAT3, GATA2, UMOD) were
obtained from Protein Data Bank (PDB) by using the PBD codes 8d44,
2akz, 1xaw, 1b2i, 2ol2, 5u5s, 5o9b, and 6zs5). The 3D-structures of
ALDH6A1, FOXC1, SLC12A1, and SLC12A3 were obtained using the "AlphaFold
Protein Structure Database" using UniProt with ID [196]Q02252,
[197]Q12948, [198]Q13621 and [199]P55017. Uniport ID [200]P08651 was
used to obtain the 3D structures of the rest targets (NFIC) from
SWISS-MODEL. Molecular docking study between our potential receptors
and meta-drug compounds was used to compute the binding affinity scores
(BAS). Subsequently, we choose a limited set of drugs as potential
drugs by arranging the target receptors according to their row sums and
column sums in the binding affinity matrix. A = (Xij) ([201]Fig 5A & S8
Table in [202]S1 File). All 13 receptor proteins had strong bindings
with the top four lead drugs, irinotecan, imatinib, telaglenastat, and
RG-4733 (BAS< - 7.0 kcal/mol). Then, 12 of the 13 receptor proteins had
strong bonds with sorafenib, nexavar, sitravatinib, dactinomycin,
olaparib, and capozantinib (BAS < - 7.0 kcal/mol). Consequently, we
examined those ten lead drugs as possible ccRCC treatments. To assess
the potential of the indicated candidate drugs to bind to the most
sophisticated alternative independent receptors, we looked at 48 papers
that postulated KGs that may induce ccRCC. ALB, ALDH6A1, ALDOB, BIRC5,
CASR, CCNB2, CCND1,CEP55, CXCL12, CXCR4, EGF,EGFR, ENO2, FBP1,
FOXM1,HMGCS2, HSD11B1, KIF20A, KNG1, MELK, OGDHL, PCK1,PLG, PTPRC,RRM2,
SLC12A1, TOP2A,TPX2, UBE2C, VSIG4 were published in at least three
studies were taken into consideration (S3 Table in [203]S1 File). These
30 KGs-induced proteins were thought to be distinct receptors. The 3D
structures of ALB, ALDOB, BIRC5, CASR, CCND1, CEP55, CXCL12, CXCR4,
EGF, EGFR, ENO2, FBPI, FOXM1, HMGCS2, KIF20A, KNG1, MELK, PCK1, PLG,
PTPRC, RRM2, TOP2A, TOX2, UBE2C, and VSIG4 were obtained using the
codes 1e7a, 2akz, 1xox, 5k5t, 6p8e, 3e1r, 1sdf, 2n55, 1p9j, 1z9i, 1b2i,
1fta, 7fj2, 2wya, 6yip, 6f3v, 4d2p, 1M51, 1b2i 5fn7, 30lj, 1zxm, 6vph,
1i7k, and 2icc, respectively. The UniProt IDs [204]Q02252, [205]O95067,
[206]P80365, [207]Q9ULD0, and [208]P55017 allowed researchers to get
the three-dimensional structures of ALDH6A1, CCNB2, HSD11B1, OGDHL, and
SLC12A1 from the "AlphaFold Protein Structure Database". We found that
our top 10 proposed drug agents can also effectively bind to 30
independent receptors. Nearly all of these receptors demonstrated
strong binding affinities with these agents ([209]Fig 5B and S9 Table
in [210]S1 File). As a result, we heartily advise that the ten drugs
that have been suggested (Irinotecan, Imatinib, Telaglenastat,
Olaparib, RG-4733, Sorafenib, Sitravatinib, Cabozantinib, Abemaciclib,
and Dovitinib) may be better options for treating ccRCC. S10 Table in
[211]S1 File displays the 3D interactions between therapeutic compounds
and the top three likely receptors.
Fig 5.
[212]Fig 5
[213]Open in a new tab
Image of drug-target binding affinity matrices (A) X-axis indicates
top-ordered 40 drug agents (out of 110) and Y-axis indicates ordered
proposed receptor proteins. (B) the top 10 proposed and published drugs
(out of 111) in the Y-axis and the top 30 target proteins (published by
others) in the X-axis.
3.10. Drug likeness, ADME and toxicity analysis
The ADME properties of a drug molecule are employed to evaluate how it
is absorbed, distributed, metabolized, excreted, and its potential
toxicity. Conversely, drug-likeness properties analyze the molecule’s
physicochemical characteristics to understand its diverse chemical
properties ([214]Table 3). According to Lipinski’s rule, these ten
(Irinotecan, Imatinib, Telaglenastat, Olaparib, RG-4733, Sorafenib,
Sitravatinib, Cabozantinib, Abemaciclib, and Dovitinib) drug molecules
were classified as drugs as they satisfied at least four of Lipinski’s
five requirements (S11 Table in [215]S1 File). The standard range (1 to
less than equal 5) [[216]130] of Lipinski’s rule is supported by the
lipophilicity (LogP value) of these 10 medications. Based on a
comparison of their LogP values with standard values, these substances
are identified as lipophilic. Consequently, almost all of the
drug-likeness criteria has been met by our proposed top ten drug
compounds (S11 Table in [217]S1 File). To assess the efficacy and
indemnity of the suggested drugs, many parameters can be used to
analyze their toxicity and ADME analyses. The chemicals are a possible
oral medication candidate since it is expected that they would be
sufficiently absorbed in the gastrointestinal tract. In the human
intestines, a chemical is considered well absorbed if its Human
Intestinal Absorption (HIA) score is > 30% [[218]143,[219]144]. Our
study revealed that, of the ten medications we suggested, all had a
high HIA score of ≥68%, indicating strong absorption characteristics by
the human body. Additionally, our top eight pharmaceuticals, with the
exception of RG-4733 and dovitinib, were able to block the
P-glycoprotein inhibitor (P-gpI). The blood-brain barrier (BBB)
permeability index calculates a compound’s capacity to pass the BBB.
When a compound’s LogBB value is less than -1, it is thought to be
poorly disseminated over the BB barrier, but compounds with a LogBB
value of 0.3 or above can possibly pass the BBB. After evaluating all
substances, it was found that none of the suggested medications could
effectively cross the blood-brain barrier, as all had BBB values below
0.3 ([220]Table 3). Additionally, they are thought to partially reach
the central nervous system based on the LogPS (CNS) value. Human
cytochrome P 450 (CYP) enzymes are hemoproteins that are membrane-bound
and are crucial to homeostasis, drug detoxification, and cellular
metabolism. More than one CYP from CYP classes 1–3 is responsible for
over 80% of the oxidative metabolism and around 50% of all common
clinical drug elimination in humans [[221]145]. All medications possess
the YES characteristics necessary to block the human body’s CYP3A4
membrane. A drug’s toxicity is proportional to its LC50 value; a
smaller number denotes more toxicity (<1.0 log mM). All of our proposed
drugs had LC50 value greater than 1.0 log mM. The toxicity analyses of
our proposed compounds evaluating AMES tests, fatal dose LD50, and
minnow toxicity LC50 indicated that they were inert across all
criteria. Consequently, they are predicted to be non-toxic, displaying
drug-like properties and suitability for oral consumption.
Table 3. ADME/T profile of top-ranked ten drugs.
Compounds Absorption Desorption Metabolism Excretion Toxicity
Caco2 Permeability HIA
(%) P-gpI BBB CNS CYP3A4 TC AMES LC[50]
(log mM) LD[50]
(mole/kg)
(Permeability)
Imatinib 1.09 93.84 Yes -1.37 -2.51 Yes 0.71 No 2.08 2.9
Irinotecan 0.64 99.87 Yes -1.30 -3.23 Yes 0.93 No 1.78 2.81
Telaglenastat 0.77 80.76 Yes -1.15 -3.67 Yes 0.63 No 2.32 2.13
Olaparib 1.08 91.92 Yes -0.85 -2.65 Yes 0.56 No 1.96 2.62
RG-4733 -0.59 68.00 No -1.28 -3.12 Yes 0.97 No 2.83 2.48
Sorafenib 0.76 85.49 Yes -1.47 -2.02 Yes 0.61 No 1.51 2.14
Sitravatinib 0.34 99.71 Yes -1.92 -3.37 Yes 2.09 No 1.94 3.11
Cabozantinib 0.16 100 Yes -0.67 -3.04 Yes 0.75 No 3.37 2.44
Abemaciclib 1.53 83.94 Yes -1.57 -3.21 Yes 0.58 No 1.78 2.37
Dovitinib 0.47 83.63 No -0.70 -2.26 Yes 0.75 No 3.03 2.40
[222]Open in a new tab
4. Discussion
RCC, is often considered as a metabolic disease, especially ccRCC. It’s
target gene mutations associated with metabolic pathways are a clear
characteristic of RCC [[223]146–[224]149]. To investigate the genetic
mechanism of ccRCC, we identified ccRCC-causing key genes (KGs)
highlighting their pathogenetic processes. In order to explore
ccRCC-causing KGs, we analyzed nine microarray gene-expression datasets
([225]GSE16441, [226]GSE53757, [227]GSE66270, [228]GSE66272,
[229]GSE16449, [230]GSE76351, [231]GSE66271, [232]GSE71963 and
[233]GSE36895). To analyze the datasets, we first used the statistical
LIMMA technique for finding the DEGs between ccRCC and control samples
from each of the nine datasets. 133 common DEGs (cDEGs) from nine
DEGs-sets were identified. Then, we selected top-ranked 54 cDEGs by the
screening of cDEGs through the survival analysis. For convenience of
presentation, we denoted this reduced cDEGs-set as the screened cDEGs
(scDEGs). Then the PPI network of scDEGs identified top-ranked eight
scDEGs (PLG, ENO2, ALDOB, UMOD, ALDH6A1, SLC12A3, SLC12A1, SERPINA5) as
the ccRCC-causing KGs ([234]Fig 2). The pan-cancer analysis also
significantly supported their association with ccRCC. Additionally,
several independent research have supported these KGs as the ccRCC
causing KGs
[[235]8,[236]33,[237]36,[238]64,[239]71,[240]73,[241]75,[242]78,[243]80
,[244]99–[245]101,[246]150,[247]151] as displayed in [248]Fig 6.
Fig 6. Supporting articles for the proposed ccRCC-causing KGs and suggested
drug molecules.
[249]Fig 6
[250]Open in a new tab
(A) Supporting articles with the KGs and (B) supporting articles with
the suggested drugs, where the red ellipse indicates FDA-approved drugs
and the experimental drugs are indicated by the green ellipse.
ALDOB, a key enzyme in glycolysis ([251]Table 1), facilitates the
breakdown of fructose-1,6-bisphosphate into glyceraldehyde and
dihydroxyacetone phosphate. This function is crucial for sustaining
high glycolytic activity in ccRCC cells [[252]80]. The downregulation
of ALDOB by STAT3 and GATA2 ([253]Fig 4B) promotes enhanced glycolytic
flux, which is essential for meeting the increased energy demands of
rapidly dividing tumor cells[[254]152]. The growth and development of
ccRCC are promoted by ALDOB downregulation both in vitro and in vivo
[[255]153]. Furthermore, its hypermethylation at CpG sites cg13585586
and cg14316227 is linked to an elevated risk of ccRCC. The HR value of
2.742 indicates that higher methylation levels of ALDOB might be
involved in tumor progression ([256]Table 2). This gene’s positive
correlation with immune cells, like T cells, suggests its involvement
in immune evasion and modulation, which can contribute to tumor
development. This association implies the gene may support tumor growth
by creating a microenvironment conducive to immune escape [[257]152]
(S1 Fig in [258]S1 File). The gene ENO2 plays a pivotal role in the
glycolytic pathway ([259]Table 1), converting 2-phosphoglycerate to
phosphoenolpyruvate. This reaction is a crucial step in the production
of ATP, which is essential for the energy demands of rapidly
proliferating cancer cells. In ccRCC, the upregulation of ENO2 is
regulated by FOXC1 ([260]Fig 4B), which promotes glycolysis and
supports the Warburg effect—a metabolic shift observed in many cancers
where cells rely on glycolysis for energy production even in the
presence of oxygen [[261]154]. This adaptation allows ccRCC cells to
thrive in a hypoxic microenvironment by maintaining energy production
and supporting tumor growth [[262]155]. ENO2 shows protective effects
through hypomethylation, with CpG sites cg02531908 and cg22977254
having hazard ratios of 0.424 and 0.488, indicating a decreased risk of
ccRCC ([263]Table 2). It has negative correlations with immune cells
like B cells and dendritic cells. This suggests it might be involved in
reducing immune cell presence, possibly helping the tumor escape immune
detection detection and promoting tumor progression [[264]156]. UMOD
influences renal illnesses including familial juvenile hyperuricemic
nephropathy and medullary cystic kidney disease, which can lead to
renal cancer [[265]157][.] UMOD is involved in potassium ion
homeostasis and stress response ([266]Table 1) [[267]158]. In ccRCC,
UMOD’s expression is regulated by STAT3 and FOXC1 ([268]Fig 4B), which
enhances its role in maintaining ionic balance and managing cellular
stress [[269]159].UMOD helps ccRCC cells adapt to the challenging tumor
microenvironment by regulating the release of extracellular vesicles,
which can affect neighboring cells and modulate the immune response
[[270]159]. Its exhibits both protective and risk-associated
methylation patterns in ccRCC. Hyper-methylation at CpG site cg07817806
is associated with a higher HR of 2.062, suggesting a potential role in
increased risk, while other sites show complex interactions with
disease risk ([271]Table 2). It positively correlates with CD8+ T cells
and dendritic cells, suggesting a role in supporting cytotoxic T cell
infiltration and antigen presentation, which may help ccRCC cells
modulate the immune environment to survive [[272]160] (S1 Fig in
[273]S1 File). By controlling the metabolism of aldehydes, the key gene
ALDH6A1 might have an impact on detoxification procedures, intermediate
metabolism, and cellular redox balance, all of which might accelerate
the tumor’s development in kidney. In ccRCC, NFIC and GATA2 regulate
ALDH6A1 expression ([274]Fig 4B), enhancing its capacity to manage
oxidative stress and detoxify harmful metabolites, which is crucial in
the tumor environment [[275]161]. This gene modulates the
methylmalonyl-CoA metabolic pathway ([276]Table 1), which impacts
cancer stem cells (CSCs) in ccRCC [[277]162]. It shows a significant
risk association with ccRCC, with hypermethylation at CpG site
cg14849468 yielding a hazard ratio of 3.099. This suggests that
increased methylation of ALDH6A1 could contribute to the development or
worsening of ccRCC ([278]Table 2). ALDH6A1 could be a key player in
immune modulation within the tumor microenvironment. The positive
correlation with T cell CD8+ infiltration hints at a possible role in
enhancing cytotoxic immune responses, which might impact tumor
progression or response to immunotherapies in ccRCC [[279]163] (S1 Fig
in [280]S1 File). According to a study, the key gene SLC12A3 is
implicated in the formation of ccRCC and is suggested as a biomarker
for ccRCC diagnosis and treatment [[281]164]. Additionally, the
expression of SLC12A1 is strongly correlated with patient survival
duration in renal carcinoma [[282]150]. Both SLC12A3 and SLC12A1 are
essential for regulating ion transport in ccRCC ([283]Table 1). SLC12A3
is regulated by STAT3 and GATA2 ([284]Fig 4B), maintains potassium and
chloride balance ([285]Table 1), which is crucial for cellular
homeostasis under stress and hypoxia. Similarly, SLC12A1 is regulated
by NFIC, GATA2 and FOXC1([286]Fig 4B), ensures ionic balance and
cellular volume, thereby supporting tumor cell survival and aggressive
behavior [[287]159]. SLC12A1 shows a protective effect, as
hypomethylation at CpG sites cg07915221 and cg14930674 is associated
with hazard ratios of 0.442 and 0.422, which suggest a reduced risk of
ccRCC. Similarly, SLC12A3 shows hypomethylation associations with
reduced risk of ccRCC. CpG sites cg01191774 and cg03455024 have hazard
ratios of 0.295 and 0.432, suggesting a protective effect against the
disease ([288]Table 2). Furthermore, SLC12A1 Shows strong positive
correlation with CD8+ T cells and negative with macrophages and
neutrophils, indicating its role in promoting cytotoxic T cell activity
and potentially reducing macrophage-mediated immune suppression in
ccRCC. SLC12A3 Positively correlates with both CD8+ and CD4+ T cells
and dendritic cells, pointing to its involvement in enhancing T
cell-mediated immunity and antigen presentation, which could be crucial
for anti-tumor responses in ccRCC [[289]165] (S1 Fig in [290]S1 File).
The Plasminogen (PLG), a blood zymogen produced and involved in
anti-tumor growth because it inhibits angiogenesis [[291]166]. It is
involved in the complement and coagulation cascades ([292]Table 1),
contributing to a pro-thrombotic environment that supports tumor growth
and metastasis. STAT3 regulates PLG expression ([293]Fig 4B), promoting
a thrombotic state that facilitates tumor cell migration and invasion
in ccRCC [[294]167]. It is associated with an increased risk of ccRCC
through hypermethylation of CpG sites. Specifically, methylation at
cg04181478 and cg08531365 results in hazard ratios of 2.109 and 2.834,
respectively, indicating that elevated PLG methylation correlates with
a higher likelihood of disease progression ([295]Table 2). PLG
positively correlates with CD8+ T cells and dendritic cells, suggesting
it supports cytotoxic T cell infiltration and antigen presentation,
aiding ccRCC cells in modulating the immune environment for survival.
Its negative association with macrophages indicates a complex role in
selectively enhancing or suppressing immune pathways, which may
influence the tumor microenvironment and affect disease progression
[[296]168] (S1 Fig in [297]S1 File). SERPINA5, also known as
antithrombin, modulates thrombin activity within the coagulation
cascade ([298]Table 1). According to reports SERPINA5 plays a
protective function against tumor formation, invasiveness, and
metastasis and is dysregulated in renal, breast, prostate, liver, and
ovarian malignancies [[299]169]. Its expression is regulated by NFIC
([300]Fig 4B), which influences its levels and effects coagulation and
immune responses. By contributing to a pro-thrombotic environment,
SERPINA5 supports ccRCC progression and metastasis, aiding in the
tumor’s ability to evade immune surveillance and promote aggressive
behavior. It also shows a potential protective effect against ccRCC,
with hypomethylation at CpG sites cg16937611 and cg18731460 linked to
decreased risk. The HR value of 0.462 suggests that lower methylation
of SERPINA5 could be associated with reduced disease susceptibility
([301]Table 2). SERPINA5 may facilitate immune evasion in ccRCC by
promoting CD4+ T cell and B cell responses while suppressing CD8+ T
cell, macrophage, and neutrophil activity. This imbalance could create
a tumor microenvironment that supports tumor growth and survival
[[302]170] (S1 Fig in [303]S1 File).
In order to explore KGs-guided possible pharmacological treatments for
ccRCC, we estimated the binding affinity scores between 110 meta-drug
molecules and KGs mediated receptors. We then identified the top 10
therapeutic agents (Irinotecan, Imatinib, Telaglenastat, Olaparib,
RG-4733, Sorafenib, Sitravatinib, Cabozantinib, Abemaciclib, and
Dovitinib) as the most viable candidate drugs for repurposing in ccRCC
([304]Fig 5A and S8 Table in [305]S1 File). Subsequently, we compared
the effectiveness of these ten chemical molecules against 30 different
independent receptors that supported our results ([306]Fig 5B and S9
Table in [307]S1 File). Many literature studies individually
corroborated our hypothesis that this potential drug molecules may be
efficacious against the treatment of ccRCC [[308]55,[309]171–[310]187].
Among the ten proposed drugs, seven—namely Irinotecan (DB00762),
Imatinib (DB00619), Sorafenib (DB00398), Cabozantinib (DB08875),
Olaparib (DB09074), Abemaciclib (DB12187), and Dovitinib (DB08911)—are
FDA-approved for various cancer treatments. According to Qi Zhang and
colleagues, Olaparib which is a PARP inhibitor, may be a potentially
useful treatment strategy for kidney cancers and it is FDA approved for
ovarian, breast, and pancreatic cancers [[311]172]. Imatinib, a potent
PDGFR (platelet-derived growth factor receptor) inhibitor, has been
utilized in the treatment of advanced renal cancer. By targeting and
inhibiting the PDGFR, imatinib helps to impede tumor growth and
progression, offering therapeutic benefits in managing this aggressive
form of cancer [[312]188]. The FDA recommends first-line treatments for
ccRCC patients with anti-angiogenic-CPI regimens: axitinib plus
avelumab or pembrolizumab, cabozantinib plus nivolumab, or lenvatinib
plus pembrolizumab. [[313]55,[314]174]. According to the Drug Bank
database, RG-4733 is an experimental medication being investigated for
cancer (accesssion number DB11870). This new gamma secretase inhibitor,
crucial for cleaving and activating Notch, is being investigated as an
anti-cancer drug [[315]175]. The investigation of sitravatinib is being
conducted in the clinical research [316]NCT03680521 (Neoadjuvant
Sitravatinib in Combination With Nivolumab in Patients With Clear Cell
Renal Cell Carcinoma) [[317]176]. In patients with RCC, telaglenastat,
a small-molecule glutaminase inhibitor, shows limited single-agent
efficacy. However, in metastatic RCC (mRCC) patients, cabozantinib or
everolimus—known to affect glucose metabolism—were tested in
combination with telaglenastat in a phase Ib study. For extensively
pretreated mRCC patients, the TelaE and TelaC combinations demonstrated
excellent clinical activity and tolerability [[318]177]. Irinotecan was
initially approved for treating metastatic colorectal cancer
[[319]189]. Sorafenib [[320]190], Cabozantinib [[321]185] and Dovitinib
[[322]186] are both FDA-approved treatments for advanced ccRCC. These
three drugs have been validated through extensive laboratory research
and clinical studies. Abemaciclib is approved for treating some people
with advanced or metastatic hormone receptor (HR)-positive,
HER2-negative breast cancer that has progressed after hormone therapy
[[323]187]. We found that Imatinib, Sorafenib, Cabozantinib, and
Dovitinib are FDA-approved treatments for advanced ccRCC. RG-4733 and
Sitravatinib are currently in clinical trials for ccRCC. Irinotecan,
Olaparib, Abemaciclib, and Telaglenastat are FDA-approved for other
cancers.
Although the drug repurposing approach offers promising avenues for
ccRCC treatment, several challenges and limitations must be
acknowledged. The clinical relevance of binding affinity scores from in
silico analyses does not always directly translate to therapeutic
efficacy in patients due to factors like drug bioavailability,
pharmacokinetics, and off-target effects. The tumor microenvironment in
ccRCC, characterized by hypoxia and immune evasion, may also impact the
effectiveness of repurposed drugs. To validate the drug molecules
computationally, we conducted ADME/T analysis and evaluated their
drug-likeness. All ten medications demonstrated drug-like properties,
meeting at least four of Lipinski’s rule of five criteria (S11 Table in
[324]S1 File), and exhibited favorable ADME/T characteristics, such as
high Human Intestinal Absorption (HIA) levels between 68.00% to 100%,
and no carcinogenic effects and no carcinogenic effects ([325]Table 3).
Toxicity analyses revealed that these compounds are likely non-toxic,
possess favorable drug-like characteristics, and are appropriate for
oral consumption. However, clinical studies are necessary to validate
the efficacy and safety of these repurposed drugs in ccRCC patients.
Therefore, the findings of this study could serve as valuable resources
for the diagnosis and treatment of ccRCC.
5. Conclusion
This in-silico study disclosed ccRCC-causing eight key genes (PLG,
ENO2, ALDOB, UMOD, SLC12A1, SLC12A3, SLC12A1, and SERPINA5) through
statistical LIMMA, survival probability and protein-protein interaction
networks analyses. The pan-cancer analysis with KGs based on TCGA
database also showed the significant association of KGs with different
subtypes of kidney cancers including ccRCC. The gene-set enrichment
study with the GO-term showed that KGs are significantly associated
with some biological processes (potassium ion homeostasis, positive
regulation of fibrinolysis), molecular function (heparin binding,
sodium:potassium:chloride symporter activity), cellular components
(apical plasma membrane, extracellular exosome) and KEGG pathways
(Metabolic pathways, Complement and coagulation cascades) that are also
associated with ccRCC. The gene regulatory network (GRN) analysis
revealed four TFs proteins FOXC1, GATA2, NFIC, and STAT3 as the
top-ranked transcriptional regulatory factors of KGs, and five miRNAs
hsa-mir-124-3p, hsa-mir-34a-5p, hsa-mir-27a-3p, hsa-mir-335-5p, and
hsa-mir-200b-3p as their post-transcriptional regulators. Finally,
KGs-guided ten candidate drug molecules (Irinotecan, Imatinib,
Telaglenastat, Olaparib, RG-4733, Sorafenib, Sitravatinib,
Cabozantinib, Abemaciclib, and Dovitinib) were recommended by using
molecular docking and ADME/T analysis, where seven drugs molecules
(Irinotecan, Imatinib, Olaparib, Dactinomycin, Sorafenib, Nexavar and
Cabozantinib) have been approved by FDA and the rest three molecules
are under investigation for different cancers. However, the fundings of
this study require experimental validation in wet-lab for taking a
better treatment plan against ccRCC.
Supporting information
S1 File. Association of KGs with immune cells in ccRCC.
S1 Table. The gene expression profile datasets that were analyzed in
this study. S2 Table. Collection of ccRCC related candidate drug agents
from published articles and other sources. S3 Table. Collection of
proposed key genes from published articles. S4 Table. List of
upregulated and downregulated common DEGs (cDEGs) of ccRCC in 9
datasets ([326]GSE16441, [327]GSE53757, [328]GSE66270, [329]GSE66272,
[330]GSE16449, [331]GSE76351, [332]GSE66271, [333]GSE71963 and
[334]GSE36895). S5 Table. List of screened cDEGs (scDEGs) from cDEGs
through survival analysis. S6 Table. List of key genes (KGs) from PPI
network based on different topological measures. S7 Table. Pan-cancer
analysis of KGs. S8Table. Docking/ (binding affinity) scores (kcal/mol)
between the proposed target genes/proteins (receptors) and top ordered
50 candidate drugs (out of 327). S9 Table. Docking/ (binding affinity)
scores (kcal/mol) between the proposed drugs and 30 independent
receptors(published). S10 Table. The 3-dimension view of strong binding
interactions between targets and drugs. S11 Table. Drug-likeness
profile of top-ranked ten drugs.
(DOCX)
[335]pone.0310843.s001.docx^ (1.9MB, docx)
Data Availability
All information about the datasets analyzed in this study is provided
in the paper.
Funding Statement
The author(s) received no specific funding for this work.
References