Abstract

   Type 2 diabetes (T2D) and clear-cell renal cell carcinoma (ccRCC) are
   both complex diseases whose incidence rates are gradually increasing.
   Population-based studies show that the severity of ccRCC might be
   associated with T2D. However, no study has yet investigated
   the molecular mechanisms of their association. This study
   explored T2D- and ccRCC-causing shared key genes (sKGs) from multiple
   transcriptomics profiles to investigate their common pathogenetic
   processes and associated drug molecules. We identified 259 shared
   differentially expressed genes (sDEGs) that can separate both T2D and
   ccRCC patients from control samples. Local correlation analysis based
   on the expressions of sDEGs indicated significant association between
   T2D and ccRCC. Then ten sDEGs (CDC42, SCARB1, GOT2, CXCL8, FN1, IL1B,
   JUN, TLR2, TLR4, and VIM) were selected as the sKGs through the
   protein–protein interaction (PPI) network analysis. These sKGs were
   found significantly associated with different CpG sites of DNA
   methylation that might be the cause of ccRCC. The sKGs-set enrichment
   analysis with Gene Ontology (GO) terms and KEGG pathways revealed some
   crucial shared molecular functions, biological process, cellular
   components and KEGG pathways that might be associated with development
   of both T2D and ccRCC. The regulatory network analysis of sKGs
   identified six post-transcriptional regulators (hsa-mir-93-5p,
   hsa-mir-203a-3p, hsa-mir-204-5p, hsa-mir-335-5p, hsa-mir-26b-5p, and
   hsa-mir-1-3p) and five transcriptional regulators (YY1, FOXL1, FOXC1,
   NR2F1 and GATA2) of sKGs. Finally, three top-ranked sKGs-guided
   repurposable drug molecules (Digoxin, Imatinib, and Dovitinib) were
   recommended as common treatment candidates for both T2D and ccRCC based
   on molecular docking and ADME/T analysis. Therefore, the results of this
   study may be useful for the diagnosis and therapy of ccRCC patients who
   are also suffering from T2D.

   Keywords: Clear-cell renal-cell carcinoma, Type-2 diabetes, Shared key
   genes, Molecular mechanisms, Drug repurposing, Bioinformatics analysis

   Subject terms: Cancer, Computational biology and bioinformatics, Drug
   discovery, Genetics, Molecular biology, Biomarkers, Diseases, Molecular
   medicine, Oncology

Introduction

   Cancer is considered the second leading cause of mortality globally,
   with around 20 million new cases and more than 10 million deaths
   annually. Over 50% of cancer patients ultimately die, despite the
   advancements in diagnosis and therapy^[40]1. Advanced age
   is one of the most crucial factors in increasing
   cancer-related mortality^[41]2. The clear-cell renal cell carcinoma
   (ccRCC) is a common cancer worldwide. There are several types of kidney
   cancer (KC), including renal cell carcinoma (RCC). ccRCC is a
   subtype of RCC, which makes up about 70–80% of KC^[42]3. It is the 8th
   most common cancer among women and the 6th most common among
   men^[43]4. It had the 17th highest cancer-related mortality in 2018,
   with 175,098 deaths worldwide^[44]5. In 2020, the death rate of KC
   patients was around 42%^[45]6. ccRCC is the most common type
   of KC in adults, and its incidence increases with age. While it can
   occur at any age, the risk of developing ccRCC generally increases
   after the age of 40, and the highest incidence rates are seen in people
   aged 60 and older^[46]7. On the other hand, most older people
   suffer from type 2 diabetes (T2D), which typically occurs due to
   insulin resistance^[47]8. Because insulin resistance hinders the body
   from using glucose for energy, blood sugar levels remain consistently
   high^[48]9. A population-based study reported that the prevalence of
   diabetes among all age groups was 2.8% in 2000 and is projected to
   increase to 4.4% by 2030^[49]10. Several studies have
   reported that T2D is associated with ccRCC^[50]11–[51]14, liver
   cancer^[52]15, colorectal cancer^[53]16, and cancers of the breast,
   stomach, endometrium, pancreas, lymphoid tissue and urinary
   bladder^[54]11. One study reported that cancer-related deaths account
   for approximately 13% of overall mortality in diabetic
   patients^[55]12. Kidney problems^[56]13, including microalbuminuria,
   macroalbuminuria, or reduced renal function, develop over time in about
   35% to 50% of T2D patients. RCC, especially ccRCC, is often considered
   a metabolic disease: mutations of its target genes in metabolic pathways
   are a clear characteristic of RCC^[57]17. Likewise, T2D is also a
   metabolic disease which is characterized by the deregulation of genes,
   glucose and lipid metabolism^[58]18. Insulin resistance also increases
   the insulin levels, insulin-like growth factors as well as
   hyperactivation of protein kinase B (Akt)/mTOR in blood which may
   stimulate the growth and development of tumors^[59]19,[60]20.
   Additionally, raised triglyceride levels, higher blood pressure in men,
   high body mass index (BMI), and T2D in women are distinct risk factors
   for ccRCC^[61]19. Numerous genes or proteins that are mutated or
   methylated during cancer development are overexpressed or suppressed,
   resulting in alterations such as post-translational
   modifications (PTMs). These alterations change cellular signaling
   pathways and functions, which ultimately alters metabolic
   processes^[62]21. Thus, ccRCC might be associated with T2D, as displayed
   in Fig. [63]1, and ccRCC patients may suffer from complications
   due to the influence of T2D. Therefore, identification of ccRCC-
   and T2D-causing shared key genes (sKGs), also known as biomarker genes,
   is essential in order to investigate their genetic association for
   better diagnosis and therapies.

Figure 1.

   A schematic diagram about the link between T2D and ccRCC.

   During the co-occurrence of T2D and ccRCC, doctors may prescribe
   multiple disease-specific drugs to the patients^[66]22. However,
   drug-drug interaction (DDI) during polypharmacy may cause adverse
   side effects or toxicity that push patients into
   severe conditions^[67]23–[68]26. In that case, doctors should
   prescribe a smaller number of common drugs in place of those
   multiple drugs in order to reduce the toxicity. However, so far,
   no study in the literature has suggested any common drug
   for the treatment of both diseases, even though aged patients are at
   high risk of DDI due to the prevalence of polypharmacy and age-related
   changes in metabolism. Therefore, it is necessary to explore potential
   common drugs for ccRCC and T2D that can stand in for those
   disease-specific multiple drugs. To explore common drugs, it
   is first necessary to identify ccRCC- and T2D-causing sKGs as the
   targets of common drugs, since disease-causing key genes/proteins are
   widely used as the targets of disease-specific drugs^[69]27–[70]30.
   Nevertheless, it is very difficult to explore ccRCC- and T2D-causing
   top-ranked sKGs and candidate therapeutic ligands/agents from a huge
   number of alternatives through wet-lab experiments alone, since
   wet-lab experiments are time-consuming, laborious and costly. To
   overcome these issues, in-silico bioinformatics and systems biology
   approaches play a significant role^[71]31,[72]32. In the case
   of target selection, genomics/transcriptomics analysis through
   integrated statistics and network-based approaches are widely
   used^[73]31,[74]32. There are some in-silico studies that explored T2D-
   and ccRCC-causing key genes (KGs) and their pathogenetic mechanisms
   individually^[75]33–[76]38. Although some studies investigated shared
   KGs (sKGs) for T2D with hepatocellular carcinoma (HCC)^[77]39,[78]40 and
   colorectal cancer (CRC)^[79]41,[80]42, so far no
   study in the literature has explored T2D- and ccRCC-causing sKGs.
   Therefore, this study aimed to explore both T2D- and ccRCC-causing sKGs,
   highlighting their pathogenetic mechanisms and candidate common drug
   molecules to support a better treatment plan against ccRCC with T2D, by
   using integrated bioinformatics and systems biology approaches.

Materials and methods

Data source and descriptions

   To explore shared key genes (sKGs) between T2D and ccRCC, we
   considered four microarray gene expression profile datasets for each
   of T2D ([81]GSE25724^[82]43, [83]GSE29221^[84]44, [85]GSE29226^[86]45
   and [87]GSE29231^[88]46) and ccRCC ([89]GSE66270^[90]47,
   [91]GSE66272^[92]48, [93]GSE76351^[94]49 and [95]GSE66271^[96]50) from
   the Gene Expression Omnibus (GEO) platform in the National Center for
   Biotechnology Information (NCBI) database. Table [97]1 provides the
   detailed descriptions of the datasets.

Table 1.

   Data source and descriptions.
   GEO dataset | Country | Platform | Cases | Controls
   [98]GSE25724 | Italy | [99]GPL96 [HG-U133A] Affymetrix Human Genome U133A Array | 6 (T2D) | 7
   [100]GSE29221 | India | GPL6947 Illumina HumanHT-12 V3.0 expression beadchip | 12 (T2D) | 12
   [101]GSE29226 | India | GPL6947 Illumina HumanHT-12 V3.0 expression beadchip | 12 (T2D) | 12
   [102]GSE29231 | India | GPL6947 Illumina HumanHT-12 V3.0 expression beadchip | 12 (T2D) | 12
   [103]GSE66270 | Germany | [104]GPL570 [HG-U133_Plus_2] Affymetrix Human Genome U133 Plus 2.0 Array | 14 (ccRCC) | 14
   [105]GSE66272 | Germany | [106]GPL570 [HG-U133_Plus_2] Affymetrix Human Genome U133 Plus 2.0 Array | 27 (ccRCC) | 27
   [107]GSE76351 | Russia | [108]GPL11532 [HuGene-1_1-st] Affymetrix Human Gene 1.1 ST Array [transcript (gene) version] | 12 (ccRCC) | 12
   [109]GSE66271 | Germany | [110]GPL570 [HG-U133_Plus_2] Affymetrix Human Genome U133 Plus 2.0 Array | 13 (ccRCC) | 13

Identification of Differentially Expressed Genes (DEGs)

   To identify differentially expressed genes (DEGs) between case and
   control groups, we considered the limma approach, since it performs
   well even with small sample sizes. It produces P.values based on the
   moderated t-statistic^[112]51 to measure the significance of
   differential expression between two conditions. The moderated
   t-statistic is formulated by combining the classical and Bayesian
   estimation of the relevant parameters^[113]51,[114]52. Then the gth
   differentially expressed gene (DEG[g]) is defined by combining its
   adjusted P.value and its average log2 fold-change (aLog[2]FC) value
   as follows,
   $$
   \mathrm{DEG}_g =
   \begin{cases}
   \mathrm{DEG(Up)}, & \text{if adj.P.value} < 0.05 \text{ and } aLog_2FC_g > +1 \\
   \mathrm{DEG(Down)}, & \text{if adj.P.value} < 0.05 \text{ and } aLog_2FC_g < -1
   \end{cases}
   $$

   where the aLog[2]FC value for the gth gene is computed as
   $$
   aLog_2FC_g =
   \begin{cases}
   \dfrac{1}{n_1}\sum_{i=1}^{n_1}\log_2\left(z_{gi}^{D}\right) - \dfrac{1}{n_2}\sum_{j=1}^{n_2}\log_2\left(z_{gj}^{C}\right), & \text{if } n_1 \neq n_2 \\
   \dfrac{1}{n}\sum_{i=1}^{n}\log_2\left(\dfrac{z_{gi}^{D}}{z_{gi}^{C}}\right), & \text{if } n_1 = n_2 = n
   \end{cases}
   \tag{1}
   $$

   Here $z_{gi}^{D}$ and $z_{gj}^{C}$ are the expression values of the gth
   gene in the ith disease and jth control samples, respectively. We
   utilized the limma R-package^[115]53 to calculate the P.values and
   Log[2]FC values used to select the significant DEGs for both T2D and
   ccRCC patients.
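
   The classification rule and Eq. 1 can be illustrated with a minimal Python sketch. It is not the limma workflow used in the study: the adjusted P.value is taken as a given input (limma would supply it), and the function names and expression values are hypothetical. When n1 = n2, the two cases of Eq. 1 give the same number, so one function covers both.

```python
import math


def alog2fc(disease, control):
    """Mean log2 expression in disease samples minus mean in controls (Eq. 1)."""
    mean_d = sum(math.log2(z) for z in disease) / len(disease)
    mean_c = sum(math.log2(z) for z in control) / len(control)
    return mean_d - mean_c


def classify_deg(adj_p, fc, p_cut=0.05, fc_cut=1.0):
    """Apply the Up/Down rule; genes meeting neither condition are not DEGs."""
    if adj_p < p_cut and fc > fc_cut:
        return "Up"
    if adj_p < p_cut and fc < -fc_cut:
        return "Down"
    return "Not DEG"


# Hypothetical intensities for one gene (not taken from the GEO datasets).
disease = [8.0, 16.0, 8.0]       # log2 values: 3, 4, 3
control = [2.0, 2.0, 4.0, 2.0]   # log2 values: 1, 1, 2, 1
fc = alog2fc(disease, control)
label = classify_deg(adj_p=0.01, fc=fc)  # adj_p is a made-up placeholder
```

   With these toy values the gene is labelled "Up"; a gene whose adjusted P.value is 0.05 or larger is never called a DEG, whatever its fold-change.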

Identification of shared DEGs (sDEGs)

   First, we detected DEGs between ccRCC and control samples based on
   four datasets with NCBI accession IDs [116]GSE66270, [117]GSE66272,
   [118]GSE76351, and [119]GSE66271. Then DEGs between T2D and
   control samples were detected based on four datasets with accession IDs
   [120]GSE25724, [121]GSE29221, [122]GSE29226 and [123]GSE29231. Finally,
   shared DEGs (sDEGs) that are able to separate both T2D and ccRCC
   samples from the control samples were selected.
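
   One plausible reading of this selection step is a pair of set intersections, sketched below. Two points are assumptions rather than statements from the text: that the per-disease DEG lists are combined by intersection across datasets, and that an sDEG must change in the same direction in both diseases. The gene symbols are placeholders.

```python
def common_degs(per_dataset):
    """Genes flagged as DEGs in every dataset of one disease."""
    sets = [set(genes) for genes in per_dataset]
    return set.intersection(*sets) if sets else set()


def shared_degs(t2d_up, t2d_down, ccrcc_up, ccrcc_down):
    """Genes regulated in the same direction in both diseases."""
    return {
        "up": set(t2d_up) & set(ccrcc_up),
        "down": set(t2d_down) & set(ccrcc_down),
    }


# Placeholder gene symbols, purely illustrative.
t2d_up = common_degs([["G1", "G2", "G3"], ["G2", "G3", "G4"]])
ccrcc_up = common_degs([["G3", "G5"], ["G3", "G2", "G5"]])
t2d_down = common_degs([["G6", "G7"], ["G6"]])
ccrcc_down = common_degs([["G6", "G8"], ["G6", "G8"]])
sdegs = shared_degs(t2d_up, t2d_down, ccrcc_up, ccrcc_down)
```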

Local genetic association between T2D and ccRCC through sDEGs

   Although the average log[2]FC (alog[2]FC) values for T2D
   and ccRCC were calculated from independent datasets by Eq. [124]1, they
   were calculated on the same set of sDEGs for both T2D and ccRCC.
   A shared DEG (sDEG) is called upregulated for two or more diseases if
   alog[2]FC > 0 and downregulated if alog[2]FC < 0. If we assume that the
   function of a gene is almost the same in all control subjects, we may
   measure the genetic association between any two diseases X and Y based
   on their alog[2]FC values corresponding to the expressions of sDEGs
   through Pearson’s correlation coefficient, which is defined as
   $$
   r_{xy} = \frac{\sum_{g}\left(x_g - \bar{x}\right)\left(y_g - \bar{y}\right)}{\sqrt{\sum_{g}\left(x_g - \bar{x}\right)^2 \sum_{g}\left(y_g - \bar{y}\right)^2}}
   \tag{2}
   $$

   where $x_g = aLog_2FC_g(X)$ and $y_g = aLog_2FC_g(Y)$ are the alog[2]FC
   values of the g^th gene for the two diseases X and Y, and $\bar{x}$ and
   $\bar{y}$ are the means of the $x_g$'s and $y_g$'s, respectively.
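
   As a worked illustration of Eq. 2, the sketch below computes Pearson's correlation between two vectors of alog[2]FC values, one entry per sDEG. The input numbers are invented for the example and are unrelated to the reported value of the coefficient.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equally long sequences (Eq. 2)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)


# Hypothetical alog2FC values of four sDEGs in disease X and disease Y.
x = [1.0, 2.0, 3.0, -1.5]
y = [2.0, 4.0, 6.0, -3.0]   # exactly 2 * x, so the correlation is 1
r = pearson(x, y)
```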

Identification of shared key genes (sKGs) from sDEGs

   Proteins interact with other proteins in the cell to carry out their
   tasks, and information generated by the protein–protein interaction
   (PPI) network is used to select the key genes^[125]54,[126]55. In order
   to generate PPI network, the distance matrix ‘D’ is calculated as
   $$
   D(i, j) = \frac{2\left|N_i \cap N_j\right|}{\left|N_i\right| + \left|N_j\right|},
   $$

   where N[i] is the neighbor set of the ith protein and N[j] is the neighbor
   set of the jth protein. In order to identify shared key genes (sKGs), a
   PPI network of sDEGs was constructed using the STRING database^[127]56.
   To select the sKGs from the PPI network, we used different topological
   measures (Betweenness^[128]57, Degree^[129]58, BottleNeck^[130]59,
   Closeness^[131]60, MNC^[132]61, Radiality^[133]62 and Stress^[134]63) by
   using the CytoHubba plug-in of the Cytoscape software^[135]64.
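
   The quantity D(i, j) defined above can be computed directly from an edge list, as in the following sketch. The toy network and the protein labels are hypothetical; the actual network in the study comes from STRING, and the topological ranking is done with CytoHubba, neither of which is reproduced here. Node degree is included only as the simplest of the seven measures.

```python
def neighbor_sets(edges):
    """Map each node of an undirected graph to its set of neighbors."""
    nbrs = {}
    for u, v in edges:
        nbrs.setdefault(u, set()).add(v)
        nbrs.setdefault(v, set()).add(u)
    return nbrs


def neighbor_overlap(nbrs, i, j):
    """D(i, j) = 2 |N_i ∩ N_j| / (|N_i| + |N_j|)."""
    n_i, n_j = nbrs.get(i, set()), nbrs.get(j, set())
    denom = len(n_i) + len(n_j)
    return 2 * len(n_i & n_j) / denom if denom else 0.0


# Hypothetical four-protein network.
edges = [("P1", "P2"), ("P1", "P3"), ("P2", "P3"), ("P3", "P4")]
nbrs = neighbor_sets(edges)
d_12 = neighbor_overlap(nbrs, "P1", "P2")   # shared neighbor: P3
d_14 = neighbor_overlap(nbrs, "P1", "P4")   # shared neighbor: P3
by_degree = sorted(nbrs, key=lambda node: len(nbrs[node]), reverse=True)
```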

In-silico validation of sKGs using independent datasets and databases

   The differential expression patterns of sKGs were validated in both
   disorders (ccRCC and T2D) by box-plot analysis with independent
   datasets from the NCBI, TCGA and GTEx databases. We used the TCGA and
   GTEx databases in the GEPIA2^[136]65 web-tool to confirm the differential
   expression patterns of sKGs between ccRCC and control samples. In
   order to validate the differential expression patterns of sKGs
   between T2D and control samples, we used two independent datasets with
   accession IDs [137]GSE15932^[138]66 and [139]GSE20966^[140]67 from the
   NCBI database.

Regulatory network analysis of sKGs

   A gene regulatory network (GRN) displays molecular regulators that
   interact with each other in the cell to control the gene expressions.
   The transcription factors (TFs) and microRNAs (miRNAs) are considered
   as the transcriptional and post-transcriptional regulators of protein
   coding genes. To select the top-ranked TFs as the key transcriptional
   regulators of sKGs, the TFs versus sKGs interaction network analysis
   was performed by using the JASPAR^[141]68 database with the
   NetworkAnalyst web-tool^[142]69. Similarly, to identify the top-ranked
   miRNAs as the key post-transcriptional regulators of sKGs, the sKGs
   versus miRNAs interaction network analysis was performed by using the
   TarBase database^[143]70 with the NetworkAnalyst web-tool^[144]69.

The sKGs-set enrichment analysis with GO-terms and KEGG-pathways

   The sKGs-set enrichment studies with gene ontology (GO) terms and Kyoto
   encyclopedia of genes and genomes (KEGG) pathways^[145]71 were
   performed to explore biological processes (BP), molecular functions
   (MF), cellular components (CC) and pathways of sKGs. In order to
   identify the GO terms (BPs, MFs, CCs) or KEGG pathways that are
   significantly enriched by the sKGs-set, a 2 × 2 contingency table was
   constructed (see Table [146]2).

Table 2.

   A 2 × 2 Contingency table.
   Annotated genes | sKGs (proposed) | Not-sKGs | Marginal total
   Annotated gene-set in i^th GO term/KEGG pathway (A[i]) | k[i] | M[i] − k[i] | M[i]
   Complement gene-set of A[i] (A[i]^c) | n − k[i] | N − M[i] − n + k[i] | N − M[i]
   Marginal total | n | N − n | N (Grand total)

   where A[i] is the set of annotated genes in the i^th BP/MF/CC/KEGG
   pathway in the database; M[i] is the total number of annotated genes in
   A[i] (i = 1, 2, …, r); N is the total number of annotated genes in
   $A = \bigcup_{i=1}^{r} A_i = A_i \cup A_i^{c}$, such that
   $N \leq \sum_{i=1}^{r} M_i$; n is the total number of sKGs; and k[i] is
   the number of sKGs belonging to A[i].
   To detect the significantly enriched GO-terms or KEGG-pathways with
   sKGs, the Database for Annotation, Visualization
   and Integrated Discovery (DAVID)^[148]72 was used to calculate the
   p-value by the Fisher exact test statistic based on the hypergeometric
   distribution^[149]73.
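
   The enrichment p-value implied by Table 2 can be sketched as the upper tail of a hypergeometric distribution, which is the one-sided Fisher exact test on that 2 × 2 table. This is a plain textbook computation with hypothetical counts; DAVID reports a modified version of the test (the EASE score), so its values may differ from the ones produced here.

```python
from math import comb


def enrichment_p_value(k, M, n, N):
    """P(X >= k) when n genes are drawn from N, of which M are annotated.

    k: sKGs in the term, M: annotated genes in the term,
    n: total sKGs, N: total annotated genes (notation of Table 2).
    """
    upper = min(M, n)
    total = comb(N, n)
    tail = sum(comb(M, x) * comb(N - M, n - x) for x in range(k, upper + 1))
    return tail / total


# Hypothetical counts: 20 annotated genes, 5 in the term, 4 sKGs, 2 of them in the term.
p = enrichment_p_value(k=2, M=5, n=4, N=20)
```

   For these toy counts the tail sums 1050 + 150 + 5 favourable draws out of 4845, so p is about 0.249, which would not be called significant.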

DNA methylation analysis

   The development of many diseases, including cancers, obesity and T2D, is
   associated with aberrant DNA methylation. DNA methylation analysis is
   used to gain relevant knowledge about gene regulation and detect
   potential biomarkers. In this study, MethSurv^[150]74 and
   UALCAN^[151]75 were employed to investigate the DNA methylation status
   of sKGs. DNA methylation level was expressed as β-values (which range
   from 0 to 1). The β-values are determined using the equation
   β = M / (M + U + 100), where M and U stand for the fully methylated and
   totally unmethylated intensities, respectively.
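
   A short sketch of the β-value formula follows. The intensities are invented for illustration; the function name and the `offset` parameter are hypothetical, with the offset set to the constant 100 from the formula above.

```python
def beta_value(methylated, unmethylated, offset=100.0):
    """β = M / (M + U + offset) for non-negative probe intensities."""
    if methylated < 0 or unmethylated < 0:
        raise ValueError("intensities must be non-negative")
    return methylated / (methylated + unmethylated + offset)


# Hypothetical intensities for three CpG probes.
mostly_methylated = beta_value(900.0, 0.0)
unmethylated_site = beta_value(0.0, 500.0)
half_methylated = beta_value(450.0, 450.0)
```

   Because of the offset, β stays strictly below 1 even for a fully methylated probe, and the estimate is pulled toward 0 when both intensities are low.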

Exploring sKGs-guided repurposable common drug molecules for both T2D and
ccRCC

   There are two in-silico ways (de-novo and repurposing) of exploring
   drug molecules for diseases. The de-novo approach is time-consuming,
   costly and laborious compared to the drug repurposing (DR) approach,
   since the DR approach explores, for a disease of interest, existing
   drugs that are already approved for other diseases^[152]76. However, in
   both approaches, molecular docking analysis with synthetic
   molecules^[153]30,[154]77 as well as phytocompounds^[155]78–[156]80
   is widely used to explore potential ligands/agents. In order
   to explore sKGs-guided repurposable common drug molecules for T2D and
   ccRCC, we collected 148 candidate molecules from published articles
   associated with T2D and ccRCC, and online databases as given in Table
   [157]S1. The Protein Data Bank (PDB)^[158]81, SWISS-MODEL^[159]82 and
   AlphaFold databases were utilized to obtain the three-dimensional
   structures of all sKGs-mediated receptor proteins. Using Swiss
   PDB view^[160]83 and AutoDock Vina^[161]84, receptor proteins were
   pre-processed by adding charges and minimizing energy, respectively.
   The 3D structures of all 148 candidate drug compounds were downloaded
   from the PubChem database^[162]85 and prepared for molecular docking
   simulation by using AutoDock tools 1.5.7 to set the ligands'
   rotatable/non-rotatable bonds and torsion tree. Then AutoDock
   Vina^[163]84 was used to compute the binding affinities between the
   drugs and the target proteins. The docked complexes were examined using
   PLIP^[164]86, PyMol^[165]87, and Discovery Studio Visualizer (BIOVIA
   2021) software^[166]88 to determine the types, distances, and surface
   complexes of non-covalent bonds. Let B[ij] denote the binding
   affinity score (BAS) between the i^th receptor (i = 1, 2, …, p) and the
   j^th ligand/agent (j = 1, 2, …, q). Then receptors were arranged in
   decreasing order of the row averages
   $$\frac{1}{q}\sum_{j=1}^{q} B_{ij}, \quad i = 1, 2, \ldots, p$$
   and ligands/agents in decreasing order of the column averages
   $$\frac{1}{p}\sum_{i=1}^{p} B_{ij}, \quad j = 1, 2, \ldots, q$$
   to select the few top-ranked agents/ligands as the potential candidate
   drug molecules.
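
   The ranking step can be sketched as follows for a small score matrix. The scores, receptor labels and ligand labels are hypothetical. The sketch also assumes that AutoDock Vina scores are negative (kcal/mol) and that "decreasing order" means decreasing binding strength, so the most negative average is placed first; the text does not state this explicitly.

```python
def rank_by_mean_affinity(scores, receptors, ligands):
    """Order receptors by row mean and ligands by column mean of B[i][j].

    scores[i][j] is the binding affinity of receptor i with ligand j.
    The most negative mean (assumed strongest binding) comes first.
    """
    p, q = len(receptors), len(ligands)
    row_mean = {receptors[i]: sum(scores[i]) / q for i in range(p)}
    col_mean = {
        ligands[j]: sum(scores[i][j] for i in range(p)) / p for j in range(q)
    }
    ranked_receptors = sorted(receptors, key=row_mean.get)
    ranked_ligands = sorted(ligands, key=col_mean.get)
    return ranked_receptors, ranked_ligands


# Hypothetical 2 x 3 matrix of docking scores (kcal/mol).
receptors = ["R1", "R2"]
ligands = ["L1", "L2", "L3"]
scores = [
    [-9.0, -6.0, -7.5],
    [-8.0, -5.0, -7.0],
]
ranked_receptors, ranked_ligands = rank_by_mean_affinity(scores, receptors, ligands)
top_ligands = ranked_ligands[:2]  # keep the few top-ranked candidates
```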

In-silico validation of candidate drug molecules by ADME/T analysis

   The drug-likeness and ADMET (absorption, distribution,
   metabolism, excretion, and toxicity) properties of the three
   top-ranked drug compounds were determined in order to learn more about
   their structural characteristics and chemical descriptors. We used the SCFBio
   ([167]http://www.scfbio-iitd.res.in/software/drugdesign/lipinski.jsp)
   web tool to evaluate whether they satisfy Lipinski's rule of
   drug-likeness (including molecular weight, number of hydrogen-bond
   donors and acceptors, rotatable bonds, octanol/water partition
   coefficient or LogP value, etc.)^[168]89. The ADMET properties were
   then predicted by using the online tools SwissADME^[169]90 and
   pkCSM^[170]91. The ADME/T calculations of medicinal compounds were
   performed using their optimal structures in SMILES format.
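
   For orientation, the classic Lipinski rule of five can be checked with a few comparisons, as sketched below. The thresholds are the textbook ones (molecular weight up to 500 Da, at most 5 hydrogen-bond donors, at most 10 acceptors, LogP up to 5) with one violation tolerated; the SCFBio tool may apply additional or different criteria. The descriptor values are hypothetical and do not describe any compound from this study.

```python
def lipinski_violations(mol_weight, h_donors, h_acceptors, log_p):
    """Count violations of the classic rule-of-five thresholds."""
    checks = [
        mol_weight > 500,    # molecular weight in Da
        h_donors > 5,        # hydrogen-bond donors
        h_acceptors > 10,    # hydrogen-bond acceptors
        log_p > 5,           # octanol/water partition coefficient
    ]
    return sum(checks)


def passes_rule_of_five(mol_weight, h_donors, h_acceptors, log_p, max_violations=1):
    """True if the compound has no more than max_violations violations."""
    return lipinski_violations(mol_weight, h_donors, h_acceptors, log_p) <= max_violations


# Two hypothetical compounds described by made-up descriptors.
small_ok = passes_rule_of_five(450.0, 2, 6, 3.2)
large_bad = passes_rule_of_five(780.0, 6, 14, 1.3)
```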

Results

Identification of Differentially Expressed Genes (DEGs)

   First, we identified DEGs for both T2D and ccRCC patients by using
   the limma R-package. The cut-off of adjusted P.value < 0.05 and
   |Log[2]FC| > 1 was used to select the DEGs as mentioned in section
   "[171]Identification of Differentially Expressed Genes (DEGs)". For
   ccRCC, we detected 15,348, 14,472, 1820 and 1576 downregulated DEGs,
   and 8150, 8166, 563 and 759 upregulated DEGs, for the NCBI datasets
   with accession ID [172]GSE66270, [173]GSE66272, [174]GSE76351, and
   [175]GSE66271, respectively. Then, 738 upregulated and 47 downregulated
   DEGs (Table [176]S2) were detected as common DEGs (cDEGs) for ccRCC.
   From the NCBI datasets with accession ID [177]GSE25724, [178]GSE29221,
   [179]GSE29226, and [180]GSE29231, we identified 2651, 459, 839, and
   2854 upregulated DEGs, and 3032, 1875, 2173 and 1569 downregulated DEGs
   respectively, for T2D patients. We found 252 downregulated and 498
   upregulated cDEGs for T2D (Table [181]S3).

Identification of shared DEGs (sDEGs) between T2D and ccRCC

   In the previous section, we found 738 upregulated and 47 downregulated
   DEGs for ccRCC based on four transcriptomics datasets. Similarly, we
   found 252 downregulated and 498 upregulated DEGs for T2D based on
   another four transcriptomics datasets. We then detected 194 upregulated
   shared DEGs (sDEGs) and 65 downregulated sDEGs for both T2D and ccRCC
   (Tables [182]S4 and [183]S5). Thus, we considered 259 sDEGs in total for
   both T2D and ccRCC.

Local association between T2D and ccRCC through sDEGs

   To find the link between T2D and ccRCC, we computed the local correlation
   coefficient between T2D and ccRCC based on the aLog[2]FC values of
   sDEGs by using Eq. [184]2. The correlation coefficient was
   0.82, which indicates that T2D and ccRCC are locally
   associated with each other through the expressions of sDEGs.

Identification of shared Key Genes (sKGs)

   The STRING database was used to build the PPI network of sDEGs, which
   has 259 nodes and 773 edges (Fig. [185]2). By using seven topological
   measures (Betweenness, BottleNeck, Closeness, Degree, MNC, Radiality
   and Stress) in the PPI network, we chose the top 10 genes as sKGs (VIM,
   CDC42, SCARB1, CXCL8, FN1, IL1B, JUN, TLR2, TLR4 and GOT2) (Table [186]S6).

Figure 2.

   Protein–protein interaction (PPI) network of sDEGs to identify sKGs,
   where the chartreuse color nodes indicated the sKGs.

In-silico validation of sKGs using independent datasets and databases

   We investigated the differential expression patterns of sKGs between
   ccRCC and control samples through Box-plot analysis based on the
   independent gene expression profiles from TCGA and GTEx databases that
   contained 523 ccRCC and 100 control samples. From Figure S1A, we
   observed that 3 sKGs (CDC42, GOT2, CXCL8) are downregulated and the
   remaining 7 sKGs (TLR4, IL1B, TLR2, FN1, JUN, VIM, SCARB1) are
   upregulated, which supported the proposed results. We also investigated
   the differential expression patterns of sKGs between T2D and control
   samples through Box-plot analysis based on the independent gene
   expression profiles from the NCBI database with accession ID
   [188]GSE15932 and [189]GSE20966, where the dataset with accession ID
   [190]GSE15932 contains 8 pancreatic cancer, 8 T2D and 8 control
   samples. In our analysis, we considered only T2D and control samples.
   Figure S1B shows that 3 sKGs (CDC42, GOT2, and CXCL8) are downregulated
   in T2D, while the remaining 7 sKGs (TLR4, IL1B, TLR2, FN1, JUN, VIM, and
   SCARB1) are upregulated, which also supports the proposed results.

The regulatory network analysis of sKGs

   The five top-ranked TFs (FOXL1, FOXC1, NR2F1, YY1
   and GATA2) and six top-ranked miRNAs (hsa-mir-93-5p, hsa-mir-203a-3p,
   hsa-mir-204-5p, hsa-mir-335-5p, hsa-mir-26b-5p, and hsa-mir-1-3p) were
   identified as the key transcriptional and post-transcriptional
   regulators of sKGs, respectively, by using the TFs-sKGs-miRNAs
   interaction network analysis (see Fig. [191]3).

Figure 3.

   (A) The sKGs-TFs interaction network based on JASPAR database (B) The
   miRNA-sKGs interaction network based on TarBase database.

sDEGs-set enrichment analysis with GO-terms and KEGG pathway

   We carried out GO and KEGG pathway enrichment analysis for the 10 sKGs
   to look into the shared pathogenetic mechanisms between T2D and ccRCC.
   The top five MFs, BPs, CCs, and KEGG pathways are listed in Table
   [194]3. These GO terms and KEGG pathways are significantly enriched
   with sDEGs, involve the sKGs, and are linked to the pathogenetic
   mechanisms connecting T2D and ccRCC.
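
   GO and KEGG over-representation of this kind is commonly assessed
   with a one-sided hypergeometric test. The sketch below illustrates
   that calculation only; it is not necessarily the exact procedure of
   the enrichment tool used in this study, and the term size and
   background size in the usage line are hypothetical placeholders.

```python
from math import comb


def enrichment_p_value(overlap, term_size, query_size, background_size):
    """One-sided hypergeometric tail probability P(X >= overlap).

    overlap         -- query genes annotated to the term
    term_size       -- background genes annotated to the term
    query_size      -- genes in the query list (e.g. the sDEGs)
    background_size -- all genes in the background (assumed value)
    """
    max_overlap = min(term_size, query_size)
    total = comb(background_size, query_size)
    tail = sum(
        comb(term_size, k) * comb(background_size - term_size, query_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


# Usage with placeholder numbers: 45 of 259 query genes fall in a term.
# term_size=400 and background_size=20000 are assumptions for illustration.
p = enrichment_p_value(overlap=45, term_size=400, query_size=259,
                       background_size=20000)
print(f"enrichment p-value: {p:.3g}")
```

   In practice the resulting p-values would also be corrected for
   multiple testing across all terms examined.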

Table 3.

   Significantly enriched GO-terms and KEGG pathways that are associated
   with T2D and ccRCC.
   GO ID | GO term | sDEGs (count) | P-value | Associated sKGs
   Biological process (BP)
   GO:0006915 | apoptotic process | 22 | 3.07E−04 | IL1B, TLR2
   GO:0007165 | signal transduction | 28 | 1.95E−06 | CXCL8, IL1B, TLR2
   GO:0006954 | inflammatory response | 45 | 1.25E−21 | CXCL8, TLR2, TLR4
   GO:0010628 | positive regulation of gene expression | 19 | 7.46E−04 | FOXL1, FOXC1, CXCL8, FN1, IL1B, TLR2, TLR4
   GO:0034976 | response to endoplasmic reticulum stress | 8 | 6.08E−04 | JUN, CXCL8
   Molecular function (MF)
   GO:0005178 | integrin binding | 15 | 2.26E−06 | FN1, IL1B
   GO:0042802 | identical protein binding | 44 | 0.001422588 | JUN, FN1, VIM, CDC42, TLR2, TLR4
   GO:0008201 | heparin binding | 10 | 0.001969 | CXCL8, FN1
   GO:0001875 | lipopolysaccharide receptor activity | 3 | 0.003454 | SCARB1, TLR4, TLR2
   GO:0005515 | protein binding | 248 |  | JUN, FN1, TLR2, TLR4, IL1B, SCARB1, CXCL8, CDC42, VIM
   Cellular component (CC)
   GO:0070062 | extracellular exosome | 64 | 3.45E−07 | SCARB1, VIM, FN1, GOT2
   GO:0005829 | cytosol | 114 | 5.04E−05 | CDC42, IL1B, VIM, NR2F1
   GO:0009986 | cell surface | 19 | 0.007489 | SCARB1, TLR2, TLR4
   GO:0016020 | membrane | 113 | 1.31E−04 | TLR2, TLR4, FN1
   GO:0005604 | basement membrane | 9 | 5.59E−05 | FN1

   Hsa ID | KEGG term | sDEGs (count) | P-value | Associated sKGs
   KEGG pathway
   hsa05169 | Epstein-Barr virus | 14 | 1.65E−04 | TLR2, VIM, JUN
   hsa04621 | NOD-like receptor signaling pathway | 13 | 2.87E−04 | JUN, CXCL8, IL1B, TLR4
   hsa05323 | Rheumatoid arthritis | 9 | 4.61E−04 | JUN, CXCL8, IL1B, TLR2, TLR4
   hsa05417 | Lipid and atherosclerosis | 13 | 0.00105 | JUN, CXCL8, CDC42, IL1B, TLR2
   hsa05165 | Human papillomavirus infection | 18 | 2.60E−04 | FN1, CDC42

Disease enrichment analysis with sKGs

   We performed disease enrichment analysis of sKGs using the Enrichr
   web tool with the DisGeNET database to investigate the association of
   sKGs with different diseases. This analysis detected the top-ranked 10
   diseases significantly associated with sKGs, including Diabetic
   Nephropathy, Kidney Failure and Kidney Disease (see Fig. [196]4).

Figure 4.


   Results of disease enrichment analysis with sKGs, where red box
   indicates significant association (p-value < 0.05).

DNA methylation analysis of sKGs in ccRCC

   DNA methylation is an epigenetic mechanism that regulates gene
   expression by recruiting proteins involved in gene repression or by
   inhibiting the binding of transcription factor(s) to DNA^[198]75. DNA
   hypermethylation, which primarily occurs at the CpG islands within a
   gene's promoter region, facilitates the silencing of tumor suppressor
   genes. On the other hand, oncogenes are upregulated when DNA
   hypomethylation occurs^[199]92. Therefore, we examined the DNA
   methylation status at CpG sites of the sKGs (CDC42, SCARB1, GOT2,
   CXCL8, FN1, IL1B, JUN, TLR2, TLR4, and VIM) using MethSurv. We observed
   that all ten sKGs had significant CpG sites (p-value ≤ 0.001) (Table
   [200]S7). Additionally, UALCAN was used to visualize the promoter
   methylation status of the 10 sKGs in ccRCC. The box-whisker plots
   showed that seven sKGs (SCARB1, FN1, IL1B, JUN, TLR2, TLR4, and VIM)
   were hypomethylated according to their β-values (ranging from 0,
   completely unmethylated, to 1, fully methylated), which supports the
   upregulation of these seven sKGs in ccRCC.
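
   For readers less familiar with β-values, the following sketch shows
   how a β-value is conventionally derived from methylated and
   unmethylated probe intensities on methylation arrays, and how a
   tumor-versus-normal comparison might be read as hypo- or
   hypermethylation. The intensities and β-values are toy numbers, not
   data from this study, and the offset of 100 is the conventional
   default rather than a setting confirmed by the source.

```python
from statistics import mean


def beta_value(methylated, unmethylated, offset=100.0):
    """Beta = M / (M + U + offset); ranges from 0 (unmethylated) to ~1."""
    return methylated / (methylated + unmethylated + offset)


def methylation_direction(tumor_betas, normal_betas):
    """Label a promoter by the sign of the mean beta difference.

    This is a descriptive comparison only; no significance test is done.
    """
    diff = mean(tumor_betas) - mean(normal_betas)
    if diff < 0:
        return "hypomethylated"
    if diff > 0:
        return "hypermethylated"
    return "unchanged"


# Toy probe intensities (hypothetical).
print(beta_value(methylated=900, unmethylated=0))
# Toy beta-values for one promoter in tumor and normal samples.
print(methylation_direction([0.20, 0.30], [0.60, 0.70]))
```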

Drug repurposing by molecular docking

   To explore candidate ligands (drug molecules) for the treatment of
   T2D and ccRCC, we considered our proposed 10 sKGs and their 5
   regulatory TF proteins as the receptors. We obtained the 3D structures
   of these receptors from two distinct sources. The structures of 10
   receptors (CDC42, SCARB1, GOT2, CXCL8, FN1, IL1B, JUN, TLR2, TLR4, and
   VIM) were retrieved from the Protein Data Bank (PDB) using the
   following PDB codes: 1a4r, 5ktf, 5ax8, 3il8, 1e88, 2KH2, 1jun, 2z80,
   2z62 and 1gk7. The remaining five targets (FOXL1, FOXC1, NR2F1, YY1,
   and GATA2) were collected from the "AlphaFold Protein Structure
   Database". We computed the binding affinity scores (BAS) between the
   proposed receptors and the candidate drug molecules (Table [201]S1) by
   molecular docking analysis. To select the top-ranked therapeutic
   candidates, drug molecules were ordered by their average BAS across
   the receptors (Table [202]S8). Similarly, receptors were ordered by
   their average BAS across the drug molecules. Figure [203]5 displays
   the top-ordered 30 drug molecules against the ordered receptors. We
   observed that 3 molecules (Digoxin, Imatinib and Dovitinib) produce
   average BAS < -7.7 kcal/mol, whereas the other molecules have average
   BAS > -7.7 kcal/mol. Therefore, we considered Digoxin, Imatinib and
   Dovitinib as the top-ranked drug molecules to inhibit the proposed
   sKGs. The top two molecules, Digoxin and Imatinib, bind strongly
   (BAS < -7.0 kcal/mol) to all of the receptor proteins. The third
   top-ranked molecule, Dovitinib, also binds strongly to all of the
   receptor proteins except JUN. Therefore, the proposed three drug
   molecules might be potential inhibitors of the T2D- and ccRCC-causing
   genes.
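
   The ordering step described above (drugs by average BAS across
   receptors, receptors by average BAS across drugs, then a cut-off on
   the drug averages) can be sketched as follows. The receptor and drug
   names and all scores are made-up placeholders, not docking results
   from this study; only the -7.7 kcal/mol cut-off comes from the text.

```python
from statistics import mean


def rank_by_average_bas(scores):
    """Order drugs and receptors by mean binding affinity score.

    scores -- dict: receptor -> dict: drug -> BAS in kcal/mol.
    More negative BAS means stronger predicted binding, so both
    orderings are ascending (strongest first).
    """
    receptors = list(scores)
    drugs = list(next(iter(scores.values())))
    drug_mean = {d: mean(scores[r][d] for r in receptors) for d in drugs}
    receptor_mean = {r: mean(scores[r][d] for d in drugs) for r in receptors}
    ordered_drugs = sorted(drugs, key=drug_mean.get)
    ordered_receptors = sorted(receptors, key=receptor_mean.get)
    return ordered_drugs, ordered_receptors, drug_mean


# Hypothetical toy score matrix (placeholder names and values).
toy_scores = {
    "receptor_1": {"drug_A": -9.0, "drug_B": -7.0, "drug_C": -6.0},
    "receptor_2": {"drug_A": -8.0, "drug_B": -8.0, "drug_C": -5.0},
}

ordered_drugs, ordered_receptors, drug_mean = rank_by_average_bas(toy_scores)
top_ranked = [d for d in ordered_drugs if drug_mean[d] < -7.7]
print(ordered_drugs, ordered_receptors, top_ranked)
```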

Figure 5.


   Molecular docking scores, where red color indicates strong drug-target
   binding. Image of score matrix, where X-axis indicates top-ordered 30
   drug agents (out of 148) and Y-axis indicates ordered proposed
   receptors.

In-silico drug validation

   The ADME/T properties of a drug molecule were used to evaluate its
   absorption, distribution, metabolism, excretion, and toxicity. The
   drug-likeness properties of a drug molecule describe its chemical
   properties through physicochemical descriptors (Table [206]4).
   According to Lipinski's rule, two drugs (Imatinib and Dovitinib)
   satisfy all rule-of-five (ROF) criteria, whereas the remaining one
   (Digoxin) violates three criteria (MW, HBA, and HBD). The
   lipophilicity (LogP value) of these 3 drugs lies within the standard
   range (1 ≤ LogP ≤ 5)^[207]89 of Lipinski’s rule, so they are
   classified as lipophilic compounds. Thus, our suggested top-ranked 3
   drug molecules fulfill almost all the drug-likeness criteria (Table
   [208]4A). The ADME and toxicity of the proposed compounds were
   examined through various parameters to evaluate their effectiveness
   and safety. The compounds are predicted to have sufficient absorption
   in the gastrointestinal tract, making them promising oral drug
   candidates. A compound with a Human Intestinal Absorption (HIA) score
   ≥ 30% is considered to be highly absorbed in the human
   intestine^[209]93,[210]94. All 3 proposed compounds have HIA scores
   ≥ 50%, which indicates good absorption by the human body. In addition,
   the top-ranked 2 drugs (Digoxin and Imatinib) are predicted to be
   P-glycoprotein inhibitors (P-gpI), whereas Dovitinib is not. The
   ability of a compound to cross the blood–brain barrier (BBB) is
   determined by the BBB-permeability index. Compounds with LogBB ≥ 0.3
   can cross the BBB easily, while those with LogBB < -1 are considered
   to be poorly distributed across the BBB. None of the three compounds
   reaches LogBB ≥ 0.3 (Digoxin −1.39, Imatinib −1.37, Dovitinib −0.71),
   so all proposed drugs are predicted to cross the BBB poorly (Table
   [211]4B). Likewise, according to their LogPS (CNS) values, they are
   considered to penetrate the central nervous system only partly. Human
   cytochrome P450 (CYP) enzymes are membrane-bound hemoproteins that
   play an essential role in homeostasis, drug detoxification, and
   cellular metabolism. About 50% of the elimination of common clinical
   medications in humans and almost 80% of oxidative metabolism are
   attributed to one or more CYPs from CYP classes 1–3^[212]95. All
   drugs except Digoxin are predicted to inhibit CYP3A4. The highest oral
   lethal dose (LD[50]) was 3.7 mol/kg for digoxin, whereas lower values
   of 2.9 and 2.4 mol/kg were found for imatinib and dovitinib,
   respectively. The greater the LD[50] value, the lower the toxicity of
   a drug molecule, while a smaller value indicates higher toxicity.
   Thus, Table [211]4B shows that the top-ranked drugs ordered by
   decreasing LD[50] are digoxin, imatinib, and dovitinib. Toxicity
   analyses (AMES, minnow toxicity (LC[50]), and lethal dose LD[50]) of
   our suggested compounds indicated that they were largely inactive for
   the toxicity prediction parameters used, and consequently they were
   predicted to be non-toxic.
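
   The rule-of-five assessment above can be reproduced with a short
   check. The sketch below uses the descriptor values reported in Table
   4A and the commonly stated Lipinski limits (MW ≤ 500, LogP ≤ 5,
   HBA ≤ 10, HBD ≤ 5); the function and variable names are illustrative
   and do not come from any tool used in this study.

```python
# Commonly stated upper limits of Lipinski's rule of five.
LIPINSKI_LIMITS = {"MW": 500, "LogP": 5, "HBA": 10, "HBD": 5}

# Descriptor values as reported in Table 4A.
COMPOUNDS = {
    "Digoxin": {"MW": 780, "LogP": 4.84, "HBA": 14, "HBD": 6},
    "Imatinib": {"MW": 493, "LogP": 4.04, "HBA": 6, "HBD": 2},
    "Dovitinib": {"MW": 393.43, "LogP": 2.26, "HBA": 4, "HBD": 3},
}


def lipinski_violations(properties):
    """Return the names of the rule-of-five criteria that are exceeded."""
    return [
        name
        for name, limit in LIPINSKI_LIMITS.items()
        if properties[name] > limit
    ]


for compound, properties in COMPOUNDS.items():
    print(compound, lipinski_violations(properties))
```

   With these inputs Digoxin exceeds the MW, HBA and HBD limits, while
   Imatinib and Dovitinib exceed none, in agreement with the text.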

Table 4.

   Drug likeness and ADME/T analysis results. (A) Drug likeness profile of
   candidate drug molecules. (B) ADME and Toxicity (ADME/T) profile of
   top-ranked 3 drug molecules.
   A. Drug likeness profile of candidate drug molecules
   Compound | Molecular weight | Log P | H-bond acceptor (HBA) | H-bond donor (HBD) | Polar surface area (Å^2) | No. of rotatable bonds
   Digoxin | 780 | 4.84 | 14 | 6 | 203.06 | 7
   Imatinib | 493 | 4.04 | 6 | 2 | 86.28 | 8
   Dovitinib | 393.43 | 2.26 | 4 | 3 | 94.04 | 2

   B. ADME and Toxicity (ADME/T) profile of top-ranked 3 drug molecules
   Column groups: Absorption (Caco2 permeability, HIA, P-gpI);
   Distribution (BBB, CNS); Metabolism (CYP3A4 inhibitor); Excretion
   (TC); Toxicity (AMES, LC[50], LD[50])
   Compound | Caco2 permeability | HIA (%) | P-gpI | BBB (LogBB) | CNS permeability (LogPS) | CYP3A4 inhibitor | TC | AMES | LC[50] (log mM) | LD[50] (mol/kg)
   Digoxin | 0.59 | 68.58 | Yes | −1.39 | −3.81 | No | 3.67 | No | 4.35 | 3.7
   Imatinib | 1.09 | 93.85 | Yes | −1.37 | −2.51 | Yes | 0.72 | No | 2.08 | 2.9
   Dovitinib | 0.47 | 83.63 | No | −0.71 | −2.27 | Yes | 0.76 | Yes | 3.04 | 2.4

Discussion

   Type 2 diabetes (T2D) is considered one of the risk factors for
   clear-cell renal cell carcinoma (ccRCC)^[214]96,[215]97. Therefore,
   identification of the shared key genes (sKGs) that cause both diseases
   is essential to investigate their common pathogenetic mechanisms and
   candidate drugs for better diagnosis and therapies during their
   co-occurrence. However, no study in the literature has explored sKGs,
   their pathogenetic mechanisms, and candidate drug molecules as a
   common treatment for both T2D and ccRCC, although multiple
   disease-specific drugs may cause adverse side effects or toxicity in
   patients due to drug-drug interactions^[216]23–[217]26. To explore
   common drugs as an alternative to multiple disease-specific drugs,
   this study investigated the genetic relationship between T2D and ccRCC
   by detecting shared DEGs (sDEGs) that can separate both T2D and ccRCC
   patients from the control samples. We identified 259 sDEGs, of which
   194 were upregulated and 65 were downregulated. We computed the local
   correlation coefficient between T2D and ccRCC based on the aLog[2]FC
   values of sDEGs by using Eq. [218]2 and found r[XY] = 0.82, which
   indicates that both diseases are locally associated with each other
   through the expressions of sDEGs. Then we detected the top-ranked 10
   sDEGs (CDC42, SCARB1, GOT2, CXCL8, FN1, IL1B, JUN, TLR2, TLR4, and
   VIM) as sKGs (Fig. [219]2) for exploring their pathogenetic mechanisms
   and candidate common drugs for both diseases. The summary of this
   study is given in Figure S2. The association of these 10 sKGs with
   both T2D and ccRCC is also supported by previous individual studies,
   including CDC42^[220]33,[221]98,[222]99, SCARB1^[223]100–[224]102,
   VIM^[225]103–[226]105, IL1B^[227]106–[228]110, GOT2^[229]111,[230]112,
   JUN^[231]113,[232]114, TLR4^[233]115–[234]118, FN1^[235]119–[236]121,
   TLR2^[237]122–[238]124 and CXCL8^[239]125–[240]129, as displayed in
   Fig. [241]6A. One study reported that CDC42 stimulates insulin
   secretion, which is connected to diabetes-related diseases such as
   diabetic nephropathy (DN), ccRCC and various cancers^[242]98.
   Dysregulation of CDC42 can impair healthy insulin secretion and
   promote diabetes. Insulin resistance is widely regarded as the main
   factor in T2D^[243]99. As a result, CDC42 plays a significant role in
   the development of T2D, and treating T2D and associated disorders may
   benefit from CDC42-targeted therapy^[244]33. The polyligand membrane
   receptor protein SCARB1 is involved in the disturbance of glucose and
   lipid metabolism associated with T2D. A higher risk of T2D is linked
   to genetic variations of SCARB1^[245]100,[246]101. Another study
   reported that SCARB1 serves as both a therapeutic target and a
   diagnostic biomarker for ccRCC^[247]102. Vimentin (VIM) has been
   identified as a key mediator of obesity-linked
   T2D^[248]103,[249]104. On the other hand, VIM is considerably
   overexpressed in ccRCC cells^[250]105. GOT2 might be a useful
   prognostic indicator and therapeutic target for people with
   ccRCC^[251]111. Its expression in T2D and ccRCC is notably
   low^[252]112. The glycoprotein encoded by the fibronectin 1 (FN1) gene
   is involved in host defense, wound healing, blood coagulation,
   embryogenesis, metastasis and other activities associated with cell
   adhesion. The up-regulation of FN1 is directly associated with the
   development of renal cell carcinoma (RCC)^[253]121 as well as
   DN^[254]119. CXCL8 is upregulated, with elevated serum levels, in
   kidney cancer (KC) patients^[255]128 and shows early regulation in DN
   patients^[256]129. Up-regulation of the IL1B gene influences protumor
   genes as well as RCC tumors^[257]108. Two other studies found
   up-regulation of the IL1B gene in T2D patients^[258]109,[259]110.
   Expression of the TLR2 gene is considerably higher in ccRCC tumor
   tissues^[260]124. According to a number of studies in different
   experimental models of kidney disease, activation and overexpression
   of the TLR2 and TLR4 genes are closely correlated with the severity of
   renal damage^[261]130. Creely et al. found higher TLR2 expression in
   the adipose tissues of T2D patients^[262]122. In KC, renal
   inflammation and chronic fibrosis are significantly influenced by
   TLR4^[263]118. A long-term inflammatory condition causes insulin
   resistance and the development of T2D^[264]115.

Figure 6.


   Verification of the proposed drug-targets (shared KGs) and drug-agents
   for T2D and ccRCC by the literature review. (A) Verification of the
   proposed drug-targets, (B) Verification of the proposed drug-agents.

   To explore transcriptional and post-transcriptional regulators of
   sKGs among TFs and miRNAs, respectively, we performed sKGs-TFs and
   sKGs-miRNAs co-regulatory network analysis (Fig. [267]3), which
   detected five TF proteins (YY1, FOXL1, FOXC1, NR2F1, and GATA2) and
   six miRNAs (hsa-mir-93-5p, hsa-mir-203a-3p, hsa-mir-204-5p,
   hsa-mir-335-5p, hsa-mir-26b-5p, and hsa-mir-1-3p) as the key
   regulators of sKGs. The transcriptional coregulator Yin Yang 1 (YY1)
   stimulates the transcription of several long noncoding RNAs and is
   highly expressed in ccRCC^[268]131. YY1 has also been studied in
   T2D^[269]132, including its role in different symptoms and its
   interaction with signaling pathways that control the disease. Numerous
   disorders, including ccRCC^[270]133, have been found to be regulated
   by the transcription factor YY1. FOXL1 and FOXC1 are members of the
   same family and perform similar functions. Thus, these genes might
   play significant roles in ccRCC pathogenesis^[271]134. Another article
   shows that FOXL1 and FOXC1 are highly associated with T2D^[272]135.
   The development of ccRCC toward a more aggressive molecular subtype is
   influenced by GATA2 proteins^[273]136. Another study reported that
   GATA2 is an important risk factor for T2D^[274]137, dyslipidemia and
   hypertension (HTN). Hsa-mir-335-5p has been shown in a number of
   investigations to be involved in the control of RCC
   progression^[275]138. Another article shows that hsa-miR-335-5p
   appears to be associated with T2D, possibly by influencing the
   expression of several candidate genes^[276]139. The Kaplan–Meier
   Plotter datasets show that miR-93-5p is highly expressed in
   ccRCC^[277]140. On the other hand, hsa-mir-93-5p plays a significant
   role in post-transcriptional gene regulation, especially in
   T2D^[278]135. The majority of the miRNAs examined may affect numerous
   kinds of cancer, including hepatocellular carcinoma (miR-9-5p), renal
   cell carcinoma (miR-1-3p)^[279]141, and thyroid cancer (miR-1-3p).

   We investigated the shared pathogenetic mechanisms of ccRCC and T2D
   through sKGs-set enrichment analysis, the association of sKGs with DNA
   methylation, and regulatory analysis of sKGs using a variety of
   databases. Using enrichment analysis of the sKGs-set, we investigated
   critical biological processes (BP), molecular functions (MF), cellular
   components (CC), and KEGG pathways that are connected to the onset of
   T2D and ccRCC (Table [280]3). The important enriched BPs, MFs, CCs,
   and KEGG pathways included apoptotic process, inflammatory response,
   signal transduction, positive regulation of gene expression, identical
   protein binding, extracellular exosome, and lipid and atherosclerosis,
   which are related to dysfunction and the progression of T2D to ccRCC.
   Among them, apoptosis, also known as programmed cell death, is the
   principal biological mechanism by which mammals destroy DNA-damaged
   cells and preserve tissue homeostasis. The failure of apoptosis
   increases the lifespan of tumor cells and allows mutations to
   accumulate, which can enhance spreading during tumor cell development,
   promote tumor angiogenesis, and encourage cell proliferation (Fig.
   [281]1). Apoptosis is directly related to the control of T cells in
   ccRCC, and this must be considered in ccRCC immunotherapy^[282]142.
   Changes in signal transduction are always present in cancers such as
   RCC. The dysregulated signal transduction that results from changes in
   proto-oncogenes and tumor suppressor genes ultimately promotes the
   abnormal development and proliferation of cancer cells^[283]143. In
   patients with T2D, hyperglycemia (also known as "glucose toxicity")
   may play a significant role in the development of insulin resistance
   and impaired signal transduction in skeletal muscle^[284]144. Exosomes
   are crucial in the onset, detection, and management of
   kidney^[285]145, prostate, bladder and breast malignancies, as well as
   T2D^[286]146. NOD-like receptors (NLRs), which give their name to one
   of the enriched KEGG pathways, are widely known pathogen recognition
   receptors. NLRs play an important role in inflammation-induced insulin
   resistance (T2D), which leads to additional metabolic
   problems^[287]147. NLRs are divided into four subfamilies according to
   the type of N-terminal domain: NLRA, NLRB, NLRP, and NLRC (C for
   CARD), the last of which includes NOD1, NOD2, NLRC3, NLRC4, and NLRC5.
   Among these, inhibition of NOD1 and NOD2 has therapeutic potential in
   acute kidney injury (AKI)^[288]148.

   We also investigated the DNA methylation of the sKGs shared by T2D
   and ccRCC. DNA methylation is an epigenetic process that adds a methyl
   group to cytosine bases, especially at CpG sites. Hypermethylation of
   CpG islands within promoter regions of tumor suppressor genes is
   widely recognized as a key mechanism leading to gene inactivation in
   various cancers^[289]149. In our investigation, we found that all ten
   sKGs (CDC42, SCARB1, GOT2, CXCL8, FN1, IL1B, JUN, TLR2, TLR4, and VIM)
   had significant CpG sites (p-value < 0.001), and seven of them
   (SCARB1, FN1, IL1B, JUN, TLR2, TLR4, and VIM) were hypomethylated at
   various CpG locations (Table [290]S7). These results suggest that the
   hypomethylated sKGs have a substantial association with the growth and
   progression of ccRCC and with the apoptotic process^[291]150.

   To explore sKGs-guided common drug molecules for both T2D and ccRCC,
   we used molecular docking analysis and identified the top-ranked three
   molecules (Digoxin, Imatinib and Dovitinib) that showed strong binding
   affinities with the sKGs-mediated target proteins. These molecules
   were then verified for T2D and ccRCC by literature review, as
   displayed in Fig. [292]6B. To validate the drug molecules
   computationally, we conducted an ADME/T analysis and evaluated their
   drug-likeness. Two of the three molecules, imatinib and dovitinib,
   showed drug-like properties, fitting at least four of Lipinski's rule
   of five criteria. The chosen substances showed favorable ADME/T
   characteristics, with sufficient water solubility, high Human
   Intestinal Absorption (HIA) levels between 68.58% and 93.85%, and no
   carcinogenic effects. Among the top three candidate drug molecules,
   Digoxin^[293]151,[294]152 and Imatinib^[295]153,[296]154 received
   support as common candidate molecules for both T2D and ccRCC from
   individual studies on T2D and ccRCC. Both drug molecules are already
   approved by the FDA: Digoxin for the treatment of heart failure and
   atrial fibrillation, and Imatinib for dermatofibrosarcoma
   protuberans, leukemias, systemic mastocytosis,
   myelodysplastic/myeloproliferative diseases, gastrointestinal stromal
   tumors and hypereosinophilic syndrome. They can be found in DrugBank
   (DB) under accession IDs DB00390 and DB00619, respectively. According
   to one report, digoxin therapy significantly reduced cancer cell
   migration and proliferation in RCC cells, and digoxin was suggested as
   a distinct therapeutic option for ccRCC patients^[297]151. Digoxin is
   a cardiac glycoside used to treat heart failure and cardiac
   arrhythmias, both of which are common complications of T2D. Indeed, up
   to 18% of diabetes patients are believed to receive
   digoxin^[298]152. In an animal model, imatinib effectively reduced
   blood sugar levels and treated T2D^[299]153. Another experimental
   study reported that imatinib is a potent inhibitor against
   ccRCC^[300]154. The third proposed drug, dovitinib, acts as an
   antagonist of some RCC-causing genes (VEGFR1, VEGFR2, VEGFR3, FGFR1,
   FGFR2, and FGFR3) according to an experimental study^[301]155. Thus,
   we found that only Imatinib is experimentally validated in the wet lab
   for T2D and ccRCC individually, but not simultaneously. On the other
   hand, dovitinib was experimentally validated with RCC only. However,
   the top-ranked drug molecule, Digoxin, is not yet experimentally
   validated with either T2D or ccRCC, though it is approved for other
   diseases. Therefore, experimental validation is required for dovitinib
   with T2D and for Digoxin with both T2D and ccRCC. In addition, the
   proposed sKGs might be useful prognostic biomarkers in the development
   of immune therapy for ccRCC with T2D, as discussed in different
   articles for other diseases^[302]2,[303]156,[304]157.

Conclusion

   This study detected ten shared key genes (CDC42, SCARB1, GOT2, CXCL8,
   FN1, IL1B, JUN, TLR2, TLR4, and VIM) that are able to differentiate
   both T2D and ccRCC patients from the control groups. The differential
   expression patterns of sKGs were also confirmed by independent
   datasets from the NCBI, TCGA and GTEx databases. The shared key gene
   (sKGs) set enrichment analysis identified some significant shared
   biological processes, molecular functions, and pathways that are
   connected to the development of both T2D and ccRCC. The sKGs
   regulatory network analysis detected some TF proteins and miRNAs as
   the transcriptional and post-transcriptional regulators of sKGs. The
   DNA methylation analysis detected some crucial hypomethylated CpG
   sites that might stimulate ccRCC development. Finally, the sKGs-guided
   top-ranked three candidate drug agents (Digoxin, Imatinib, and
   Dovitinib) were identified through molecular docking, drug-likeness
   and ADME/T analysis. The pipeline of this study might serve as a
   guideline for exploring common pathogenetic processes and candidate
   drug molecules when planning a common treatment for other co-occurring
   diseases. The output of this study might be a useful input for wet-lab
   researchers in further investigations toward developing sKGs-guided
   effective common drugs against T2D and ccRCC.

Supplementary Information

   [305]Supplementary Information.^ (1.1MB, docx)

Acknowledgements