Abstract

Type 2 diabetes (T2D) and clear-cell renal cell carcinoma (ccRCC) are both complex diseases whose incidence rates are gradually increasing. Population-based studies show that the severity of ccRCC might be associated with T2D. However, no study has yet investigated the molecular mechanisms of their association. This study explored the shared key genes (sKGs) underlying both T2D and ccRCC from multiple transcriptomics profiles to investigate their common pathogenetic processes and associated drug molecules. We identified 259 shared differentially expressed genes (sDEGs) that can separate both T2D and ccRCC patients from control samples. Local correlation analysis based on the expression of sDEGs indicated a significant association between T2D and ccRCC. Then ten sDEGs (CDC42, SCARB1, GOT2, CXCL8, FN1, IL1B, JUN, TLR2, TLR4, and VIM) were selected as the sKGs through protein–protein interaction (PPI) network analysis. These sKGs were found to be significantly associated with different CpG sites of DNA methylation that might be the cause of ccRCC. The sKGs-set enrichment analysis with Gene Ontology (GO) terms and KEGG pathways revealed some crucial shared molecular functions, biological processes, cellular components and KEGG pathways that might be associated with the development of both T2D and ccRCC. The regulatory network analysis of sKGs identified six post-transcriptional regulators (hsa-mir-93-5p, hsa-mir-203a-3p, hsa-mir-204-5p, hsa-mir-335-5p, hsa-mir-26b-5p, and hsa-mir-1-3p) and five transcriptional regulators (YY1, FOXL1, FOXC1, NR2F1 and GATA2) of sKGs. Finally, three top-ranked sKGs-guided repurposable drug molecules (Digoxin, Imatinib, and Dovitinib) were recommended as a common treatment for both T2D and ccRCC by molecular docking and ADME/T analysis. Therefore, the results of this study may be useful for the diagnosis and therapy of ccRCC patients who also suffer from T2D.
Keywords: Clear-cell renal-cell carcinoma, Type-2 diabetes, Shared key genes, Molecular mechanisms, Drug repurposing, Bioinformatics analysis

Subject terms: Cancer, Computational biology and bioinformatics, Drug discovery, Genetics, Molecular biology, Biomarkers, Diseases, Molecular medicine, Oncology

Introduction

Cancer is considered the second leading cause of mortality globally, with around 20 million new cases and more than 10 million deaths annually. Over 50% of cancer patients ultimately die, despite advancements in diagnosis and therapy^[40]1. Advanced age is one of the most crucial factors increasing cancer-related mortality^[41]2. Clear-cell renal cell carcinoma (ccRCC) is a common cancer worldwide. There are several types of kidney cancer (KC), including renal cell carcinoma (RCC). ccRCC is a subtype of RCC that makes up about 70–80% of KC^[42]3. It is the 8th most common cancer among women and the 6th most common among men^[43]4. It had the 17th highest cancer-related mortality in 2018, with 175,098 deaths worldwide^[44]5. In 2020, the death rate of KC patients was around 42%^[45]6. ccRCC is the most common type of KC in adults, and its incidence increases with age. While it can occur at any age, the risk of developing ccRCC generally increases after the age of 40, and the highest incidence rates are seen in people aged 60 and older^[46]7. On the other hand, many older people suffer from type 2 diabetes (T2D), which typically arises from insulin resistance^[47]8. Because insulin resistance hinders the body from using glucose for energy, blood sugar levels remain consistently high^[48]9. A population-based study reported that the prevalence of diabetes among all age groups was 2.8% in 2000 and is projected to increase to 4.4% by 2030^[49]10.
Several studies have reported that T2D is associated with ccRCC^[50]11–[51]14, liver cancer^[52]15, colorectal cancer^[53]16, and breast, stomach, endometrium, pancreas, lymphoid tissue and urinary bladder cancers^[54]11. One study reported that cancer-related deaths account for approximately 13% of overall mortality in diabetic patients^[55]12. Various kidney problems^[56]13, including microalbuminuria, macroalbuminuria, or reduced renal function, develop over time in about 35% to 50% of T2D patients. RCC, especially ccRCC, is often considered a metabolic disease: mutations of its target genes involved in metabolic pathways are a clear characteristic of RCC^[57]17. T2D is likewise a metabolic disease, characterized by the deregulation of genes and of glucose and lipid metabolism^[58]18. Insulin resistance also increases the blood levels of insulin and insulin-like growth factors, as well as the hyperactivation of protein kinase B (Akt)/mTOR, which may stimulate the growth and development of tumors^[59]19,[60]20. Additionally, raised triglyceride levels, higher blood pressure in men, high body mass index (BMI), and T2D in women are distinct risk factors for ccRCC^[61]19. Numerous genes or proteins that are mutated or methylated during cancer development are overexpressed or suppressed, resulting in conformational alterations such as post-translational modifications (PTMs). These alterations change cellular signaling pathways and functions, which ultimately changes metabolic processes^[62]21. Thus, ccRCC might be associated with T2D, as displayed in Fig. [63]1, and ccRCC patients may suffer from complicated situations due to the influence of T2D. Therefore, identification of the shared key genes (sKGs), also known as biomarker genes, that cause both ccRCC and T2D is essential in order to investigate their genetic association for better diagnosis and therapies.

Figure 1.
A schematic diagram of the link between T2D and ccRCC.

When T2D and ccRCC co-occur, doctors may prescribe multiple disease-specific drugs to the patient^[66]22. However, drug-drug interactions (DDIs) during polypharmacy may cause adverse side effects or toxicity that can lead to severe conditions^[67]23–[68]26. In that case, doctors should prescribe a smaller number of common drugs in place of those multiple drugs in order to reduce toxicity. However, no study in the literature has yet suggested any common drug for the treatment of both diseases, even though aged patients are at high risk of DDIs due to the prevalence of polypharmacy and age-related changes in metabolism. Therefore, it is necessary to explore potential common drugs for ccRCC and T2D as representatives of those disease-specific multiple drugs. To explore common drugs, it is first necessary to identify ccRCC- and T2D-causing sKGs as the targets of common drugs, since disease-causing key genes/proteins are widely used as the targets of disease-specific drugs^[69]27–[70]30. Nevertheless, it is very difficult to identify top-ranked ccRCC- and T2D-causing sKGs and candidate therapeutic ligands/agents from a huge number of alternatives through wet-lab experiments alone, since wet-lab experiments are time-consuming, laborious and costly. To overcome these issues, in-silico bioinformatics and systems biology approaches play significant roles^[71]31,[72]32. For target selection, genomics/transcriptomics analyses through integrated statistics and network-based approaches are widely used^[73]31,[74]32. Some in-silico studies have explored T2D- and ccRCC-causing key genes (KGs) and their pathogenetic mechanisms individually^[75]33–[76]38.
Although some studies have investigated shared KGs (sKGs) for T2D with hepatocellular carcinoma (HCC)^[77]39,[78]40 and colorectal cancer (CRC)^[79]41,[80]42, no study in the literature has so far explored T2D- and ccRCC-causing sKGs. Therefore, this study aimed to explore both T2D- and ccRCC-causing sKGs, highlighting their pathogenetic mechanisms and candidate common drug molecules, to support a better treatment plan for ccRCC with T2D by using integrated bioinformatics and systems biology approaches.

Materials and methods

Data source and descriptions

To explore shared key genes (sKGs) between T2D and ccRCC, we considered four microarray gene expression profile datasets for each of T2D ([81]GSE25724^[82]43, [83]GSE29221^[84]44, [85]GSE29226^[86]45 and [87]GSE29231^[88]46) and ccRCC ([89]GSE66270^[90]47, [91]GSE66272^[92]48, [93]GSE76351^[94]49 and [95]GSE66271^[96]50) from the Gene Expression Omnibus (GEO) platform in the National Center for Biotechnology Information (NCBI) database. Table [97]1 provides detailed descriptions of the datasets.

Table 1. Data source and descriptions.
| GEO dataset | Country | Platform | Cases | Controls |
|---|---|---|---|---|
| [98]GSE25724 | Italy | [99]GPL96 [HG-U133A] Affymetrix Human Genome U133A Array | 6 (T2D) | 7 |
| [100]GSE29221 | India | GPL6947 Illumina HumanHT-12 V3.0 expression beadchip | 12 (T2D) | 12 |
| [101]GSE29226 | India | GPL6947 Illumina HumanHT-12 V3.0 expression beadchip | 12 (T2D) | 12 |
| [102]GSE29231 | India | GPL6947 Illumina HumanHT-12 V3.0 expression beadchip | 12 (T2D) | 12 |
| [103]GSE66270 | Germany | [104]GPL570 [HG-U133_Plus_2] Affymetrix Human Genome U133 Plus 2.0 Array | 14 (ccRCC) | 14 |
| [105]GSE66272 | Germany | [106]GPL570 [HG-U133_Plus_2] Affymetrix Human Genome U133 Plus 2.0 Array | 27 (ccRCC) | 27 |
| [107]GSE76351 | Russia | [108]GPL11532 [HuGene-1_1-st] Affymetrix Human Gene 1.1 ST Array [transcript (gene) version] | 12 (ccRCC) | 12 |
| [109]GSE66271 | Germany | [110]GPL570 [HG-U133_Plus_2] Affymetrix Human Genome U133 Plus 2.0 Array | 13 (ccRCC) | 13 |

Identification of Differentially Expressed Genes (DEGs)

To identify differentially expressed genes (DEGs) between case and control groups, we considered the limma approach, since it also performs well with small sample sizes. It produces P.values based on the moderated t-statistic^[112]51 to measure the significance of differential expression between two conditions. The moderated t-statistic is formulated by combining the classical and Bayesian estimates of the relevant parameters^[113]51,[114]52. The gth gene is then classified as a differentially expressed gene (DEG_g) by combining its adjusted P.value and its average log2 fold-change (aLog2FC) value as follows:

$$
\mathrm{DEG}_g =
\begin{cases}
\mathrm{DEG\ (Up)}, & \text{if } \mathrm{adj.P.value}_g < 0.05 \text{ and } \mathrm{aLog_2FC}_g > +1 \\
\mathrm{DEG\ (Down)}, & \text{if } \mathrm{adj.P.value}_g < 0.05 \text{ and } \mathrm{aLog_2FC}_g < -1
\end{cases}
$$

where the aLog2FC value for the gth gene is computed as

$$
\mathrm{aLog_2FC}_g =
\begin{cases}
\dfrac{1}{n_1}\sum_{i=1}^{n_1}\log_2\left(z_{gi}^{D}\right) - \dfrac{1}{n_2}\sum_{j=1}^{n_2}\log_2\left(z_{gj}^{C}\right), & \text{if } n_1 \ne n_2 \\
\dfrac{1}{n}\sum_{i=1}^{n}\log_2\left(\dfrac{z_{gi}^{D}}{z_{gi}^{C}}\right), & \text{if } n_1 = n_2 = n
\end{cases}
\tag{1}
$$

Here $z_{gi}^{D}$ and $z_{gj}^{C}$ are the expression values of the gth gene in the ith disease sample and the jth control sample, respectively.
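As a minimal sketch of the selection rule and Eq. 1 above, the following Python snippet computes aLog2FC from raw (positive, unlogged) expression values and applies the two cut-offs. The function names `alog2fc` and `classify_deg` are hypothetical, and the adjusted P.value is simply taken as an input here; in the study it comes from limma's moderated t-statistic, which this sketch does not reimplement.

```python
import math


def alog2fc(disease, control):
    """Average log2 fold-change of one gene (Eq. 1).

    `disease` and `control` hold raw, strictly positive expression values.
    The difference of mean log2 values covers both branches of Eq. 1,
    because for n1 = n2 the mean of log-ratios equals the difference of
    the mean logs.
    """
    mean_d = sum(math.log2(z) for z in disease) / len(disease)
    mean_c = sum(math.log2(z) for z in control) / len(control)
    return mean_d - mean_c


def classify_deg(adj_p, fc, p_cut=0.05, fc_cut=1.0):
    """Label a gene as 'up', 'down' or 'not DEG' using the stated cut-offs."""
    if adj_p < p_cut and fc > fc_cut:
        return "up"
    if adj_p < p_cut and fc < -fc_cut:
        return "down"
    return "not DEG"


# Illustrative (made-up) values for a single gene.
fc = alog2fc(disease=[8, 16, 32], control=[2, 4])
print(fc, classify_deg(adj_p=0.01, fc=fc))
```

Keeping the P.value as an argument separates the thresholding step from the statistical test, so the same rule can be applied to any table of adjusted P.values and fold-changes.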
We utilized the limma R-package^[115]53 to calculate the P.values and Log2FC values and to select the DEGs for both T2D and ccRCC patients.

Identification of shared DEGs (sDEGs)

First, we detected DEGs between ccRCC and control samples based on four datasets with NCBI accession IDs [116]GSE66270, [117]GSE66272, [118]GSE76351, and [119]GSE66271. Then DEGs between T2D and control samples were detected based on four datasets with accession IDs [120]GSE25724, [121]GSE29221, [122]GSE29226 and [123]GSE29231. Finally, the shared DEGs (sDEGs) that are able to separate both T2D and ccRCC samples from the control samples were selected.

Local genetic association between T2D and ccRCC through sDEGs

Although the average log2FC (aLog2FC) values for T2D and ccRCC were calculated from independent datasets by Eq. [124]1, they were calculated on the same set of sDEGs for both diseases. A shared DEG (sDEG) is called upregulated for two or more diseases if aLog2FC > 0 and downregulated if aLog2FC < 0. If we assume that the function of a gene is almost the same across all control samples, we may measure the genetic association between any two diseases X and Y based on their aLog2FC values for the sDEGs through Pearson's correlation coefficient, defined as

$$
r_{XY} = \frac{\sum_{g}\left(x_g - \bar{x}\right)\left(y_g - \bar{y}\right)}{\sqrt{\sum_{g}\left(x_g - \bar{x}\right)^2}\,\sqrt{\sum_{g}\left(y_g - \bar{y}\right)^2}}
\tag{2}
$$

where $x_g = \mathrm{aLog_2FC}_g^{X}$ and $y_g = \mathrm{aLog_2FC}_g^{Y}$ are the aLog2FC values of the gth gene for the two diseases X and Y, and $\bar{x}$ and $\bar{y}$ are the means of the $x_g$ and $y_g$ values, respectively.

Identification of shared key genes (sKGs) from sDEGs

Proteins interact with other proteins in the cell to carry out their tasks, and the information in the protein–protein interaction (PPI) network is used to select the key genes^[125]54,[126]55.
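Looking back at Eq. 2 in the local association subsection, the correlation between the two diseases can be sketched in a few lines of Python. The function name `pearson_r` and the example fold-change vectors are illustrative assumptions, not values from the study; each position in the two lists is assumed to refer to the same sDEG.

```python
import math


def pearson_r(x, y):
    """Pearson correlation (Eq. 2) between two aLog2FC vectors.

    x[g] and y[g] must be the aLog2FC values of the same gene g
    in disease X and disease Y, respectively.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (>= 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)


# Made-up aLog2FC values for four shared genes in two diseases.
x = [1.2, -1.5, 2.0, -2.2]
y = [1.0, -1.1, 1.8, -2.5]
print(round(pearson_r(x, y), 3))
```

A value close to +1 means that genes tend to be up- or downregulated in the same direction, and by a similar amount, in both diseases.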
To generate the PPI network, the distance matrix 'D' is calculated as

$$
D_{i,j} = \frac{2\,\left|N_i \cap N_j\right|}{\left|N_i\right| + \left|N_j\right|},
$$

where $N_i$ is the neighbor set of the ith protein and $N_j$ is the neighbor set of the jth protein. To identify shared key genes (sKGs), a PPI network of sDEGs was constructed using the STRING database^[127]56. To select the sKGs from the PPI network, we used different topological measures (Betweenness^[128]57, Degree^[129]58, BottleNeck^[130]59, Closeness^[131]60, MNC^[132]61, Radiality^[133]62 and Stress^[134]63) via the CytoHubba plugin in the Cytoscape software^[135]64.

In-silico validation of sKGs using independent datasets and databases

The differential expression patterns of sKGs were validated in both disorders (ccRCC and T2D) by box-plot analysis with independent datasets from the NCBI, TCGA and GTEx databases. We used the TCGA and GTEx databases in the GEPIA2^[136]65 web-tool to confirm the differential expression patterns of sKGs between ccRCC and control samples. To validate the differential expression patterns of sKGs between T2D and control samples, we used two independent datasets with accession IDs [137]GSE15932^[138]66 and [139]GSE20966^[140]67 from the NCBI database.

Regulatory network analysis of sKGs

A gene regulatory network (GRN) displays molecular regulators that interact with each other in the cell to control gene expression. Transcription factors (TFs) and microRNAs (miRNAs) are considered the transcriptional and post-transcriptional regulators of protein-coding genes, respectively. To select the top-ranked TFs as the key transcriptional regulators of sKGs, the TFs versus sKGs interaction network analysis was performed by using the JASPAR^[141]68 database with the NetworkAnalyst web-tool^[142]69.
Similarly, to identify the top-ranked miRNAs as the key post-transcriptional regulators of sKGs, the sKGs versus miRNAs interaction network analysis was performed by using the TarBase database^[143]70 with the NetworkAnalyst web-tool^[144]69.

The sKGs-set enrichment analysis with GO-terms and KEGG-pathways

The sKGs-set enrichment analyses with gene ontology (GO) terms and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways^[145]71 were performed to explore the biological processes (BPs), molecular functions (MFs), cellular components (CCs) and pathways of sKGs. To identify the GO terms (BPs, MFs, CCs) or KEGG pathways significantly enriched by the sKGs-set, a 2 × 2 contingency table was constructed (see Table [146]2).

Table 2. A 2 × 2 contingency table.

| Annotated genes | sKGs (proposed) | Not-sKGs | Marginal total |
|---|---|---|---|
| Annotated gene-set in the ith GO term/KEGG pathway ($A_i$) | $k_i$ | $M_i - k_i$ | $M_i$ |
| Complement gene-set of $A_i$ ($A_i^c$) | $n - k_i$ | $N - M_i - n + k_i$ | $N - M_i$ |
| Marginal total | $n$ | $N - n$ | $N$ (grand total) |

where $A_i$ is the set of annotated genes in the ith BP/MF/CC/KEGG pathway in the database; $M_i$ is the total number of annotated genes in $A_i$ (i = 1, 2, …, r); $N$ is the total number of annotated genes in $A = \bigcup_{i=1}^{r} A_i = A_i \cup A_i^c$, such that $N \le \sum_{i=1}^{r} M_i$; $n$ is the total number of sKGs; and $k_i$ is the number of sKGs belonging to $A_i$. To detect the GO terms or KEGG pathways significantly enriched with sKGs, the Database for Annotation, Visualization and Integrated Discovery (DAVID)^[148]72 was used to calculate the p-value by Fisher's exact test based on the hypergeometric distribution^[149]73.

DNA methylation analysis

The development of many diseases, including cancers, obesity and T2D, is associated with aberrant DNA methylation. DNA methylation analysis is used to gain knowledge about gene regulation and to detect potential biomarkers. In this study, MethSurv^[150]74 and UALCAN^[151]75 were employed to investigate the DNA methylation status of sKGs.
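Returning briefly to the enrichment test built on Table 2, the one-sided p-value can be sketched as the upper tail of the hypergeometric distribution, using the same symbols N, M_i, n and k_i. This is a plain textbook version written for illustration; the function name `enrichment_p_value` is hypothetical, and the values reported by DAVID may differ because the tool applies its own variant of the test.

```python
from math import comb


def enrichment_p_value(N, M_i, n, k_i):
    """Upper-tail hypergeometric probability P(X >= k_i).

    N   : total number of annotated genes (grand total of Table 2)
    M_i : number of genes annotated to the ith term or pathway
    n   : number of sKGs
    k_i : number of sKGs annotated to the ith term or pathway
    """
    if not (0 <= k_i <= min(n, M_i) and n <= N and M_i <= N):
        raise ValueError("inconsistent contingency-table margins")
    total = comb(N, n)
    # math.comb returns 0 when the lower index exceeds the upper one,
    # so impossible tables contribute nothing to the sum.
    tail = sum(
        comb(M_i, k) * comb(N - M_i, n - k)
        for k in range(k_i, min(n, M_i) + 1)
    )
    return tail / total


# Toy example with made-up counts: 10 annotated genes, 5 of them in the term,
# 2 key genes, both of which fall in the term.
print(enrichment_p_value(N=10, M_i=5, n=2, k_i=2))
```

A small p-value indicates that more key genes fall in the term than would be expected if the key genes were drawn at random from all annotated genes.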
DNA methylation levels were expressed as β-values, which range from 0 to 1. The β-values are determined by the equation β = M / (M + U + 100), where M and U stand for the fully methylated and completely unmethylated intensities, respectively.

Exploring sKGs-guided repurposable common drug molecules for both T2D and ccRCC

There are two in-silico approaches to exploring drug molecules for diseases: de-novo design and repurposing. The de-novo approach is time-consuming, costly and laborious compared to the drug repurposing (DR) approach, since the DR approach explores, for a disease of interest, existing drugs that are already approved for other diseases^[152]76. In both approaches, however, molecular docking analysis with synthetic molecules^[153]30,[154]77 as well as phytocompounds^[155]78–[156]80 is widely used to explore potential ligands/agents. To explore sKGs-guided repurposable common drug molecules for T2D and ccRCC, we collected 148 candidate molecules from published articles associated with T2D and ccRCC and from online databases, as given in Table [157]S1. The Protein Data Bank (PDB)^[158]81, SWISS-MODEL^[159]82 and AlphaFold databases were used to obtain the three-dimensional structures of all sKGs-mediated receptor proteins. Using Swiss PDB Viewer^[160]83 and AutoDock Vina^[161]84, the receptor proteins were pre-processed by adding charges and minimizing energy, respectively. The 3D structures of all 148 candidate drug compounds were downloaded from the PubChem database^[162]85 and prepared for molecular docking simulation by using AutoDock Tools 1.5.7 to set each ligand's rotatable/non-rotatable bonds and torsion tree. Then AutoDock Vina^[163]84 was used to compute the binding affinities between the drugs and the target proteins. The docked complexes were examined using PLIP^[164]86, PyMol^[165]87, and Discovery Studio Visualizer (BIOVIA 2021) software^[166]88 to determine the types and distances of non-covalent bonds and the surface complexes.
Let $B_{ij}$ denote the binding affinity score (BAS) between the ith receptor (i = 1, 2, …, p) and the jth ligand/agent (j = 1, 2, …, q). Receptors were then arranged in decreasing order of the row averages $\frac{1}{q}\sum_{j=1}^{q} B_{ij}$, i = 1, 2, …, p, and ligands/agents in decreasing order of the column averages $\frac{1}{p}\sum_{i=1}^{p} B_{ij}$, j = 1, 2, …, q, to select the few top-ranked agents/ligands as the potential candidate drug molecules.

In-silico validation of candidate drug molecules by ADME/T analysis

The drug-like characteristics and ADMET (absorption, distribution, metabolism, excretion, and toxicity) properties of the top-ranked 3 drug compounds were determined in order to learn more about their structural characteristics and chemical descriptors. We used the SCFBio ([167]http://www.scfbio-iitd.res.in/software/drugdesign/lipinski.jsp) web tool to evaluate whether they satisfy Lipinski's rule for drug-likeness properties (including molecular weight, number of hydrogen-bond donors and acceptors, rotatable bonds, octanol/water partition coefficient or LogP value, etc.)^[168]89. The ADMET properties were then predicted by using the online tools SwissADME^[169]90 and pkCSM^[170]91. The ADME/T calculations for the compounds were performed using their optimized structures in SMILES format.

Results

Identification of Differentially Expressed Genes (DEGs)

First, we identified DEGs for both T2D and ccRCC patients by using the limma R-package. The cut-off of adjusted P.value < 0.05 and |Log2FC| > 1 was used to select the DEGs, as mentioned in section "[171]Identification of Differentially Expressed Genes (DEGs)". For ccRCC, we detected 15,348, 14,472, 1820 and 1576 downregulated DEGs, and 8150, 8166, 563 and 759 upregulated DEGs, for the NCBI datasets with accession IDs [172]GSE66270, [173]GSE66272, [174]GSE76351, and [175]GSE66271, respectively.
Then, 738 upregulated and 47 downregulated DEGs (Table [176]S2) were detected as common DEGs (cDEGs) for ccRCC. From the NCBI datasets with accession IDs [177]GSE25724, [178]GSE29221, [179]GSE29226, and [180]GSE29231, we identified 2651, 459, 839, and 2854 upregulated DEGs, and 3032, 1875, 2173 and 1569 downregulated DEGs, respectively, for T2D patients. We found 252 downregulated and 498 upregulated cDEGs for T2D (Table [181]S3).

Identification of shared DEGs (sDEGs) between T2D and ccRCC

In the previous section, we found 738 upregulated and 47 downregulated DEGs for ccRCC based on four transcriptomics datasets. Similarly, we found 252 downregulated and 498 upregulated DEGs for T2D based on another four transcriptomics datasets. We then detected 194 upregulated and 65 downregulated shared DEGs (sDEGs) for both T2D and ccRCC (Tables [182]S4 & [183]S5). Thus, we considered 259 sDEGs in total for both T2D and ccRCC.

Local association between T2D and ccRCC through sDEGs

To find the link between T2D and ccRCC, we computed the local correlation coefficient between T2D and ccRCC based on the aLog2FC values of sDEGs by using Eq. [184]2. The correlation coefficient was 0.82, which indicates that T2D and ccRCC are locally associated with each other through the expression of sDEGs.

Identification of shared Key Genes (sKGs)

The STRING database was used to build the PPI network of sDEGs, which has 259 nodes and 773 edges (Fig. [185]2). By using seven topological measures (Betweenness, BottleNeck, Closeness, Degree, MNC, Radiality and Stress) in the PPI network, we chose the top 10 sKGs (VIM, CDC42, SCARB1, CXCL8, FN1, IL1B, JUN, TLR2, TLR4 and GOT2) (Table [186]S6).

Figure 2. Protein–protein interaction (PPI) network of sDEGs to identify sKGs, where the chartreuse-colored nodes indicate the sKGs.
In-silico validation of sKGs using independent datasets and databases

We investigated the differential expression patterns of sKGs between ccRCC and control samples through box-plot analysis based on independent gene expression profiles from the TCGA and GTEx databases, which contained 523 ccRCC and 100 control samples. From Figure S1A, we observed that 3 sKGs (CDC42, GOT2, CXCL8) are downregulated and the remaining 7 sKGs (TLR4, IL1B, TLR2, FN1, JUN, VIM, SCARB1) are upregulated, which supports the proposed results. We also investigated the differential expression patterns of sKGs between T2D and control samples through box-plot analysis based on independent gene expression profiles from the NCBI database with accession IDs [188]GSE15932 and [189]GSE20966, where the dataset with accession ID [190]GSE15932 contains 8 pancreatic cancer, 8 T2D and 8 control samples. In our analysis, we considered only the T2D and control samples. Figure S1B shows that 3 sKGs (CDC42, GOT2, and CXCL8) are downregulated in T2D, while the remaining 7 sKGs (TLR4, IL1B, TLR2, FN1, JUN, VIM, and SCARB1) are upregulated, which also supports the proposed results.

The regulatory network analysis of sKGs

The top-ranked five TF proteins (FOXL1, FOXC1, NR2F1, YY1 and GATA2) and six miRNAs (hsa-mir-93-5p, hsa-mir-203a-3p, hsa-mir-204-5p, hsa-mir-335-5p, hsa-mir-26b-5p, and hsa-mir-1-3p) were identified as the key transcriptional and post-transcriptional regulators of sKGs, respectively, by using the TFs-sKGs-miRNAs interaction network analysis (see Fig. [191]3).

Figure 3. (A) The sKGs-TFs interaction network based on the JASPAR database. (B) The miRNA-sKGs interaction network based on the TarBase database.

sDEGs-set enrichment analysis with GO-terms and KEGG pathways

We carried out GO and KEGG pathway enrichment analysis for the 10 sKGs to look into the shared pathogenetic mechanisms between T2D and ccRCC. The top five MFs, BPs, CCs, and KEGG pathways are listed in Table [194]3.
The significantly enriched KEGG pathways and GO terms of sDEGs that involve sKGs are linked to the pathogenetic mechanisms connecting T2D with ccRCC.

Table 3. Significantly enriched GO-terms and KEGG pathways that are associated with T2D and ccRCC.

| Category | ID | Term | sDEGs (count) | P.value | Associated sKGs |
|---|---|---|---|---|---|
| Biological process (BP) | GO:0006915 | apoptotic process | 22 | 3.07E-04 | IL1B, TLR2 |
| Biological process (BP) | GO:0007165 | signal transduction | 28 | 1.95E-06 | CXCL8, IL1B, TLR2 |
| Biological process (BP) | GO:0006954 | inflammatory response | 45 | 1.25E-21 | CXCL8, TLR2, TLR4 |
| Biological process (BP) | GO:0010628 | positive regulation of gene expression | 19 | 7.46E-04 | FOXL1, FOXC1, CXCL8, FN1, IL1B, TLR2, TLR4 |
| Biological process (BP) | GO:0034976 | response to endoplasmic reticulum stress | 8 | 6.08E-04 | JUN, CXCL8 |
| Molecular function (MF) | GO:0005178 | integrin binding | 15 | 2.26E-06 | FN1, IL1B |
| Molecular function (MF) | GO:0042802 | identical protein binding | 44 | 0.001422588 | JUN, FN1, VIM, CDC42, TLR2, TLR4 |
| Molecular function (MF) | GO:0008201 | heparin binding | 10 | 0.001969 | CXCL8, FN1 |
| Molecular function (MF) | GO:0001875 | lipopolysaccharide receptor activity | 3 | 0.003454 | SCARB1, TLR4, TLR2 |
| Molecular function (MF) | GO:0005515 | protein binding | 248 | | JUN, FN1, TLR2, TLR4, IL1B, SCARB1, CXCL8, CDC42, VIM |
| Cellular component (CC) | GO:0070062 | extracellular exosome | 64 | 3.45E-07 | SCARB1, VIM, FN1, GOT2 |
| Cellular component (CC) | GO:0005829 | cytosol | 114 | 5.04E-05 | CDC42, IL1B, VIM, NR2F1 |
| Cellular component (CC) | GO:0009986 | cell surface | 19 | 0.007489 | SCARB1, TLR2, TLR4 |
| Cellular component (CC) | GO:0016020 | membrane | 113 | 1.31E-04 | TLR2, TLR4, FN1 |
| Cellular component (CC) | GO:0005604 | basement membrane | 9 | 5.59E-05 | FN1 |
| KEGG pathway | hsa05169 | Epstein-Barr virus infection | 14 | 1.65E-04 | TLR2, VIM, JUN |
| KEGG pathway | hsa04621 | NOD-like receptor signaling pathway | 13 | 2.87E-04 | JUN, CXCL8, IL1B, TLR4 |
| KEGG pathway | hsa05323 | Rheumatoid arthritis | 9 | 4.61E-04 | JUN, CXCL8, IL1B, TLR2, TLR4 |
| KEGG pathway | hsa05417 | Lipid and atherosclerosis | 13 | 0.00105 | JUN, CXCL8, CDC42, IL1B, TLR2 |
| KEGG pathway | hsa05165 | Human papillomavirus infection | 18 | 2.60E-04 | FN1, CDC42 |

Disease enrichment analysis with sKGs

We performed disease enrichment analysis with sKGs by using the Enrichr web-tool with the DisGeNET database to investigate the association of
sKGs with different diseases. This analysis detected 10 top-ranked diseases, including Diabetic Nephropathy, Kidney Failure and Kidney Disease (see Fig. [196]4), that are significantly associated with sKGs.

Figure 4. Results of disease enrichment analysis with sKGs, where the red box indicates significant association (p-value < 0.05).

DNA methylation analysis of sKGs in ccRCC

DNA methylation is an epigenetic mechanism that regulates gene expression by recruiting proteins involved in gene repression or by inhibiting the binding of transcription factor(s) to DNA^[198]75. Silencing of tumor suppressor genes is facilitated by DNA hypermethylation, which primarily happens at the CpG islands within a gene's promoter region. On the other hand, oncogenes are upregulated when DNA hypomethylation occurs^[199]92. Therefore, we examined the DNA methylation status at CpG sites for the sKGs (CDC42, SCARB1, GOT2, CXCL8, FN1, IL1B, JUN, TLR2, TLR4, and VIM) by MethSurv. We observed that all ten sKGs had significant CpG sites (p-value ≤ 0.001) (Table [200]S7). Additionally, UALCAN was used to visualize the promoter methylation status of the 10 sKGs in ccRCC. The box-and-whisker plots showed that seven sKGs (SCARB1, FN1, IL1B, JUN, TLR2, TLR4, and VIM) were hypomethylated according to their β-values (ranging from 0, completely unmethylated, to 1, fully methylated), which supports the finding that these seven sKGs were upregulated in ccRCC.

Drug repurposing by molecular docking

To explore candidate ligands (drug molecules) for the treatment of T2D and ccRCC, we considered our proposed 10 sKGs and their 5 regulatory TF proteins as the receptors. We collected the 3D structures of these receptors from two distinct sources.
The Protein Data Bank (PDB) was searched for the structures of 10 receptors (CDC42, SCARB1, GOT2, CXCL8, FN1, IL1B, JUN, TLR2, TLR4, and VIM) using the following PDB codes: 1a4r, 5ktf, 5ax8, 3il8, 1e88, 2KH2, 1jun, 2z80, 2z62 and 1gk7. The "AlphaFold Protein Structure Database" was used to collect the remaining five targets (FOXL1, FOXC1, NR2F1, YY1, and GATA2). We computed the binding affinity scores (BAS) between the proposed receptors and the candidate drug molecules (Table [201]S1) by molecular docking analysis. To select the top-ranked therapeutic candidates, drug molecules were ordered based on their average BAS across the receptors (Table [202]S8). Similarly, receptors were ordered based on their average BAS across the drug molecules. Figure [203]5 displays the 30 top-ordered drug molecules against the ordered receptors. We observed that 3 molecules (Digoxin, Imatinib and Dovitinib) produce an average BAS < -7.7 kcal/mol, whereas the other molecules have an average BAS > -7.7 kcal/mol. Therefore, we considered Digoxin, Imatinib and Dovitinib as the top-ranked drug molecules to inhibit the proposed sKGs. The top two molecules, Digoxin and Imatinib, bind strongly (BAS < -7.0 kcal/mol) to all of the receptor proteins. The third top-ranked molecule, Dovitinib, also binds strongly to all of the receptor proteins except JUN. Therefore, the proposed three drug molecules might be potential inhibitors of the T2D- and ccRCC-causing genes.

Figure 5. Molecular docking scores, where red color indicates strong drug-target binding. Image of the score matrix, where the X-axis shows the 30 top-ordered drug agents (out of 148) and the Y-axis shows the ordered proposed receptors.

In-silico drug validation

The ADME/T properties of a drug molecule were used to evaluate its absorption, distribution, metabolism, excretion, and toxicity.
The drug-likeness properties of a drug molecule describe its chemical properties through physicochemical descriptors (Table [206]4). According to Lipinski's rule, two drugs (Imatinib and Dovitinib) satisfy all of the rule-of-five (ROF) criteria, while the remaining one (Digoxin) violates three of them (MW, HBA, and HBD). The lipophilicity (LogP value) of these 3 drugs lies within the standard range (1 to ≤ 5)^[207]89 of Lipinski's rule, so they are regarded as lipophilic compounds. Thus, our suggested top-ranked 3 drug molecules fulfil almost all of the drug-likeness criteria (Table [208]4A). The ADME and toxicity profiles of the proposed compounds were examined through various parameters to evaluate their effectiveness and safety. The compounds are predicted to have sufficient absorption in the gastrointestinal tract, making them promising oral drug candidates. A compound with a Human Intestinal Absorption (HIA) score ≥ 30% is considered to be highly absorbed in the human intestine^[209]93,[210]94. All 3 proposed compounds have a high HIA score (≥ 50%), which indicates good absorption by the human body; in addition, the top-ranked 2 drugs (Digoxin and Imatinib), but not Dovitinib, are predicted to be P-glycoprotein inhibitors (P-gpI). The ability of a compound to cross the blood–brain barrier (BBB) is described by the BBB-permeability index. Compounds with LogBB ≥ 0.3 can readily cross the BBB, while those with LogBB < -1 are considered to be poorly distributed across the BBB. None of the three proposed drugs reaches the LogBB ≥ 0.3 threshold: Digoxin (-1.39) and Imatinib (-1.37) fall below -1, while Dovitinib (-0.71) lies between the two thresholds (Table [211]4B). Likewise, according to their CNS permeability (LogPS) values, the compounds are considered to only partly penetrate the central nervous system.
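The Lipinski screen described at the start of this subsection can be reproduced with a short Python sketch. The descriptor values are copied from Table 4A; the function name `lipinski_violations` is hypothetical, and the thresholds are the commonly quoted rule-of-five limits (molecular weight ≤ 500, LogP ≤ 5, at most 10 hydrogen-bond acceptors, at most 5 hydrogen-bond donors), which are assumed here to match what the SCFBio tool applies.

```python
# Descriptor values as reported in Table 4A of this study.
COMPOUNDS = {
    "Digoxin": {"mw": 780.0, "logp": 4.84, "hba": 14, "hbd": 6},
    "Imatinib": {"mw": 493.0, "logp": 4.04, "hba": 6, "hbd": 2},
    "Dovitinib": {"mw": 393.43, "logp": 2.26, "hba": 4, "hbd": 3},
}


def lipinski_violations(desc):
    """Return the list of rule-of-five criteria that a compound violates."""
    violations = []
    if desc["mw"] > 500:
        violations.append("MW")
    if desc["logp"] > 5:
        violations.append("LogP")
    if desc["hba"] > 10:
        violations.append("HBA")
    if desc["hbd"] > 5:
        violations.append("HBD")
    return violations


for name, desc in COMPOUNDS.items():
    print(name, lipinski_violations(desc))
# Digoxin ['MW', 'HBA', 'HBD']
# Imatinib []
# Dovitinib []
```

The output agrees with the text: Imatinib and Dovitinib pass all four criteria, while Digoxin exceeds the limits for molecular weight, H-bond acceptors and H-bond donors.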
Membrane-bound hemoproteins called human cytochrome P450 (CYP) enzymes play an essential role in homeostasis, drug detoxification, and cellular metabolism. About 50% of the elimination of common clinical medications in humans and almost 80% of oxidative metabolism are attributed to more than one CYP from CYP families 1–3^[212]95. All of the 3 drugs except Digoxin are predicted to inhibit the CYP3A4 enzyme. The highest oral toxicity value, i.e. lethal dose (LD50), was 3.7 mol/kg for Digoxin, whereas lower values of 2.9 and 2.4 mol/kg were found for Imatinib and Dovitinib, respectively. The greater the LC50 value, the lower the toxicity of a drug molecule, while a smaller value indicates higher toxicity. From Table 4B, Digoxin has the highest LC50 value (4.35), followed by Dovitinib (3.04) and Imatinib (2.08). Toxicity analyses (AMES, minnow toxicity (LC50), and lethal dose (LD50)) of our suggested compounds indicated that they were largely inactive for the toxicity prediction parameters used and were consequently predicted to be non-toxic, although Table 4B lists a positive AMES prediction for Dovitinib.

Table 4. Drug likeness and ADME/T analysis results. (A) Drug likeness profile of candidate drug molecules. (B) ADME and Toxicity (ADME/T) profile of top-ranked 3 drug molecules.

A. Drug likeness profile of candidate drug molecules

| Compound | Molecular weight | LogP | H-bond acceptors (HBA) | H-bond donors (HBD) | Polar surface area (Å^2) | No. of rotatable bonds |
|---|---|---|---|---|---|---|
| Digoxin | 780 | 4.84 | 14 | 6 | 203.06 | 7 |
| Imatinib | 493 | 4.04 | 6 | 2 | 86.28 | 8 |
| Dovitinib | 393.43 | 2.26 | 4 | 3 | 94.04 | 2 |

B. ADME and Toxicity (ADME/T) profile of top-ranked 3 drug molecules

Column groups: Absorption (Caco2 permeability, HIA, P-gpI); Distribution (BBB, CNS permeability); Metabolism (CYP3A4 inhibitor); Excretion (TC); Toxicity (AMES, LC50, LD50).

| Compound | Caco2 permeability | HIA (%) | P-gpI | BBB (LogBB) | CNS permeability (LogPS) | CYP3A4 inhibitor | TC | AMES | LC50 (log mM) | LD50 (mol/kg) |
|---|---|---|---|---|---|---|---|---|---|---|
| Digoxin | 0.59 | 68.58 | Yes | -1.39 | -3.81 | No | 3.67 | No | 4.35 | 3.7 |
| Imatinib | 1.09 | 93.85 | Yes | -1.37 | -2.51 | Yes | 0.72 | No | 2.08 | 2.9 |
| Dovitinib | 0.47 | 83.63 | No | -0.71 | -2.27 | Yes | 0.76 | Yes | 3.04 | 2.4 |

Discussion

Type 2 diabetes (T2D) is considered one of the risk factors for clear-cell renal cell carcinoma (ccRCC)^[214]96,[215]97. Therefore, identification of the shared key genes (sKGs) causing both diseases is essential in order to investigate their common pathogenetic mechanisms and candidate drugs for better diagnosis and therapies during their co-occurrence. However, no study in the literature had explored sKGs, highlighting their pathogenetic mechanisms, and candidate drug molecules as a common treatment for both T2D and ccRCC, even though multiple disease-specific drugs may cause adverse side effects or toxicity in patients due to drug-drug interactions^[216]23–[217]26. To explore common drugs as representatives of the multiple disease-specific drugs, this study investigated the genetic relationship between T2D and ccRCC by detecting shared DEGs (sDEGs) that can separate both T2D and ccRCC patients from the control samples. We identified 259 sDEGs, of which 194 were upregulated and 65 downregulated. We computed the local correlation coefficient between T2D and ccRCC based on the aLog2FC values of sDEGs by using Eq. [218]2 and found $r_{XY}$ = 0.82, which indicates that both diseases are locally associated with each other through the expression of sDEGs. We then detected the top-ranked 10 sDEGs (CDC42, SCARB1, GOT2, CXCL8, FN1, IL1B, JUN, TLR2, TLR4, and VIM) as sKGs (Fig. [219]2) for exploring their pathogenetic mechanisms and candidate common drugs for both diseases.
A summary of this study is given in Figure S2. The association of these 10 sKGs with both T2D and ccRCC is also supported by previous individual studies, including CDC42^[220]33,[221]98,[222]99, SCARB1^[223]100–[224]102, VIM^[225]103–[226]105, IL1B^[227]106–[228]110, GOT2^[229]111,[230]112, JUN^[231]113,[232]114, TLR4^[233]115–[234]118, FN1^[235]119–[236]121, TLR2^[237]122–[238]124 and CXCL8^[239]125–[240]129, as displayed in Fig. [241]6A. One study claimed that CDC42 stimulates insulin secretion, which is connected to diabetes-related diseases such as diabetic nephropathy (DN), ccRCC and various cancers^[242]98. Dysregulation of CDC42 can prevent healthy insulin secretion and promote diabetes. Insulin resistance is widely regarded as the main factor for T2D^[243]99. As a result, CDC42 plays a significant role in the development of T2D, and treating T2D and associated disorders may benefit from CDC42-targeted therapy^[244]33. The polyligand membrane receptor protein SCARB1 is involved in the glucose and lipid metabolism disturbances associated with T2D, and genetic variations of SCARB1 are linked to a higher risk of T2D^[245]100,[246]101. Another study claimed that SCARB1 serves as both a therapeutic target and a diagnostic biomarker for ccRCC^[247]102. Vimentin (VIM) has been identified as a key mediator of obesity-linked T2D^[248]103,[249]104. On the other hand, VIM is considerably overexpressed in ccRCC cells^[250]105. GOT2 has been reported as a potentially useful prognostic indicator and therapeutic target for people with ccRCC^[251]111, and its expression in T2D and ccRCC is notably low^[252]112. The glycoprotein encoded by the fibronectin 1 (FN1) gene is involved in host defense, wound healing, blood coagulation, embryogenesis, metastasis and other activities associated with cell adhesion. The up-regulation of FN1 is directly associated with the development of renal cell carcinoma (RCC)^[253]121 as well as DN^[254]119.
CXCL8 is up-regulated, with elevated serum levels, in kidney cancer (KC) patients^[255]128 and shows early changes in regulation in DN patients^[256]129. Up-regulation of the IL1B gene influences pro-tumor genes as well as RCC tumors^[257]108. Another two studies found up-regulation of the IL1B gene in T2D patients^[258]109,[259]110. The expression of the TLR2 gene is considerably higher in ccRCC tumor tissues^[260]124. According to a number of studies in different experimental models of kidney disease, the activation and overexpression of the TLR2 and TLR4 genes are closely correlated with the severity of renal damage^[261]130. Creely et al. found higher TLR2 expression in the adipose tissues of T2D patients^[262]122. In KC, renal inflammation and chronic fibrosis are significantly influenced by TLR4^[263]118. A long-term inflammatory condition is a cause of insulin resistance and the development of T2D^[264]115.

Figure 6. Verification of the proposed drug-targets (shared KGs) and drug-agents for T2D and ccRCC by the literature review. (A) Verification of the proposed drug-targets, (B) Verification of the proposed drug-agents.

To explore transcriptional and post-transcriptional regulators of sKGs from TFs and miRNAs, respectively, we performed sKGs-TFs and sKGs-miRNAs co-regulatory network analysis (Fig. [267]3), which detected five TF proteins (YY1, FOXL1, FOXC1, NR2F1, and GATA2) and six miRNAs (hsa-mir-93-5p, hsa-mir-203a-3p, hsa-mir-204-5p, hsa-mir-335-5p, hsa-mir-26b-5p, and hsa-mir-1-3p) as the key regulators of sKGs. The transcriptional coregulator Yin Yang 1 (YY1) stimulates the transcription of several long noncoding RNAs and is highly expressed in ccRCC^[268]131. YY1 has also been studied in T2D^[269]132, including its role in different symptoms and its interaction with the signaling pathways that control the disease.
Numerous disorders, including ccRCC^[270]133, have been found to be regulated by the transcription factor YY1. FOXL1 and FOXC1 are members of the same family and perform similar functions. Thus, these genes might play significant roles in ccRCC pathogenesis^[271]134. Another article shows that FOXL1 and FOXC1 are highly associated with T2D^[272]135. The progression of ccRCC toward a more aggressive molecular subtype is influenced by GATA2 proteins^[273]136. Another study reported that GATA2 is an important risk factor for T2D^[274]137, dyslipidemia and hypertension (HTN). Hsa-mir-335-5p has been shown in a number of investigations to be involved in controlling RCC progression^[275]138. Another article shows that hsa-miR-335-5p appears to be associated with T2D, possibly by influencing the expression of several candidate genes^[276]139. The Kaplan–Meier Plotter datasets show that miR-93-5p is highly expressed in ccRCC^[277]140. On the other hand, hsa-mir-93-5p plays a significant role in post-transcriptional gene regulation, especially in T2D^[278]135. Most of the miRNAs examined may affect numerous kinds of cancer, including hepatocellular carcinoma (miR-9-5p), renal cell carcinoma (miR-1-3p)^[279]141, and thyroid cancer (miR-1-3p). We investigated the shared pathogenetic mechanisms of ccRCC and T2D through sKGs-set enrichment analysis, the association of sKGs with DNA methylation, and regulatory analysis of sKGs using a variety of databases. Using enrichment analysis of the sKGs-set, we identified critical biological processes (BP), molecular functions (MF), cellular components (CC), and KEGG pathways that are connected to the onset of T2D and ccRCC (Table [280]3).
The significantly enriched BP, MF, CC, and KEGG pathway terms shared by ccRCC and T2D included apoptotic process, inflammatory response, signal transduction, positive regulation of gene expression, identical protein binding, extracellular exosome, and lipid and atherosclerosis; their dysfunction may contribute to the progression from T2D to ccRCC. Among them, apoptosis, also known as programmed cell death, is the principal biological mechanism by which mammals destroy DNA-damaged cells and preserve tissue homeostasis. The failure of apoptosis extends the lifespan of tumor cells and allows mutations to accumulate, which can increase spreading during tumor development, enhance tumor angiogenesis, and encourage cell proliferation (Fig. [281]1). Apoptosis is directly related to the control of T cells in ccRCC, and this must be considered in ccRCC immunotherapy^[282]142. Changes in signal transduction are always present in cancers such as RCC. The dysregulated signal transduction that results from changes in proto-oncogenes and tumor suppressor genes ultimately promotes the abnormal development and proliferation of cancer cells^[283]143. In patients with T2D, hyperglycemia (also known as "glucose toxicity") may play a significant role in the development of insulin resistance and impaired signal transduction in skeletal muscle^[284]144. Exosomes are crucial in the onset, detection, and management of kidney^[285]145, prostate, bladder and breast malignancies, as well as T2D^[286]146. Regarding the KEGG term for NOD-like receptors (NLRs), NLRs are widely known as pathogen-recognition receptors. NLRs play an important role in inflammation-induced insulin resistance (T2D), which leads to additional metabolic problems^[287]147. NLRs are divided into four subfamilies according to the type of N-terminal domain: NLRA, NLRB, NLRP, and NLRC (C for CARD), the last of which includes NOD1, NOD2, NLRC3, NLRC4, and NLRC5. Among them, inhibition of NOD1 and NOD2 has therapeutic potential in acute kidney injury (AKI)^[288]148.
We also investigated the DNA methylation status of the sKGs shared by T2D and ccRCC. DNA methylation is an epigenetic process that adds a methyl group to cytosine bases, especially at CpG sites. Hypermethylation of CpG islands within the promoter regions of tumor suppressor genes is widely recognized as a key mechanism leading to gene inactivation in various cancers^[289]149. In our investigation, we found that, among the ten sKGs (CDC42, SCARB1, GOT2, CXCL8, FN1, IL1B, JUN, TLR2, TLR4, and VIM), seven (SCARB1, FN1, IL1B, JUN, TLR2, TLR4, and VIM) were significantly (p-value < 0.001) hypomethylated at various CpG sites (Table [290]S7). Therefore, it may be inferred that these hypomethylated sKGs have a substantial association with the growth and progression of ccRCC and with the survival of cells against the apoptotic process^[291]150. To explore sKGs-guided common drug molecules for both T2D and ccRCC, we used molecular docking analysis and identified the top-ranked three molecules (Digoxin, Imatinib and Dovitinib), which showed strong binding affinities with the sKGs-mediated target proteins. These molecules were then verified for T2D and ccRCC by the literature review, as displayed in Fig. [292]6B. To validate the drug molecules computationally, we conducted an ADME/T analysis and evaluated their drug-likeness. Two of the three drugs, imatinib and dovitinib, showed drug-like properties, satisfying at least four of the criteria in Lipinski's rule of five. The chosen substances showed favorable ADME/T characteristics, with sufficient water solubility, high Human Intestinal Absorption (HIA) levels between 68.58% and 93.85%, and no carcinogenic effects. Among the top three candidate drug molecules, Digoxin^[293]151,[294]152 and Imatinib^[295]153,[296]154 received support as common candidate molecules for both T2D and ccRCC from individual studies on T2D and ccRCC.
It should be noted that both drug molecules are already approved by the FDA: Digoxin for the treatment of heart failure and atrial fibrillation, and Imatinib for dermatofibrosarcoma protuberans, leukemias, systemic mastocytosis, myelodysplastic/myeloproliferative diseases, gastrointestinal stromal tumors and hypereosinophilic syndrome. They can be found in DrugBank (DB) under accession IDs DB00390 and DB00619, respectively. According to one study, Digoxin therapy significantly reduced cancer cell migration and proliferation in RCC cells and was suggested as a distinctive therapeutic option for treating ccRCC patients^[297]151. Digoxin is a cardiac glycoside used to treat heart failure and cardiac arrhythmias, both of which are common complications of T2D. Indeed, it is estimated that up to 18% of diabetes patients receive digoxin^[298]152. In an animal model, imatinib effectively reduced blood sugar levels and treated T2D^[299]153. Another experimental study reported that imatinib is a potent inhibitor against ccRCC^[300]154. The third proposed drug, dovitinib, acts as an antagonist of some RCC-causing genes (VEGFR1, VEGFR2, VEGFR3, FGFR1, FGFR2, and FGFR3) according to an experimental study^[301]155. Thus, we found that only Imatinib has been experimentally validated in the wet lab for T2D and ccRCC individually, but not simultaneously. On the other hand, dovitinib has been experimentally validated for RCC only. However, the top-ranked drug molecule, Digoxin, has not yet been experimentally validated for either T2D or ccRCC, although it is approved for other diseases. Therefore, experimental validation is required for dovitinib with T2D and for Digoxin with both T2D and ccRCC. In addition, the proposed sKGs might be useful prognostic biomarkers in the development of immunotherapy for ccRCC with T2D, as discussed in different articles for other diseases^[302]2,[303]156,[304]157.
Conclusion

This study detected ten shared key genes (CDC42, SCARB1, GOT2, CXCL8, FN1, IL1B, JUN, TLR2, TLR4, and VIM) that are able to differentiate both T2D and ccRCC patients from the control groups. The differential expression patterns of the sKGs were also confirmed with independent datasets from the NCBI, TCGA and GTEx databases. The shared key gene (sKGs) set enrichment analysis identified significant shared biological processes, molecular functions, and pathways that are connected to the development of both T2D and ccRCC. The sKGs regulatory network analysis detected some TF proteins and miRNAs as the transcriptional and post-transcriptional regulators of sKGs. The DNA methylation analysis detected some crucial hypomethylated CpG sites that might stimulate ccRCC development. Finally, the sKGs-guided top-ranked three candidate drug agents (Digoxin, Imatinib, and Dovitinib) were identified through molecular docking, drug-likeness and ADME/T analysis. The pipeline of this study might serve as a guideline for exploring common pathogenetic processes and candidate drug molecules when planning a common treatment for other co-occurring diseases. The output of this study might provide useful input to wet-lab researchers for further investigation in developing sKGs-guided effective common drugs against T2D and ccRCC.

Supplementary Information

[305]Supplementary Information.

Acknowledgements