Abstract

   Type 2 diabetes (T2D) is a crucial risk factor for both pancreatic
   cancer (PC) and kidney cancer (KC). However, effective common drugs for
   treating PC and/or KC patients who are also suffering from T2D are
   currently lacking, despite the probability of their co-occurrence.
   Taking disease-specific multiple drugs during the co-existence of
   multiple diseases may lead to adverse side effects or toxicity to the
   patients due to drug-drug interactions. This study aimed to identify
   T2D-, PC-, and KC-causing common genomic biomarkers (cGBs) highlighting
   their pathogenetic mechanisms to explore effective drugs as their
   common treatment. We analyzed transcriptomic profile datasets, applying
   weighted gene co-expression network analysis (WGCNA) and
   protein-protein interaction (PPI) network analysis approaches to
   identify T2D-, PC-, and KC-causing cGBs. We then disclosed common
   pathogenetic mechanisms through gene ontology (GO) terms, KEGG
   pathways, regulatory networks, and DNA methylation of these cGBs.
   Initially, we identified 78 common differentially expressed genes
   (cDEGs) that could distinguish T2D, PC, and KC samples from controls
   based on their transcriptomic profiles. From these, six top-ranked
   cDEGs (TOP2A, BIRC5, RRM2, ALB, MUC1, and E2F7) were selected as cGBs
   and considered targets for exploring common drug molecules for each of
   the three diseases. Functional enrichment analyses, including GO terms,
   KEGG pathways, and regulatory network analyses involving transcription
   factors (TFs) and microRNAs, along with DNA methylation and immune
   infiltration studies, revealed critical common molecular mechanisms
   linked to PC, KC, and T2D. Finally, we identified six top-ranked drug
   molecules (NVP.BHG712, Irinotecan, Olaparib, Imatinib, RG-4733, and
   Linsitinib) as potential common treatments for PC, KC and T2D during
   their co-existence, supported by a literature review. Thus, this
   bioinformatics study provides valuable insights and resources for
   developing a genome-guided common treatment strategy for PC and/or KC
   patients who are also suffering from T2D.

   Keywords: Type-2 diabetes, Kidney and pancreatic cancers, Genetic
   association, Transcriptomics profiles and shared key genes, Common
   drugs and toxicity, Statistics and bioinformatics analysis

   Subject terms: Cancer, Computational biology and bioinformatics, Drug
   discovery, Biomarkers, Diseases, Health care

Introduction

   Type-2 diabetes (T2D) is a chronic metabolic disorder that is gradually
   increasing worldwide^[34]1. The International Diabetes Federation (IDF)
   estimated that there would be around 629 million adult diabetes
   patients worldwide by 2045^[35]2. A population-based study reported
   that the prevalence of diabetes in all age groups was 2.8% in 2000 and
   was projected to increase to 4.4% by 2030^[36]3. T2D is characterized by
   β-cell dysfunction, excessive glucose production from the liver and
   insulin resistance which impairs the ability of glucose to bind with
   insulin in blood^[37]4. T2D leads to serious health complications due
   to insulin resistance and hyperinsulinemia. It is also associated with
   obesity, cardiovascular disease, and cancers, including kidney cancer
   (KC)^[38]5,[39]6 and pancreatic cancer (PC)^[40]7,[41]8. Some studies
   reported that T2D is associated with about 80% of PC cases^[42]9 and 40%
   of KC cases^[43]10. PC remains one of the most
   difficult cancers to diagnose and treat. In 2018, it was the 7th
   leading cause of cancer-related deaths worldwide, with approximately
   466,000 deaths and a 5-year survival rate of just 10%^[44]11.
   Clear-cell renal cell carcinoma (ccRCC) is one of the most prevalent
   cancers worldwide. Kidney cancer (KC) includes various types, with
   renal cell carcinoma (RCC) being the most common. ccRCC, a subtype of
   RCC, accounts for approximately 70–80% of all kidney cancers^[45]12. It
   had the 17th highest cancer-related mortality in 2018 with 175,098
   deaths worldwide^[46]13. In 2020, the death rate of KC patients was
   around 42%^[47]14. By 2030, pancreatic cancer (PC) is projected to
   become the second leading cause of cancer-related deaths^[48]15, while
   KC is expected to be the 10th most common cancer^[49]16. In PC,
   hyperinsulinemia associated with T2D elevates IGF-1 levels, which
   upregulates IGF-1R expression on pancreatic cells. This activation
   triggers cellular proliferation, suppresses apoptosis, and promotes
   genomic instability, fostering genetic mutations that may lead to
   cancer development^[50]17–[51]19. Similarly, for KC, hyperinsulinemia
   stimulates IGF-1 production, which promotes renal cell proliferation
   and inhibits apoptosis^[52]19. Additionally, chronic inflammation and
   oxidative stress associated with T2D contribute to DNA damage, further
   increasing the risk of cancer development^[53]18,[54]20. A schematic
   diagram of the link between T2D and PC and/or KC is given in
   Fig. [55]1.

Fig. 1.

   A schematic diagram about the link of PC and/or KC with T2D

   Mainly, PC can disrupt insulin regulation, leading to insulin
   resistance and hyperinsulinemia, which contribute to the development of
   T2D. Once T2D is established, chronic hyperinsulinemia and elevated
   IGF-1 levels promote cellular proliferation and inhibit apoptosis in
   various tissues, including the kidneys. Additionally, T2D is associated
   with chronic inflammation and oxidative stress, which cause DNA damage
   and genomic instability in renal cells. This environment fosters the
   development of KC, creating a cascade where pancreatic cancer leads to
   T2D, which in turn increases the risk of kidney cancer^[58]21. Doctors
   often prescribe disease-specific medications for patients with multiple
   conditions^[59]22, which can lead to polypharmacy and potential
   drug-drug interactions (DDIs) causing adverse effects or
   toxicity^[60]23–[61]25. To mitigate this risk, it is preferable to
   prescribe a smaller number of common drugs that effectively address the
   multiple conditions. However, no study has yet proposed a common drug
   for patients with pancreatic cancer (PC) and/or kidney cancer (KC) who
   also suffer from T2D. This bioinformatics-based study aims to: (i) identify
   common genomic biomarkers (cGBs) associated with T2D, PC, and KC,
   highlighting their shared pathogenetic mechanisms, and (ii) find
   cGBs-guided repurposable drugs for treating PC and/or KC in T2D
   patients.

Methodology

   To explore repurposable drugs for treating PC and KC in patients with
   T2D, it is essential to identify common genomic biomarkers (cGBs) that
   can serve as drug targets. However, selecting top-ranked cGBs and
   potential therapeutic agents from numerous alternatives solely through
   wet-lab experiments is challenging due to the time, effort, and cost
   involved. To address these challenges, bioinformatics analysis plays a
   crucial role in streamlining and enhancing the drug discovery process.
   Transcriptomics profile analysis through bioinformatics tools is a
   popular approach to detect disease-causing or genomic biomarkers (GBs)
   as the targets of drug molecules^[62]26–[63]34. The detailed
   methodology of this study is given in the following subsections
   2.1–2.8.

Data source and descriptions

   We considered gene-expression profiles for exploring T2D-, KC- and
   PC-causing cGBs, as well as meta-drugs to identify common drug
   molecules for these three diseases. It should be noted here that a
   group of drugs is considered as meta-drugs, which are already
   recommended for T2D, KC, or PC by individual studies.

Transcriptomics profiles collection (case/control)

   There are several individual studies in the literature that explored
   multiple diseases causing cGBs from different independent
   transcriptomics datasets^[64]35–[65]38. In this study, we used three
   independent transcriptomic datasets from the GEO database
   ([66]https://www.ncbi.nlm.nih.gov/geo/), [67]GSE36895^[68]39 for KC,
   [69]GSE16515^[70]40 for PC, and [71]GSE76896^[72]41 for T2D. Here,
   [73]GSE36895 includes 29 KC and 23 controls, [74]GSE16515 includes 36
   PC and 16 controls, and [75]GSE76896 includes 55 T2D and 116 controls.
   Datasets were carefully selected based on their larger sample sizes in
   both case and control groups, ensuring statistical robustness and
   reliable results. Priority was given to datasets generated using the
   same platform, specifically the Affymetrix Human Genome U133 Plus 2.0
   Array ([76]GPL570), to maintain uniformity in data quality and
   experimental design. The inclusion of larger sample sizes helped reduce
   variability and enhance the power to detect true differences in gene
   expression between case and control groups, thereby supporting the
   identification of meaningful biomarkers and molecular mechanisms.

Collection of meta-drugs

   A total of 110 KC-associated meta-drug agents (Table [77]S1), 103
   PC-associated meta-drug agents (Table S2), and 224 T2D-associated
   meta-drug agents (Table S3) were collected to identify potential
   common drug agents for T2D, KC, and PC. Specifically, the drug data
   were obtained from peer-reviewed published articles (Tables [78]S1,
   S2 and S3) and reliable online databases, including DrugBank^[79]42,
   the National Cancer Institute^[80]43, and Drugs.com^[81]44. In
   addition, we selected, by literature review, the top-ranked 10
   publicly available key genes (KGs) for each disease, namely
   T2D-causing KGs (Table S4), PC-causing KGs (Table S5), and KC-causing
   KGs (Table S6), to verify the performance of the proposed candidate
   drug agents via molecular docking analysis against these independent
   receptors.

Identification of DEGs by weighted gene co-expression network analysis
(WGCNA)

     First, differentially expressed genes (DEGs) between case
     (T2D/PC/KC) and control groups were identified using
     “CEMiTool”^[82]45 from three datasets: [83]GSE36895 for KC,
     [84]GSE16515 for PC and [85]GSE76896 for T2D. A detailed discussion
     of how CEMiTool provides DEGs between disease and control groups
     can be found in our previous study^[86]46. Then the weighted gene
     co-expression network analysis (WGCNA)^[87]47 technique was used to
     further filter those DEGs by removing the clusters (modules) of
     less correlated DEGs. Module-trait relationships were determined
     by computing the Pearson correlation coefficient between module
     eigengenes (MEs) and traits. In this study, disease status was
     treated as a clinical trait, and each disease group (T2D, PC, and
     KC) was encoded as a binary variable. Specifically, a value of “1”
     was assigned to individuals in the disease group, while a value of
     “0” was assigned to the control (non-disease) group. For example,
     for T2D, “T2D = 1” represented patients diagnosed with the disease,
     and “T2D = 0” represented healthy controls. The same approach was
     applied to PC and KC. Modules with significant correlations
     (|r| ≥ 0.6, p-value < 0.001) were selected for further analysis. We
     obtained the differentially expressed gene set (DEGs-set) for each
     of T2D, PC and KC separately by combining all genes from all
     significant modules. Subsequently, we separated the up- and
     down-regulated DEGs using the criteria aLog2FC > 1 and
     aLog2FC < -1, respectively, where aLog2FC denotes the average of
     the log2 fold-change values and is computed as

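   $$\mathrm{aLog_2FC}_g = \frac{1}{n_1}\sum_{i=1}^{n_1}\log_2\left(x^{D}_{gi}\right) - \frac{1}{n_2}\sum_{j=1}^{n_2}\log_2\left(x^{C}_{gj}\right) \qquad (1)$$

   Here $x^{D}_{gi}$ and $x^{C}_{gj}$ are the responses/expressions
   for the gth gene with the ith disease and jth control samples,
   respectively, and $n_1$ and $n_2$ are the numbers of disease and
   control samples.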
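
   The up/down separation step can be illustrated with a minimal sketch.
   It assumes that aLog2FC is the mean log2 expression over disease
   samples minus the mean log2 expression over control samples, and that
   the inputs are positive, untransformed intensities; the function names
   and the toy gene labels are hypothetical and are not taken from the
   study's own pipeline.

```python
import math


def alog2fc(disease_values, control_values):
    """Mean log2 expression in disease samples minus mean log2 expression in controls.

    Assumes positive, untransformed intensities; for data that are already
    log2-transformed, the math.log2 calls would be dropped.
    """
    mean_disease = sum(math.log2(v) for v in disease_values) / len(disease_values)
    mean_control = sum(math.log2(v) for v in control_values) / len(control_values)
    return mean_disease - mean_control


def split_by_direction(expr_disease, expr_control, cutoff=1.0):
    """Return (up, down) gene lists using aLog2FC > cutoff and aLog2FC < -cutoff."""
    up, down = [], []
    for gene in expr_disease:
        score = alog2fc(expr_disease[gene], expr_control[gene])
        if score > cutoff:
            up.append(gene)
        elif score < -cutoff:
            down.append(gene)
    return up, down


# Toy expression values for three hypothetical genes (not study data).
toy_disease = {"GENE_A": [8.0, 16.0], "GENE_B": [1.0, 1.0], "GENE_C": [4.0, 4.0]}
toy_control = {"GENE_A": [2.0, 2.0], "GENE_B": [4.0, 4.0], "GENE_C": [4.0, 8.0]}
up_genes, down_genes = split_by_direction(toy_disease, toy_control)
print(up_genes, down_genes)
```

   Genes whose score falls between the two cutoffs are assigned to
   neither list, which mirrors the strict inequalities stated above.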

Identification of common differentially expressed genes (cDEGs)

   We identified shared up- and down-regulated DEGs separately that were
   common across three datasets ([88]GSE36895 for KC, [89]GSE16515 for PC,
   and [90]GSE76896 for T2D). Then we combined shared up- and
   down-regulated DEG-sets to create a unified set of common
   differentially expressed genes (cDEGs) among T2D, KC, and PC.
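
   The construction of the cDEG set amounts to two set intersections
   followed by a union. The sketch below shows this on toy gene lists;
   the function name and the placeholder gene labels are hypothetical,
   and the real inputs would be the up- and down-regulated DEG-sets
   obtained from the three datasets.

```python
def common_degs(up_sets, down_sets):
    """Intersect up- and down-regulated DEG sets across diseases, then merge them."""
    shared_up = set.intersection(*up_sets)
    shared_down = set.intersection(*down_sets)
    return shared_up, shared_down, shared_up | shared_down


# Placeholder gene labels for the three diseases (not study data).
up_t2d, up_pc, up_kc = {"G1", "G2", "G3"}, {"G1", "G2", "G7"}, {"G1", "G2", "G9"}
down_t2d, down_pc, down_kc = {"G4", "G5"}, {"G4", "G8"}, {"G4", "G6"}

shared_up, shared_down, cdegs = common_degs(
    [up_t2d, up_pc, up_kc], [down_t2d, down_pc, down_kc]
)
print(sorted(shared_up), sorted(shared_down), sorted(cdegs))
```

   Intersecting the two directions separately ensures that a gene counts
   as common only when it changes in the same direction in all three
   diseases.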

Local genetic association among T2D, PC and KC through cDEGs

   Although the average log2 fold change (aLog2FC) values were calculated
   for each of T2D, PC, and KC using independent datasets as per
   Eq. [91]1, these values were derived from the same set of cDEGs across
   T2D, PC, and KC. A cDEG is considered upregulated for two or more
   diseases if aLog2FC > 0 and downregulated if aLog2FC < 0. Assuming that
   a gene functions similarly across individuals, the genetic association
   between any two diseases, A and B, can be assessed using their aLog2FC
   values corresponding to the cDEGs through Pearson’s correlation
   coefficient, defined as:
   $$r_{AB} = \frac{\sum_{i}\left(a_i - \bar{a}\right)\left(b_i - \bar{b}\right)}{\sqrt{\sum_{i}\left(a_i - \bar{a}\right)^2}\,\sqrt{\sum_{i}\left(b_i - \bar{b}\right)^2}} \qquad (2)$$

   where $a_i$ and $b_i$ are the aLog[2]FC values of the i^th gene for
   the two diseases A and B, respectively, and $\bar{a}$ and $\bar{b}$
   are the means of $a_i$ and $b_i$, respectively.
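
   As a worked illustration of this association measure, the following
   sketch computes Pearson's correlation coefficient between two vectors
   of aLog2FC values and assembles the pairwise matrix for several
   diseases. The function names and the toy values are assumptions made
   for illustration only.

```python
import math


def pearson(a, b):
    """Pearson correlation between two equally long lists of aLog2FC values."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def pairwise_correlations(profiles):
    """Return {(disease_A, disease_B): r} for every ordered pair of diseases."""
    return {
        (da, db): pearson(profiles[da], profiles[db])
        for da in profiles
        for db in profiles
    }


# Toy aLog2FC values over the same four cDEGs (not study data).
toy_profiles = {
    "T2D": [1.2, -1.5, 2.0, -2.2],
    "PC": [2.4, -3.0, 4.0, -4.4],  # exactly twice the T2D values
    "KC": [1.0, -1.0, 2.5, -1.8],
}
corr = pairwise_correlations(toy_profiles)
print(round(corr[("T2D", "PC")], 6), round(corr[("T2D", "KC")], 6))
```

   Because the coefficient is scale-invariant, two diseases whose cDEGs
   change in the same direction with proportional magnitudes obtain a
   value close to 1 even when the absolute fold changes differ.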

Identification of three disease-causing common genomic biomarkers (cGBs)

   To explore common genomic biomarkers (cGBs), the online database and
   analysis tool STRING v11.5 ([92]https://string-db.org/) was used to
   create the protein-protein interaction (PPI) network of cDEGs. The
   network was visualized using the Cytoscape 3.10.2 software
   ([93]https://manual.cytoscape.org/en/3.10.2/)^[94]48. The CytoHubba
   plugin in Cytoscape^[95]49 was employed to identify cGBs by applying
   six topological criteria: Closeness, Degree, EPC, MCC, MNC and DMNC.
   During the analysis, we identified genes that ranked highly across all
   six measures, as these represent nodes of critical importance to the
   network’s structure and function. Only genes that demonstrated
   significant scores in all six measures were considered key candidates.
   This integrative approach ensured the robustness of the selection
   process, as it minimized bias from any single metric and highlighted
   genes that are universally central to the network.
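
   The consensus selection across topological measures can be sketched
   as follows. The sketch assumes that per-gene scores for each measure
   have already been exported (for example from CytoHubba) and keeps only
   the genes that appear in the top k of every measure; the function
   names, the value of k and the toy scores are hypothetical.

```python
def top_k(scores, k):
    """Return the k genes with the highest score for one topological measure."""
    return set(sorted(scores, key=scores.get, reverse=True)[:k])


def consensus_hubs(score_tables, k):
    """Keep genes that rank in the top k for every measure in score_tables."""
    return set.intersection(*(top_k(table, k) for table in score_tables.values()))


# Toy scores for two measures and four placeholder genes (not study data).
toy_scores = {
    "Degree": {"G1": 30, "G2": 25, "G3": 5, "G4": 20},
    "Closeness": {"G1": 50.0, "G2": 48.0, "G3": 47.0, "G4": 10.0},
}
hubs = consensus_hubs(toy_scores, k=3)
print(sorted(hubs))
```

   Requiring membership in every top-k list is a deliberately strict
   rule: a gene that is central under only one measure is dropped.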

Verification of association of cGBs with T2D, PC, and KC using
independent datasets and databases

   To verify the association of cGBs with T2D, PC and KC through
   independent datasets and databases, we performed disease-cGBs
   interaction analysis and expression analysis of cGBs in T2D, PC and
   KC, as discussed in the following subsections 2.6.1–2.6.2.

Disease-cGBs interaction analysis

   We used the GeneCodis 4^[96]50 web tool with the DisGeNET
   database^[97]51 to perform disease-cGBs enrichment analysis, exploring
   the association of cGBs with different diseases, including T2D, KC
   and PC.

Expression analysis of cGBs with T2D, PC and KC based on independent datasets

   The differential expression patterns of cGBs in T2D, PC, and KC were
   verified using box plot analysis with independent datasets from NCBI.
   We used The Cancer Genome Atlas (TCGA) and Genotype-Tissue
   Expression (GTEx) databases through the GEPIA2 web tool
   ([98]http://gepia2.cancer-pku.cn/)^[99]52 to confirm the differential
   expression of cGBs between KC/PC and control samples. For these
   analyses, the cutoff thresholds were set at a p-value of 0.01 and a
   log2FC of 1. For validating the differential expression of cGBs between
   T2D and control samples, we used the independent dataset [100]GSE15932.
   Box plots were constructed to compare cGBs expression between T2D/KC/PC
   and control groups. Additionally, we developed a prediction model based
   on random forest (RF) using three independent expression profiles
   ([101]GSE36895 for KC, [102]GSE16515 for PC, and [103]GSE76896 for T2D)
   from the NCBI database and evaluated the predictive performance using
   ROC curves generated with the R package “ROCR”
   ([104]https://www.rdocumentation.org/packages/ROCR/versions/1.0-11)^[105]53.
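
   To make the evaluation metrics concrete, the sketch below computes the
   AUC (through its rank-based, Mann-Whitney formulation) together with
   TPR, TNR and accuracy from predicted class probabilities. It is not
   the ROCR-based workflow used in the study: the function names, the 0.5
   decision threshold and the toy predictions are assumptions, and the
   random forest model that would produce the probabilities is not shown.

```python
def auc(labels, scores):
    """Area under the ROC curve as the probability that a case outranks a control."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def confusion_metrics(labels, scores, threshold=0.5):
    """TPR, TNR and accuracy when scores >= threshold are called positive."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for l, p in zip(labels, preds) if l == 1 and p == 1)
    tn = sum(1 for l, p in zip(labels, preds) if l == 0 and p == 0)
    fp = sum(1 for l, p in zip(labels, preds) if l == 0 and p == 1)
    fn = sum(1 for l, p in zip(labels, preds) if l == 1 and p == 0)
    return {
        "TPR": tp / (tp + fn),
        "TNR": tn / (tn + fp),
        "ACC": (tp + tn) / len(labels),
    }


# Toy test-set labels (1 = disease, 0 = control) and predicted probabilities.
toy_labels = [1, 1, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.1]
toy_auc = auc(toy_labels, toy_scores)
toy_metrics = confusion_metrics(toy_labels, toy_scores)
print(toy_auc, toy_metrics)
```

   The AUC is threshold-free, whereas TPR, TNR and accuracy depend on the
   chosen cut-off, so the two kinds of score are best reported together.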

Disclosing common pathogenetic mechanisms of PC and KC with T2D

   To disclose the common pathogenetic mechanisms of PC and KC with
   T2D, functional enrichment analysis with gene ontology (GO) terms and
   Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways, regulatory
   network analysis with transcription factors (TFs) and microRNAs, and
   DNA methylation analysis were performed, as discussed in the following
   subsections 2.7.1–2.7.3.

Regulatory network analysis of cGBs

   We conducted regulatory network analysis with transcription factors
   (TFs) and microRNAs (miRNAs) to investigate the regulators of cGBs. To
   determine the primary TFs connected with cGBs, we analyzed the
   TFs-cGBs interaction network using the JASPAR database^[106]54. The
   significant miRNAs that act on cGBs at the post-transcriptional stage
   were identified by examining the miRNA-cGBs links in the
   TarBase^[107]55 database. NetworkAnalyst^[108]47 was used to
   reproduce these interactions. The post-transcriptional regulators of
   cGBs were selected from the top-ranked miRNAs. We used
   Cytoscape^[109]56 to visualize the interaction networks.

The cGBs -set enrichment analysis with GO-terms and KEGG-pathways

   The Gene Ontology (GO) project is a bioinformatics resource that uses
   domain-specific ontologies to provide a comprehensive source of
   functional data on gene products and descriptions of their
   activities^[110]57. To investigate the GO terms and KEGG pathways
   enriched with cGBs, we used the GeneCodis 4^[111]58 database, with a
   p-value of 0.03 chosen as the threshold.

DNA methylation analysis of cGBs in PC and KC

   DNA methylation of cGBs involves adding methyl groups to their DNA,
   influencing gene expression and regulation. Statistically, analyzing
   these methylation patterns helps to identify significant changes in
   gene activity associated with diseases, providing insights into gene
   regulation mechanisms and contributing to understanding disease
   processes. MethSurv web tools ([112]https://biit.cs.ut.ee/methsurv/)
   with TCGA-KIRC methylation data were used to investigate DNA
   methylation, a complex epigenetic process that controls gene expression
   in both normal and malignant cells^[113]59. DNA methylation values
   (ranging from 0 to 1) were represented by β values, which were computed
   as M/ (M + U + 100) for every CpG site. The intensities of methylation
   and unmethylation are represented by M and U, respectively. To assess
   the impact on patient survival, we classified the methylation levels
   into two groups relative to a cut-off point: lower (methylation β
   value below the cut-off point) and higher (methylation β value above
   the cut-off point). The grouping cut-off point can be obtained from
   the data quantiles or the mean.
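
   The β-value formula and the two-group split can be sketched as below.
   The sketch assumes the median as the default cut-off (one of the
   quantile-based options mentioned above) and assigns values equal to
   the cut-off to the higher group; both choices, as well as the function
   names and toy intensities, are assumptions for illustration.

```python
from statistics import median


def beta_value(m, u, offset=100):
    """Methylation beta value M / (M + U + offset) for one CpG site."""
    return m / (m + u + offset)


def split_by_cutoff(betas, cutoff=None):
    """Split samples into (lower, higher) groups around a beta-value cut-off.

    The median is used when no cut-off is given; samples exactly at the
    cut-off are placed in the higher group (an assumption of this sketch).
    """
    if cutoff is None:
        cutoff = median(betas.values())
    lower = [s for s, b in betas.items() if b < cutoff]
    higher = [s for s, b in betas.items() if b >= cutoff]
    return lower, higher


# Toy methylated (M) and unmethylated (U) intensities for four samples.
toy_intensities = {"P1": (900, 0), "P2": (100, 800), "P3": (450, 450), "P4": (700, 200)}
toy_betas = {s: beta_value(m, u) for s, (m, u) in toy_intensities.items()}
lower_group, higher_group = split_by_cutoff(toy_betas)
print(toy_betas, lower_group, higher_group)
```

   The constant offset of 100 keeps β bounded and stable when both
   intensities are small.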

The cGBs -guided drug repurposing

   To explore cGBs-guided repurposable common drug molecule for T2D, PC
   and KC, we performed molecular docking and ADME/T analysis as discussed
   in the subsections 2.8.1–2.8.2.

Exploring candidate drugs by molecular docking

   To explore potential repurposable drug molecules, we performed
   molecular docking between six cGBs and their top two associated
   transcription factors (TFs) as target receptors and repurposable drug
   molecules using AutoDock Vina^[114]60. To identify cGBs-guided drug
   molecules, we gathered 434 candidate molecules from published articles
   and online databases related to T2D, PC, and KC, as detailed in Tables
   S1, S2 & S3. Receptor proteins’ 3D structures were obtained from
   Protein Data Bank^[115]61 and AlphaFold databases^[116]62. All 434 T2D,
   PC and KC-related meta-drug candidates’ 3D structures were taken from
   the PubChem database^[117]63. Following this, the binding affinity
   scores (in kcal/mol) between receptors and ligands (drug molecules)
   were determined through molecular docking. The receptor proteins were
   arranged in descending order based on the average binding affinity
   scores for each row, while the drug candidates were ranked according to
   the average scores in each column of the score matrix. This method
   allowed for the selection of the top-ranked drug molecules for further
   analysis.
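
   The ranking of the docking score matrix can be illustrated with a
   small sketch. It assumes a receptor-by-drug matrix of binding
   affinities in kcal/mol and treats more negative averages as stronger
   binding, so both receptors and drugs are sorted from the most negative
   mean upward; this ordering convention, the function name and the
   placeholder labels and scores are assumptions and do not come from the
   study.

```python
def rank_by_mean(score_matrix):
    """Rank receptors by row means and drugs by column means (most negative first)."""
    receptors = list(score_matrix)
    drugs = list(next(iter(score_matrix.values())))
    row_mean = {r: sum(score_matrix[r].values()) / len(drugs) for r in receptors}
    col_mean = {
        d: sum(score_matrix[r][d] for r in receptors) / len(receptors) for d in drugs
    }
    ranked_receptors = sorted(receptors, key=row_mean.get)
    ranked_drugs = sorted(drugs, key=col_mean.get)
    return ranked_receptors, ranked_drugs, col_mean


# Placeholder binding affinities in kcal/mol (not study data).
toy_matrix = {
    "R1": {"D1": -9.0, "D2": -6.0, "D3": -7.5},
    "R2": {"D1": -8.0, "D2": -5.0, "D3": -8.0},
}
ranked_receptors, ranked_drugs, drug_means = rank_by_mean(toy_matrix)
top_drugs = ranked_drugs[:2]  # keep the two best-scoring candidates
print(ranked_receptors, ranked_drugs, top_drugs)
```

   Averaging over all receptors favours drugs that bind consistently
   well across targets rather than strongly to a single one.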

ADME/T analysis

   ADME/T analysis evaluates a drug candidate’s absorption, distribution,
   metabolism, excretion, and toxicity to predict its safety and efficacy.
   In drug repurposing, a compound may demonstrate strong binding to a new
   target during docking studies, indicating potential for a new
   therapeutic application. However, if it does not meet ADMET
   criteria—such as poor absorption, quick metabolism, or high toxicity—it
   is unlikely to succeed in its new role. By filtering out unsuitable
   candidates, ADMET analysis ensures that only those with favorable
   pharmacokinetic and safety profiles progress further in the repurposing
   process. We analyzed the drug-like properties and ADME/T (absorption,
   distribution, metabolism, excretion, and toxicity) profiles of the top
   six ranked drug compounds to better understand their structural
   features and chemical descriptors. The SCFBio web application^[118]64
   was used to evaluate compliance with Lipinski’s rule. ADME/T parameters
   were then predicted using the online databases SwissADME^[119]65 and
   pkCSM^[120]66, utilizing the optimal structures of the drug compounds
   in SMILES format for the calculations.
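
   As an illustration of the drug-likeness filter, the sketch below
   counts violations of Lipinski's rule of five from precomputed
   descriptors. It assumes that molecular weight, logP and hydrogen-bond
   donor/acceptor counts have already been obtained from a descriptor
   tool, and it allows at most one violation; the function names and the
   placeholder compounds are hypothetical.

```python
def lipinski_violations(mol_weight, logp, h_donors, h_acceptors):
    """Count violations of Lipinski's rule of five for one compound."""
    checks = [
        mol_weight > 500,   # molecular weight above 500 Da
        logp > 5,           # octanol-water partition coefficient above 5
        h_donors > 5,       # more than 5 hydrogen-bond donors
        h_acceptors > 10,   # more than 10 hydrogen-bond acceptors
    ]
    return sum(checks)


def is_drug_like(descriptors, max_violations=1):
    """Accept a compound when it breaks at most max_violations of the four rules."""
    return lipinski_violations(**descriptors) <= max_violations


# Placeholder descriptor values (not measured properties of any real drug).
toy_compounds = {
    "cmpd_small": {"mol_weight": 350.0, "logp": 2.1, "h_donors": 2, "h_acceptors": 5},
    "cmpd_heavy": {"mol_weight": 560.0, "logp": 3.0, "h_donors": 1, "h_acceptors": 8},
    "cmpd_large": {"mol_weight": 720.0, "logp": 6.3, "h_donors": 6, "h_acceptors": 12},
}
drug_like = {name: is_drug_like(desc) for name, desc in toy_compounds.items()}
print(drug_like)
```

   Compounds that pass this coarse filter would still need the fuller
   ADME/T profiling described above before being prioritized.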

Results

Identification of DEGs by weighted gene co-expression network analysis
(WGCNA)

   Differentially expressed genes (DEGs) for T2D, PC, and KC were
   identified using the “CEMiTool” and “WGCNA” (Weighted correlation
   network analysis) approaches across three datasets: [121]GSE36895,
   [122]GSE16515, and [123]GSE76896. Initially, the “CEMiTool” analysis
   identified 4,589 DEGs from [124]GSE36895, 4,233 DEGs from
   [125]GSE16515, and 7,548 DEGs from [126]GSE76896. These DEGs were
   further refined using the “WGCNA” approach. For each filtered gene
   expression matrix, a soft threshold β value was chosen (14 for
   [127]GSE76896, 11 for [128]GSE16515 and 15 for [129]GSE36895) based on
   a cutoff R^2 value of 0.85 (Figure [130]S1). Following that, modules
   were detected by hierarchical clustering with a minimum module size
   of 30. To merge the modules, the cut height of the module eigengenes
   was set to 0.15 for [131]GSE76896, 0.09 for [132]GSE16515 and 0.13 for
   [133]GSE36895 (Figure S2). To uncover the relationship between modules
   and clinical traits (KC/PC/T2D versus control samples), we selected
   six modules from [134]GSE76896, four modules from [135]GSE16515 and
   six modules from [136]GSE36895 based on the module-trait relationship
   (correlation greater than 0.6 or less than − 0.6, along with
   p-value < 4e-08) (Figure S3).
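
   The module selection rule can be sketched as a correlation between
   each module eigengene and the binary disease trait, keeping modules
   with |r| ≥ 0.6. The sketch omits the p-value filter and the upstream
   WGCNA steps (soft thresholding, clustering, module merging); the
   function names, module labels and eigengene values are hypothetical.

```python
import math


def pearson(a, b):
    """Pearson correlation between a module eigengene and a binary trait."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def select_modules(eigengenes, trait, min_abs_r=0.6):
    """Return {module: r} for modules whose |r| with the trait reaches min_abs_r."""
    correlations = {m: pearson(values, trait) for m, values in eigengenes.items()}
    return {m: r for m, r in correlations.items() if abs(r) >= min_abs_r}


# Trait coding as in the text: 1 = disease sample, 0 = control sample.
toy_trait = [1, 1, 1, 0, 0, 0]
# Toy module eigengenes, one value per sample (not study data).
toy_eigengenes = {
    "blue": [0.5, 0.4, 0.6, -0.5, -0.4, -0.6],
    "grey": [0.1, -0.1, 0.0, 0.1, -0.1, 0.0],
    "red": [-0.5, -0.6, -0.4, 0.4, 0.6, 0.5],
}
selected = select_modules(toy_eigengenes, toy_trait)
print({m: round(r, 3) for m, r in selected.items()})
```

   Both strongly positive and strongly negative modules are retained,
   since either direction indicates an association with disease status.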

Identification of common differentially expressed genes (cDEGs)

   We identified a total of 85 common differentially expressed genes
   (cDEGs), including 55 upregulated and 30 downregulated genes, across
   three comparisons: control vs. T2D, control vs. PC, and control vs. KC.
   These cDEGs were visualized using a Venn diagram, as shown in
   Fig. [137]2 (see also Table S7). The Venn diagrams were made using the
   Venn Web Tool at
   [138]https://bioinformatics.psb.ugent.be/webtools/Venn/. DEGs-sets for
   each disease were pasted into the tool, and it generated the diagram
   showing overlaps and differences between the sets. The diagram was then
   customized and downloaded for further use.

Fig. 2.

   Common DEGs and the trends of their aLog[2]FC values in T2D, PC and KC.
   (A) Upregulated cDEGs; (B) Downregulated cDEGs and, (C) The trends of
   aLog[2]FC values of cDEGs in T2D, PC and KC.

Local genetic association among T2D, PC and KC through cDEGs

   To understand the link of T2D with PC and KC, we computed pairwise
   local correlation coefficients using Eq. [141]2 for T2D, PC and KC
   based on the aLog[2]FC values of cDEGs (see Fig. [142]2C; Table [143]1
   and Table S8). The correlation coefficient between each pair of the
   three diseases was ≥ 0.83 (see Table [144]1), indicating that T2D, PC
   and KC are locally associated with each other through the expressions
   of cDEGs.

Table 1.

   Local association among T2D, PC and KC through cDEGs.
         T2D          PC           KC
   T2D   1            0.917438     0.875406875
   PC    0.917438     1            0.83596749
   KC    0.875406875  0.83596749   1

Identification of three disease-causing common genomic biomarkers (cGBs)

   The protein-protein interaction (PPI) network of cDEGs was built; it
   consists of 76 nodes and 459 edges. We selected the top-ranked six
   cGBs (ALB, MUC1, TOP2A, BIRC5, RRM2 and E2F7) based on six topological
   measures with the thresholds Degree = 20, Closeness = 46.16,
   EPC = 14.35, MNC = 20, Betweenness = 245.0877 and Radiality = 3.37 in
   the PPI network. Genes that ranked highly across all six measures were
   identified as key candidates, ensuring their critical importance to the
   network. This integrative approach minimized bias and highlighted genes
   central to the network’s structure and function. Of the six cGBs, five
   were upregulated and one was downregulated. The scores of the six
   topological measures for the six cGBs are shown in Table S9, and the
   PPI network is displayed in Fig. [146]3.

Fig. 3.

   Protein-protein interactions (PPIs) network analysis of cDEGs. Nodes in
   an Octagon shape with yellow color indicate the cGBs.

Verification of association of cGBs with T2D, PC, and KC using
independent datasets and databases

   To verify the association of cGBs with T2D, PC and KC through
   independent datasets and databases, we performed disease-cGBs
   interaction analysis and expression analysis of cGBs in T2D, PC and
   KC, as discussed in the following subsections 3.5.1–3.5.2.

Disease-cGBs interaction analysis

   The disease-cGBs interaction analysis showed that the top-ranked 25
   diseases (including T2D, PC and KC) are significantly associated
   with cGBs (Table S10). A p-value < 0.03 was chosen as the cutoff for
   the significance test.

Expression analysis of cGBs with T2D, PC and KC based on independent datasets

   Then, we verified the differential expression patterns of cGBs in two
   independent databases (GTEx and TCGA) for PC and KC through box plot
   analysis; together they contained 171 controls and 179 PC samples, and
   100 controls and 523 KC samples (Figure S4A). We also examined the
   differential expression patterns of cGBs between T2D and control
   samples using box plot analysis. This analysis was based on the gene
   expression profiles from the NCBI dataset with accession ID
   [149]GSE15932, which includes 8 T2D and 8 control samples; the
   pancreatic cancer samples in this dataset were excluded from our
   analysis (Figure S4B). We found that five cGBs are upregulated and one
   is downregulated in these three diseases, which supports our findings.
   To assess the prediction performance of cGBs, we developed a Random
   Forest (RF) based prediction model using 60% of the samples as
   training data; the remaining 40% were used as test data. We also
   considered three additional independent test datasets ([150]GSE71989
   for PC, [151]GSE66272 for KC and [152]GSE19420 for T2D) from the NCBI
   database. For each of the three diseases, we constructed ROC curves
   for both test datasets (Figure S5) and calculated performance scores
   (AUC, TPR, TNR, and accuracy) (Table S11). The performance of the cGBs
   in both prediction models was strong, with an AUC greater than 0.98
   and an accuracy (ACC) exceeding 0.83.

Disclosing common pathogenetic mechanisms of PC and KC with T2D

   To disclose the common pathogenetic mechanisms of PC and KC with
   T2D, functional enrichment analysis with gene ontology (GO) terms and
   Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways, regulatory
   network analysis with transcription factors (TFs) and microRNAs
   (miRNAs), and DNA methylation analysis were performed, as discussed in
   the following subsections 3.6.1–3.6.3.

The cGBs-set enrichment analysis with GO-terms and KEGG-pathways

   We performed GO and KEGG pathway enrichment analysis for the six cGBs
   by Enrichr to explore the shared pathogenetic processes of T2D, KC
   and PC. Table [153]2 gives the top-ranked five biological processes
   (BPs), molecular functions (MFs), and KEGG pathways.

Table 2.

   The top five significantly (p-value < 0.03) enriched GO terms and KEGG
   pathways with cGBs by GeneCodis 4.

   Biological Process
   GO ID       Description                                          Adj. p-value  Annotated cGBs
   GO:0030330  DNA Damage Response                                  4.70E-05      MUC1, E2F7
   GO:2000045  Regulation of G1/S Transition of Mitotic Cell Cycle  4.12E-04      RRM2, E2F7
   GO:0090068  Positive Regulation of Cell Cycle Process            5.10E-04      MUC1, BIRC5, E2F7
   GO:0042981  Regulation of Apoptotic Process                      8.05E-04      TOP2A, ALB, BIRC5
   GO:1903490  Positive Regulation of Mitotic Cytokinesis           0.001499      BIRC5

   Molecular Function
   GO ID       GO function                   Adj. p-value  Associated cGBs
   GO:0008301  DNA Binding, Bending          0.00509       TOP2A
   GO:0030170  Pyridoxal Phosphate Binding   0.00598       ALB
   GO:0008170  N-methyltransferase Activity  0.00807       MUC1
   GO:0005080  Protein Kinase C Binding      0.01075       TOP2A
   GO:0005507  Copper Ion Binding            0.01342       ALB

   KEGG Pathways
   KEGG ID   KEGG function           Adj. p-value  Associated cGBs
   hsa00240  Pyrimidine metabolism   0.01668       RRM2, ALB, BIRC5
   hsa00480  Glutathione metabolism  0.01698       RRM2, MUC1
   hsa04115  p53 signaling pathway   0.02170       RRM2, TOP2A
   hsa00983  Drug metabolism         0.03196       RRM2
   hsa05200  Pathways in cancer      0.03808       BIRC5

Regulatory network analysis of cGBs

   We employed cGBs-TF-miRNA coregulatory network to detect the regulators
   of cGBs at transcriptional and post-transcriptional levels. We found
   the top-ranked three TFs (NFIC, GATA2 and KLF5) based on degree and
   betweenness with a threshold degree of 3 and betweenness of 44.38, and
   two miRNAs (hsa-mir-16-5p and hsa-mir-103a-3p) based on degree and
   betweenness with a threshold degree of 5 and betweenness of 632.36
   (Fig. [155]4).

Fig. 4.

   cGBs-TF-miRNA coregulatory networks. Here cGBs are marked in red
   with octagon shapes, TFs in blue with ellipses, and miRNAs in green
   with diamond shapes. Key TFs and miRNAs are on the right-hand side of
   the figure.

DNA methylation of cGBs in KC and PC

   DNA methylation is an epigenetic mechanism that regulates gene
   expression by recruiting repressive proteins or inhibiting
   transcription factor binding to DNA^[158]61. We examined the DNA
   methylation status at CpG sites for all cGBs (ALB, BIRC5, E2F7, MUC1,
   TOP2A, and RRM2) using MethSurv. Our analysis revealed that ALB, BIRC5,
   and E2F7 were hypomethylated, while MUC1, TOP2A, and RRM2 were
   hypermethylated, with all cGBs showing significant CpG sites
   (p-value ≤ 0.01) (Table [159]3).

Table 3.

   The significant prognostic value of CpG in cGBs.
   cGBs  Cancer Gene_Group    CpG_Island CPG Name   HR    P-Value
   ALB   KIRC   Body          Open_Sea   cg04450599 2.689 0.000423928
   ALB   KIRC   3’UTR         Open_Sea   cg08368094 0.364 1.58E-06
   ALB   KIRC   Body          Open_Sea   cg24656976 0.617 0.023928486
   ALB   PAAD   3’UTR         Open_Sea   cg08368094 0.555 0.013474885
   BIRC5 KIRC   Body          S_Shelf    cg04972436 1.839 0.012255
   BIRC5 PAAD   Body          Island     cg17515702 0.664 0.0012672
   E2F7  KIRC   5’UTR         Island     cg04410862 0.521 0.001255244
   E2F7  KIRC   TSS1500       Island     cg06422060 0.501 0.005080302
   E2F7  KIRC   5’UTR;1stExon Island     cg12408293 0.44  0.00080281
   E2F7  KIRC   TSS1500       Island     cg16975973 0.502 0.00523938
   E2F7  KIRC   3’UTR         Open_Sea   cg22480875 2.349 0.001012358
   E2F7  KIRC   5’UTR;1stExon Island     cg24594830 0.468 0.001701919
   E2F7  PAAD   5’UTR         Island     cg04410862 0.396 0.000789579
   E2F7  PAAD   TSS1500       Island     cg18463599 2.013 0.002314531
   MUC1  KIRC   Body          N_Shore    cg18804777 0.467 0.003702
   MUC1  KIRC   Body          N_Shore    cg20949223 0.315 5.26E-05
   MUC1  KIRC   TSS200        N_Shore    cg22531371 0.503 0.00053
   MUC1  KIRC   Body          N_Shore    cg24512973 0.465 0.000158
   MUC1  PAAD   Body          N_Shore    cg18804777 0.587 0.009881
   MUC1  PAAD   Body          N_Shore    cg20949223 0.288 6.76E-06
   MUC1  PAAD   Body          N_Shore    cg24512973 0.4   0.000374
   RRM2  KIRC   Body          Island     cg00506866 0.407 0.000728
   RRM2  KIRC   1stExon       Island     cg16504939 2.653 0.000222
   RRM2  KIRC   TSS1500       N_Shore    cg18623836 2.177 0.00245
   RRM2  KIRC   TSS1500       Island     cg18639038 0.583 0.007558
   RRM2  PAAD   Body          Island     cg02237186 0.551 0.006875
   TOP2A KIRC   TSS1500       S_Shore    cg09273772 2.208 0.000119
   TOP2A KIRC   TSS1500       S_Shore    cg17504397 1.995 0.000455
   TOP2A KIRC   Body          Island     cg25581784 0.455 0.000364
   TOP2A PAAD   TSS1500       S_Shore    cg26007540 1.881 0.007965

The cGBs-guided drug repurposing

   To identify cGBs-guided repurposable drug molecules for T2D, PC, and
   KC, we conducted molecular docking and ADME/T analyses as detailed in
   subsections 3.7.1–3.7.2.

Exploring candidate drugs by molecular docking

   To identify cGBs-guided repurposable drug molecules, we performed
   molecular docking between cGBs-mediated receptors and candidate drug
   molecules. We obtained the 3D structures of six receptors (TOP2A, RRM2,
   ALB, MUC1, GATA2, and KLF5) from the Protein Data Bank (PDB) using the
   following PDB codes 1zxm, 3olj, 2Z5V, 2acm, 5o9b, and 2ebt,
   respectively. The remaining targets (BIRC5, NFIC, and E2F7) were
   sourced from the AlphaFold Protein Structure Database with UniProt IDs
   [161]Q5RAH9, [162]P08651, and [163]Q96AV8. Through molecular docking
   analysis, we calculated the binding affinity scores (BAS) between the
   receptors and candidate drugs. We ranked the receptors and drug agents
   based on the row and column averages of the BAS matrix (Table S12). Out
   of 434 drugs, the top six potential candidates (NVP.BHG712, Irinotecan,
   Olaparib, Imatinib, RG-4733, and Linsitinib) were selected as
   multi-targeted drugs, all showing significant BAS values of <
   -7.0 kcal/mol with all target proteins (Fig. [164]5A).
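
   The ranking by row and column averages described above can be sketched
   as follows. This is a minimal illustration, not the study's pipeline:
   the 3 x 3 matrix, the labels R1 to R3 and D1 to D3, and the function
   names are hypothetical, whereas the actual matrix covers nine targets
   and 434 drugs (Table S12). More negative BAS values indicate stronger
   predicted binding, so both rankings sort in ascending order.

```python
def rank_by_mean(bas, receptors, drugs):
    """bas[i][j] is the binding affinity (kcal/mol) of receptor i with drug j."""
    row_mean = {r: sum(bas[i]) / len(drugs) for i, r in enumerate(receptors)}
    col_mean = {
        d: sum(row[j] for row in bas) / len(receptors)
        for j, d in enumerate(drugs)
    }
    # Most negative average first, i.e. strongest average binding first.
    ranked_receptors = sorted(receptors, key=row_mean.get)
    ranked_drugs = sorted(drugs, key=col_mean.get)
    return ranked_receptors, ranked_drugs


def multi_target_drugs(bas, drugs, cutoff=-7.0):
    """Drugs whose affinity is below the cutoff for every receptor."""
    return [
        d for j, d in enumerate(drugs)
        if all(row[j] < cutoff for row in bas)
    ]


# Hypothetical binding affinity scores (kcal/mol); rows are receptors.
receptors = ["R1", "R2", "R3"]
drugs = ["D1", "D2", "D3"]
bas = [
    [-9.1, -7.5, -6.0],
    [-8.4, -7.2, -6.8],
    [-8.9, -6.9, -5.5],
]

ranked_receptors, ranked_drugs = rank_by_mean(bas, receptors, drugs)
multi = multi_target_drugs(bas, drugs)
print(ranked_receptors, ranked_drugs, multi)
```

   Here D2 ranks second by column average but is not kept as a
   multi-targeted drug, because one of its scores (-6.9 kcal/mol) does
   not pass the -7.0 kcal/mol cut-off.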

Fig. 5.

   Drug-target binding affinity matrices. (A) The X-axis shows the 50
   top-ranked drug agents (out of 434) and the Y-axis shows the ordered
   proposed receptor proteins. (B) The Y-axis shows the top 6 proposed and
   published drugs (out of 434), marked with a red tint. On the X-axis,
   blue indicates the 10 independent receptors of T2D, red the 10
   independent receptors of PC, and black the 10 independent receptors of
   KC.

   To verify their binding performance against other independent
   receptors, we considered the top-ranked six key genes for each of T2D,
   PC and KC separately, selected by literature review. The proposed drug
   molecules bound strongly (BAS < -7.0 kcal/mol) to most of the
   independent receptors (Fig. [167]5B & Table S13). Therefore, these six
   top-ranked cGBs-guided lead drugs could be promising candidate drug
   molecules for the treatment of KC and PC with T2D. To provide more
   information about the proposed drugs and targets, the three top-ranked
   drug-target complexes, with their three-dimensional (3D) views and
   interacting residues, are given in Table S14.

ADME/T analysis

   All six drugs selected by molecular docking analysis (NVP-BHG712,
   Irinotecan, Olaparib, Imatinib, RG-4733 and Linsitinib) satisfied at
   least four parameters of Lipinski’s rule of five, indicating drug-like
   characteristics (Table S15). To assess the efficacy and safety of the
   suggested drugs, we analyzed their toxicity and ADME (absorption,
   distribution, metabolism and excretion) profiles (Table [168]4). All
   six showed high predicted gastrointestinal absorption (HIA ≥ 68%), four
   of the six (NVP-BHG712, Irinotecan, Imatinib and Linsitinib) were
   predicted to be P-glycoprotein inhibitors (P-gpI), and almost all were
   classified as non-toxic based on AMES, LD50, and LC50 tests, with LC50
   values > 1.0 log mM. None is predicted to cross the blood-brain barrier
   readily (all LogBB values are below 0.3, and four of the six are below
   -1), but this limitation is not critical for non-CNS therapies. All
   candidates were predicted to inhibit CYP3A4, which was taken as
   supporting their metabolic suitability. Overall, these drugs show
   strong absorption and drug-like profiles, making them candidates for
   oral administration.

   The six drugs were then compared on their ADME/T properties, toxicity
   profiles, and drug-likeness characteristics, which highlighted distinct
   advantages and limitations for each. Imatinib emerged as a strong
   candidate due to its excellent gastrointestinal absorption (HIA:
   93.84%), strong Caco-2 permeability (1.09), low toxicity (LC50: 2.08
   log mM), and CYP3A4 inhibition. Similarly, NVP-BHG712 showed high
   absorption (HIA: 95.71%), moderate permeability (0.51), low toxicity
   (LC50: 1.73 log mM), and favorable metabolic properties. Olaparib also
   displayed strong absorption (HIA: 91.93%), strong Caco-2 permeability
   (1.08), and low toxicity (LC50: 1.97 log mM). RG-4733, while having the
   lowest HIA (68.00%) and a negative Caco-2 permeability (-0.59), showed
   minimal toxicity with the highest LC50 value (2.83 log mM). In
   contrast, Irinotecan showed mixed results, with the highest absorption
   score (HIA: 99.88%) and moderate permeability (0.648), but a higher
   toxicity (LC50: 0.79 log mM) that presents a challenge for its
   therapeutic application. Linsitinib, although exhibiting strong
   absorption (HIA: 93.26%) and the highest permeability (Caco-2: 1.188),
   was hindered by significant toxicity (negative LC50: -1.10 log mM),
   limiting its suitability as a safe drug candidate. Overall, Imatinib,
   NVP-BHG712, Olaparib, and RG-4733 displayed a favorable balance of
   absorption, safety, and metabolism, making them promising candidates
   for further exploration, while Irinotecan and Linsitinib may require
   optimization to improve their profiles.
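
   The drug-likeness screen mentioned at the start of this subsection can
   be sketched as a simple rule check. The sketch below uses the four
   classic Lipinski criteria (molecular weight ≤ 500 Da, logP ≤ 5, at most
   5 hydrogen-bond donors, at most 10 hydrogen-bond acceptors) and treats
   a compound as drug-like when it has at most one violation. The
   parameter set actually used for Table S15 is not listed in this
   section and may include a fifth descriptor, so this is an assumption,
   and the descriptor values below are hypothetical rather than taken
   from the study.

```python
def lipinski_violations(mol_weight, logp, h_donors, h_acceptors):
    """Count violations of the four classic Lipinski criteria."""
    checks = [
        mol_weight <= 500,   # molecular weight in Da
        logp <= 5,           # octanol-water partition coefficient
        h_donors <= 5,       # hydrogen-bond donors
        h_acceptors <= 10,   # hydrogen-bond acceptors
    ]
    return sum(not ok for ok in checks)


def is_drug_like(mol_weight, logp, h_donors, h_acceptors, max_violations=1):
    """Drug-like if no more than max_violations criteria are violated."""
    violations = lipinski_violations(mol_weight, logp, h_donors, h_acceptors)
    return violations <= max_violations


# Hypothetical descriptor values for two illustrative compounds.
compound_a = dict(mol_weight=450.0, logp=3.2, h_donors=2, h_acceptors=7)
compound_b = dict(mol_weight=650.0, logp=6.1, h_donors=2, h_acceptors=7)

violations_a = lipinski_violations(**compound_a)
violations_b = lipinski_violations(**compound_b)
print(violations_a, is_drug_like(**compound_a))
print(violations_b, is_drug_like(**compound_b))
```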

Table 4.

   ADME/T profile of top-ranked six drugs.
   Column groups: Absorption (Caco2 permeability, HIA, P-gpI);
   Distribution (BBB and CNS permeability); Metabolism (CYP3A4);
   Excretion (TC); Toxicity (hERGI, LC50, LD50).
   Compounds  Caco2  HIA (%) P-gpI BBB   CNS   CYP3A4 TC    hERGI LC50 (log mM) LD50 (mol/kg)
   NVP-BHG712 0.51   95.71   Yes   -1.05 -2.17 Yes    -0.08 No    1.73          3.27
   Irinotecan 0.648  99.88   Yes   -1.31 -3.23 Yes    0.93  No    0.79          2.81
   Olaparib   1.08   91.93   No    -0.85 -2.66 Yes    0.56  No    1.97          2.62
   Imatinib   1.09   93.84   Yes   -1.37 -2.51 Yes    0.71  No    2.08          2.9
   RG-4733    -0.59  68.00   No    -1.28 -3.12 Yes    0.37  No    2.83          2.48
   Linsitinib 1.188  93.26   Yes   -0.07 -1.86 Yes    0.73  No    1.10          2.68

   In summary, among the six drugs identified through molecular docking
   analysis, NVP-BHG712, Olaparib, Imatinib, and RG-4733 were the most
   favorable candidates based on their ADME/T profiles, while Irinotecan
   and Linsitinib were less favorable because of higher predicted
   toxicity. The more favorable drugs combined high intestinal absorption
   (HIA between 68.00% and 95.71%), LC50 values between 1.73 and 2.83 log
   mM, and predicted CYP3A4 inhibition, whereas Irinotecan (LC50: 0.79 log
   mM) and Linsitinib (LC50: -1.10 log mM) require further optimization
   despite their strong absorption (HIA: 99.88% and 93.26%,
   respectively).

   The criteria behind these assessments are as follows. The candidates
   are plausible oral medications because they are expected to be
   sufficiently absorbed in the gastrointestinal tract. A compound is
   considered well absorbed in the human intestine if its Human
   Intestinal Absorption (HIA) score is > 30%^[170]62,[171]63; all six
   proposed drugs had HIA scores of ≥ 68%. Four of the six (NVP-BHG712,
   Irinotecan, Imatinib and Linsitinib) were also predicted to be
   P-glycoprotein inhibitors (P-gpI). The blood-brain barrier (BBB)
   permeability index estimates a compound’s capacity to cross the BBB. A
   compound with a LogBB value below -1 is considered poorly distributed
   across the BBB, whereas a compound with a LogBB value of 0.3 or above
   can cross it. All six drugs had LogBB values below 0.3
   (Table [172]4), and four of them had values below -1, so none is
   expected to cross the BBB effectively; this limits central nervous
   system exposure but is not critical for non-CNS-targeted therapies.
   Based on the LogPS (CNS) values, the drugs are thought to reach the
   central nervous system only partially. Human cytochrome P450 (CYP)
   enzymes are membrane-bound hemoproteins that are crucial to
   homeostasis, drug detoxification, and cellular metabolism. CYPs from
   classes 1–3 are responsible for over 80% of oxidative metabolism and
   around 50% of the elimination of common clinical drugs in humans
   [144]. All six drugs were predicted to inhibit CYP3A4. Toxicity is
   inversely related to the LC50 value: a smaller value (< 1.0 log mM)
   denotes higher toxicity. Most of the proposed drugs had LC50 values
   greater than 1.0 log mM, with Irinotecan and Linsitinib as the
   exceptions noted above. The toxicity analyses, which evaluated AMES
   tests, the lethal dose LD50, and minnow toxicity LC50, indicated that
   the remaining compounds raised no toxicity flags. Consequently, they
   are predicted to be non-toxic, with drug-like properties and
   suitability for oral administration.
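
   The cut-offs quoted above (HIA > 30% for good absorption, LogBB < -1
   for poor BBB distribution, LogBB ≥ 0.3 for BBB crossing, and LC50 < 1.0
   log mM as a toxicity flag) can be applied to the Table 4 values with a
   short Python sketch. The function and key names are illustrative
   assumptions. The values are entered as printed in Table 4; note that
   Linsitinib’s LC50 is printed there as 1.10 log mM, whereas the running
   text quotes -1.10 log mM, in which case it would also be flagged.

```python
# Values as printed in Table 4: HIA (%), LogBB, and LC50 (log mM).
TABLE4 = {
    "NVP-BHG712": {"hia": 95.71, "log_bb": -1.05, "lc50": 1.73},
    "Irinotecan": {"hia": 99.88, "log_bb": -1.31, "lc50": 0.79},
    "Olaparib":   {"hia": 91.93, "log_bb": -0.85, "lc50": 1.97},
    "Imatinib":   {"hia": 93.84, "log_bb": -1.37, "lc50": 2.08},
    "RG-4733":    {"hia": 68.00, "log_bb": -1.28, "lc50": 2.83},
    "Linsitinib": {"hia": 93.26, "log_bb": -0.07, "lc50": 1.10},
}


def bbb_class(log_bb):
    """Classify BBB distribution using the cut-offs quoted in the text."""
    if log_bb < -1:
        return "poor"
    if log_bb >= 0.3:
        return "crosses"
    return "intermediate"


def summarize(profile):
    """Apply the absorption, BBB, and toxicity cut-offs to one compound."""
    return {
        "well_absorbed": profile["hia"] > 30,
        "bbb": bbb_class(profile["log_bb"]),
        "toxicity_flag": profile["lc50"] < 1.0,
    }


summary = {name: summarize(profile) for name, profile in TABLE4.items()}
poor_bbb = sorted(n for n, s in summary.items() if s["bbb"] == "poor")
flagged = sorted(n for n, s in summary.items() if s["toxicity_flag"])

for name, row in summary.items():
    print(name, row)
```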

Discussion

   Population-based studies have shown that type 2 diabetes (T2D) is
   linked to kidney cancer (KC)^[173]67 and pancreatic cancer
   (PC)^[174]68. In this study, we explored the genetic connection
   between these diseases by identifying 78 common differentially
   expressed genes (cDEGs) that distinguish T2D, PC, and KC samples from
   control groups through transcriptomic analysis (see Figs. [175]1 and
   [176]2). We then identified the top six cDEGs (ALB, BIRC5, E2F7, MUC1,
   TOP2A, and RRM2) as common genomic biomarkers (cGBs) by constructing
   a protein-protein interaction (PPI) network of the cDEGs
   (Fig. [177]3), and used them as targets for finding common drug
   molecules. The association of the cGBs with T2D, PC, and KC was
   further validated through literature review, disease-cGB interaction
   analysis, and expression analysis of the cGBs in these diseases.
   Additionally, functional enrichment analysis using Gene Ontology (GO)
   terms and KEGG pathways, regulatory network analysis involving
   transcription factors (TFs) and microRNAs, and DNA methylation
   analysis based on independent databases were conducted. A graphical
   summary of this study is provided in Figure S6. The association of the
   cGBs with T2D, PC and KC is also supported by previous individual
   studies of ALB^[178]69–[179]71, TOP2A^[180]72–[181]75,
   BIRC5^[182]76–[183]79, MUC1^[184]80–[185]82, RRM2^[186]83–[187]85 and
   E2F7^[188]86–[189]88, as displayed in Fig. [190]6A. We investigated
   the shared pathogenetic mechanisms of KC, PC, and T2D by analyzing the
   cGBs. The key enriched biological processes, molecular functions, and
   KEGG pathways include DNA damage response, cell cycle regulation,
   apoptosis, DNA binding, pyridoxal phosphate binding,
   N-methyltransferase activity, protein kinase C binding, copper ion
   binding, pyrimidine metabolism, glutathione metabolism, p53 signaling,
   drug metabolism, and cancer pathways (Table [191]2).

Fig. 6.

   Verification of the proposed cGBs and common drug molecules for T2D,
   PC and KC through literature review. (A) Verification of cGBs. (B)
   Verification of drug molecules: green ellipses represent FDA-approved
   drugs, yellow ellipses represent investigational drugs, and red
   ellipses represent non-FDA-approved drugs.

   Among them, BIRC5 is a key inhibitor of apoptosis, involved in
   regulating the cell cycle and inhibiting programmed cell death^[194]89.
   This gene is essential for cell cycle regulation and mitotic
   cytokinesis, both of which are vital for tumor progression in KC and
   PC^[195]76. Its overexpression is a well-established marker for poor
   prognosis in various cancers, including KC and PC. In T2D, its
   association with the regulation of apoptotic processes and positive
   regulation of cell cycle might reflect the dysregulation of cell
   survival and proliferation in insulin-resistant tissues^[196]79. The
   presence of BIRC5 in the p53 signaling pathway further connects its
   role in apoptosis regulation, highlighting its potential as both a
   prognostic marker and therapeutic target for tumors and diabetic
   complications, particularly kidney dysfunction^[197]90.

   MUC1 is a
   membrane-associated glycoprotein involved in several cellular
   functions, such as adhesion, signaling, and protecting against
   oxidative stress. It plays an essential role in regulating apoptosis
   and responding to DNA damage, which is vital for maintaining cell
   integrity^[198]91. In the context of T2D, MUC1 has been implicated in
   insulin signaling and glucose homeostasis, and its dysregulation may
   contribute to insulin resistance and dysfunction in pancreatic
   beta-cells^[199]82. In KC and PC, MUC1 aids in cancer cell migration
   and invasion, facilitating metastasis^[200]80,[201]81. Its
   participation in the glutathione metabolism pathway also highlights its
   role in mitigating oxidative damage, a common feature in both diabetes
   and cancer^[202]92.

   Albumin (ALB), primarily synthesized in the liver,
   is a key protein that helps maintain osmotic pressure and facilitates
   the transport of various molecules in the bloodstream. In T2D, ALB is
   closely associated with kidney dysfunction, with its presence in urine
   (albuminuria) serving as an early indicator of diabetic nephropathy.
   The involvement of albumin in regulating apoptotic processes and DNA
   damage response pathways suggests it could serve as a biomarker for
   renal cell injury in diabetes-related kidney disease^[203]69. In KC,
   albumin plays a significant role in the glutathione metabolism and drug
   metabolism pathways, underscoring its role in cellular defense
   mechanisms and response to chemotherapy, especially in the context of
   oxidative stress^[204]71. Research on PC further emphasizes ALB’s
   involvement in regulating cell survival and proliferation, pointing to
   its potential as a therapeutic target for both diabetes complications
   and cancer treatment^[205]72.

   E2F7 is a member of the E2F family of
   transcription factors and plays a role in controlling the G1/S
   transition of the cell cycle, a key checkpoint in cell division. It is
   involved in the DNA damage response and has been implicated in
   maintaining genomic stability. In T2D, the aberrant regulation of cell
   cycle progression could contribute to the pathological changes observed
   in tissues like the kidney^[206]88. In KC, E2F7’s regulatory role in
   the G1/S transition and cell cycle processes suggests its involvement
   in tumor proliferation and resistance to apoptosis, which is critical
   for cancer progression^[207]87. The connection to positive regulation
   of mitotic cytokinesis in PC underscores its potential contribution to
   uncontrolled cellular division and tumor metastasis^[208]86.

   TOP2A is
   an enzyme involved in DNA replication and mitotic chromosome
   segregation, essential for cell division. Its overexpression is
   frequently associated with rapid cell proliferation and is a known
   marker of cell cycle regulation in cancer cells. In T2D, it may
   influence the regulation of apoptotic processes and contribute to
   tissue remodeling and fibrosis in the kidney. In KC, it is often
   upregulated in tumor tissues, where it enhances DNA damage repair and
   cell survival, promoting cancer progression^[209]73,[210]74. The
   overexpression of TOP2A in pancreatic cancer (PC) has been linked to
   increased metastasis and poorer patient survival outcomes^[211]93. Its
   association with p53 signaling pathways further suggests its role in
   modulating responses to DNA damage, making it a valuable target for
   therapies aimed at both cancer treatment and preventing diabetic
   nephropathy^[212]94.

   RRM2 is a critical enzyme for the synthesis of
   deoxyribonucleotides, essential for DNA replication and repair. In the
   context of T2D, its regulation of the pyrimidine metabolism pathway
   could influence DNA replication in tissues affected by chronic
   hyperglycemia, such as in diabetic nephropathy^[213]85. In KC and PC,
   it is overexpressed in many tumor types, suggesting its role in
   sustaining the high proliferative rate of cancer cells^[214]83,[215]84.
   Its involvement in glutathione metabolism and the p53 signaling pathway
   underscores its potential role in cellular stress responses, especially
   in cancers that depend on rapid cell division and survival under
   oxidative stress conditions. The enzyme’s regulation in drug metabolism
   pathways also points to its potential involvement in drug resistance
   mechanisms^[216]95.

   These findings demonstrate that the cGBs identified
   in this study (ALB, BIRC5, E2F7, MUC1, TOP2A, and RRM2) are intricately
   connected to key biological processes that are dysregulated in T2D, KC,
   and PC. Their involvement in cell cycle regulation, apoptosis, DNA
   damage response, and metabolism pathways highlights their potential as
   biomarkers and therapeutic targets for these diseases.

   The Random Forest (RF) prediction model using cGBs effectively
   separated the three diseases (T2D, PC, and KC) from the control groups,
   with an AUC greater than 0.91 and an accuracy (ACC) over 0.83. This
   highlights the significance of cGBs in classifying these diseases
   (Figure S5, Table S11). Box-plot analysis revealed that five cGBs were
   significantly upregulated and one was downregulated in T2D, PC, and KC
   compared to the control groups, supporting our findings (Figure S4).
   Additionally, three transcription factors (NFIC, GATA2, and KLF5) and
   two miRNAs (hsa-mir-16-5p and hsa-mir-103a-3p) were identified as key
   transcriptional and post-transcriptional regulators of the cGBs
   (Fig. [217]4). NFIC enhances binding accessibility in all malignancies
   and is significantly overexpressed in KC cells^[218]96. One study
   identified GATA2 and NFIC as the best transcriptional regulatory
   signatures in PC patients^[219]97. GATA2 proteins influence the
   progression of KC toward a more aggressive stage^[220]98, and another
   study reported that GATA2 is an important risk factor for
   T2D^[221]99. According to a recent study, KLF5 is linked to the
   earliest stages and development of KC^[222]100. KLF5, a regulator of
   PC differentiation, is expressed in low-grade primary pancreatic
   tumors and pre-neoplastic lesions, where it maintains epithelial gene
   expression and promotes glandular epithelial organization in
   xenografts^[223]101. KLF5 is also linked to T2D-related renal
   complications^[224]102.
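
   For readers who wish to reproduce performance figures of the kind
   reported above for the Random Forest model, the following minimal
   sketch shows how AUC and accuracy (ACC) can be computed from predicted
   class scores. It does not train a Random Forest; the labels, scores,
   function names, and the 0.5 decision threshold are hypothetical
   assumptions. AUC is computed as the fraction of case-control pairs in
   which the case receives the higher score, with ties counted as one
   half.

```python
def auc_score(labels, scores):
    """Area under the ROC curve via pairwise comparison of cases and controls."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def accuracy(labels, scores, threshold=0.5):
    """Fraction of samples classified correctly at the given score threshold."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


# Hypothetical example: 1 = disease sample, 0 = control sample.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]

auc_value = auc_score(labels, scores)
acc_value = accuracy(labels, scores)
print(round(auc_value, 3), round(acc_value, 3))
```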

   We used molecular docking analysis to find possible drug candidates
   for the treatment of T2D, PC and KC (Fig. [225]5A, Table S12) and
   identified six top-ranked drug agents (NVP.BHG712, Irinotecan,
   Olaparib, Imatinib, RG-4733 and Linsitinib) out of 434 that show strong
   binding affinities with the target proteins (cGBs and their TFs).
   Subsequently, we evaluated these six molecules against 10 independent
   receptors for each of T2D, PC and KC (published by others), and the
   results supported our findings (Fig. [226]5B, Table S13). The
   literature review provided further support for the potential
   effectiveness of our suggested drugs as treatments for T2D, PC and KC
   individually.
   The supporting studies for NVP.BHG712^[227]26,[228]103–[229]105,
   Irinotecan^[230]106–[231]108, Olaparib^[232]109–[233]111,
   Imatinib^[234]111–[235]113, RG-4733^[236]114–[237]116 and
   Linsitinib^[238]117–[239]119 are displayed in Fig. [240]6B. All of our
   proposed drugs were supported as common candidate molecules for T2D,
   PC, and KC based on individual studies of each disease. Among them,
   the FDA has approved Irinotecan and Olaparib (DrugBank accession
   numbers DB00762 and DB09074) for the treatment of PC^[241]106. It has
   been suggested that combining Olaparib with other drugs, such as
   Irinotecan, is essential for the treatment of PC^[242]109. The kinase
   inhibitor NVP-BHG712 affects blood flow, oxygen shortage, and the
   development of cancer^[243]103. Additionally, it specifically inhibits
   EphB4 kinase, a potential therapeutic target for T2D and insulin
   resistance^[244]104,[245]105. NVP-BHG712 has also been suggested as a
   possible PC treatment^[246]26. According to the DrugBank database,
   RG-4733 is an experimental medication being investigated for cancer
   (accession number DB11870). This new inhibitor of gamma secretase, an
   essential part of the enzyme complex that cleaves and activates Notch,
   is being investigated as an anti-cancer drug^[247]114. The
   FDA-approved cancer drug Imatinib (DB00619) has been associated with
   renal failure in 14% of KC patients treated with 800 mg, particularly
   those who have had a nephrectomy in the past^[248]112. Imatinib also
   increases the anti-tumor efficacy of gemcitabine in drug-resistant PC
   xenografts^[249]113. The experimental cancer medication Linsitinib
   (DrugBank accession number DB06075), which inhibits insulin-like
   growth factor receptor-1 (IGF-1R), affects PC development and
   metastasis in different ways^[250]26,[251]117.

   To evaluate the drug molecules computationally, we conducted ADME/T
   analysis and assessed their drug-likeness. This evaluation supported
   the suitability of the proposed drug molecules. Each identified drug
   molecule complied with at least four parameters of Lipinski’s rule of
   five, indicating suitability as a drug candidate (Table S15). The six
   drug molecules showed favorable ADME/T profiles (Table [252]4), high
   HIA percentages ranging from 68 to 99.88%, sufficient water
   solubility, and no carcinogenic characteristics. Among the screened
   drugs, NVP-BHG712, Olaparib, Imatinib, and RG-4733 stand out for their
   favorable drug-likeness profiles, good absorption potential, and low
   predicted toxicity, making them the stronger candidates for further
   evaluation in treating T2D, PC, and KC. These drugs show strong
   predicted pharmacokinetic properties, suggesting better
   bioavailability. In contrast, Irinotecan and Linsitinib appear less
   favorable because of their higher predicted toxicity, though they may
   still be considered for combination therapies based on clinical needs.
   Thus, the findings of this study might be vital resources for the
   diagnosis and therapy of PC and KC patients who are also suffering
   from T2D. However, a limitation of this study is that the proposed
   T2D-, PC- and KC-causing cGBs and candidate therapeutic agents have not
   yet been validated in patients suffering from these three diseases.

Conclusion

   Initially, we identified 78 common differentially expressed genes
   (cDEGs) associated with PC and KC in the context of T2D. Expression
   analysis of these cDEGs revealed a genetic connection between PC, KC,
   and T2D. We then pinpointed six top-ranked cDEGs (ALB, BIRC5, E2F7,
   MUC1, TOP2A, and RRM2) as common genomic biomarkers (cGBs) for these
   diseases. The differential expression patterns of these cGBs were
   validated using independent datasets from NCBI, TCGA, and GTEx
   databases for T2D, PC, and KC. We explored shared pathogenetic
   mechanisms of PC, KC, and T2D through cGBs-set enrichment analysis,
   including biological processes, molecular functions, cellular
   components, and KEGG pathways. We also conducted regulatory network
   analysis involving cGBs, transcription factors (TFs), proteins, and
   miRNAs, as well as DNA methylation analysis. Ultimately, we identified
   six top-ranked candidate drug molecules (NVP-BHG712, Olaparib,
   Imatinib, RG-4733, Irinotecan and Linsitinib) as potential treatments
   for PC and/or KC with T2D. Both the proposed genomic biomarkers and
   drug molecules were supported by literature reviews of individual
   studies on T2D, PC, and KC. This study’s findings provide valuable
   resources for developing effective common drugs for treating PC and/or
   KC with T2D.

Electronic supplementary material

   Below is the link to the electronic supplementary material.
   [253]Supplementary Material 1^ (2.7MB, docx)

Author contributions

   Conceptualization: A.A. and M.N.H.M. Data curation and processing:
   A.A., S.M. and T.N. Transcriptomic data analysis: A.A. and A.S.
   Molecular docking analysis: A.A. and S.M. Methodology: A.A., T.N. and
   M.N.H.M. Validation & Visualization: R.A. and S.M. Writing – original
   draft: A.A. Writing – review & editing: R.A., A.S. and M.N.H.M. Project
   administration & Supervision: M.N.H.M. All authors have read and
   approved the final manuscript.

Data availability

   The datasets analyzed in this study are freely available and were
   downloaded from the NCBI database at the following links:
   https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE36895
   https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=gse16515
   https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE76896
   https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE71989
   https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE66272
   https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE19420
   https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE15932

Declarations

Competing interests

   The authors declare no competing interests.

Footnotes

   Publisher’s note

   Springer Nature remains neutral with regard to jurisdictional claims in
   published maps and institutional affiliations.

Supplementary Information

   The online version contains supplementary material available at
   10.1038/s41598-025-91875-3.

References