Abstract

   Pancreatic ductal adenocarcinoma (PDAC) is a highly aggressive
   malignancy, accounting for over 90% of pancreatic cancers, and is
   characterized by limited treatment options and poor survival rates.
   Systems biology provides in-depth insights into the molecular
   mechanisms of PDAC. In this context, novel algorithms and comprehensive
   strategies are essential for advancing the identification of critical
   network nodes and therapeutic targets within disease-related
   protein-protein interaction networks. This study employed a
   comprehensive computational strategy using the metaheuristic algorithm
   Trader to enhance the identification of potential therapeutic targets.
   Analysis of the expression data from the PDAC dataset ([30]GSE132956)
   involved co-expression analysis and clustering of differentially
   expressed genes to identify key disease-associated modules. The STRING
   database was used to construct a network of differentially expressed
   genes, and the Trader algorithm pinpointed the top 30 DEGs whose
   removal caused the most significant network disconnections. Enriched
   gene ontology terms included “Signaling by Rho GTPases,” “Signaling by
   receptor tyrosine kinases,” and “immune system.” Additionally, nine hub
   genes—FYN, MAPK3, CDK2, SNRPG, GNAQ, PAK1, LPCAT4, MAP1LC3B, and
   FBN1—were identified as central to PDAC pathogenesis. This integrated
   approach, combining co-expression analysis with protein-protein
   interaction network analysis using a metaheuristic algorithm, provides
   valuable insights into PDAC mechanisms and highlights several hub genes
   as potential therapeutic targets.

Supplementary Information

   The online version contains supplementary material available at
   10.1038/s41598-024-74252-4.

   Keywords: Pancreatic ductal adenocarcinoma, Therapeutic target, Network
   analysis, WGCNA, Trader algorithm

   Subject terms: Cancer, Computational biology and bioinformatics

Introduction

   Pancreatic cancer (PC) is the 12th most common cancer and the 7th
   leading cause of cancer-related deaths, with a very low 5-year survival
   rate of approximately 10%^[31]1–[32]3. The pathogenesis of this
   malignancy involves abnormal mutations, transcriptional reprogramming,
   epigenetic changes, and chromosomal alterations^[33]4–[34]6. Among the
   three types of pancreatic tumors, pancreatic ductal adenocarcinoma
   (PDAC) is the most common, accounting for over 90% of all PC
   cases^[35]7–[36]9. Most PDACs develop from non-invasive precancerous
   lesions that progress to overt adenocarcinoma through a stepwise
   process^[37]3,[38]10. This progression involves mutations in genes such
   as KRAS2, CDKN2A, TP53, and DPC4/SMAD4, and alterations in pathways
   like RAS, cell-cycle control, DNA damage repair, and
   TGF-β3^[39]11,[40]12. In terms of PDAC treatment, surgical removal of
   the lesions remains the primary therapeutic approach with curative
   potential^[41]4,[42]13. However, the significant risk of clinical
   morbidity and mortality associated with pancreatic excision highlights
   the need for early diagnosis and a deeper understanding of the
   underlying regulatory mechanisms of this malignancy to develop more
   effective treatments and improve patient survival rates^[43]14.

   Systems biology offers a holistic approach to studying biological
   systems in health and disease^[44]15,[45]16. Expression profiling, a
   key omics approach, is widely used to identify differential gene
   expression patterns. Analyzing this high-throughput data provides
   insights into disease molecular biology and aids in identifying novel
   therapeutic targets and prognostic biomarkers^[46]17,[47]18.
   Encouragingly, an increasing body of research utilizing diverse
   methodologies has advanced our understanding of the molecular
   mechanisms underlying PDAC and identified pivotal genes associated with
   this malignancy. A recent in silico study focusing on diabetic PDAC
   revealed 1,546 differentially expressed genes (DEGs) associated with
   metabolic pathways. Co-expression analysis of two datasets
   ([48]GSE74629 and [49]GSE15932) identified two significant gene
   modules, highlighting several key genes implicated in oxidative
   phosphorylation. Furthermore, through the analysis of the
   protein-protein interaction (PPI) network of disease-related gene
   modules and the application of a feature-selection method known as
   support vector machine recursive feature elimination (SVM-RFE), 21
   feature genes were identified that effectively differentiate diabetic
   PDAC from non-diabetic cases^[50]19. However, a notable limitation of
   this approach is its dependency on specific datasets. The classifier’s
   performance, evaluated with datasets [51]GSE74629 and [52]GSE15932,
   raises concerns about its generalizability to other independent
   datasets, potentially impacting the broader applicability of the
   findings. In another similar study, network-based analysis of
   microarray expression data from peripheral blood samples of PDAC
   patients identified pivotal genes implicated in protein
   phosphorylation, immune response, and signal transduction^[53]20. This
   study focused exclusively on identifying hub genes based on topological
   characteristics of the PPI network, specifically using metrics such as
   betweenness centrality.

   To improve the identification of key genes within disease-related PPI
   networks, the adoption of more sophisticated and comprehensive
   analytical tools is essential. Identifying driver nodes in a network is
   a computationally challenging task, classified as a non-deterministic
   polynomial (NP) problem. To address this, heuristic algorithms are
   commonly employed to identify these critical nodes, which are often
   potential therapeutic targets. However, heuristic algorithms can
   sometimes become trapped in local optima, limiting their effectiveness
   in identifying nodes that exert control over the network. To overcome
   this limitation, the current study utilized a metaheuristic algorithm,
   Trader, which outperforms most other metaheuristic approaches in
   optimizing the defined objective functions.

   In brief, we applied a comprehensive approach that combined Weighted
   Gene Co-expression Network Analysis (WGCNA) with the Trader algorithm
   to analyze a recently generated PDAC transcriptomics dataset
   ([54]GSE132956). This approach allowed us to explore disease-related
   co-expression and PPI networks. The Traderalgorithm effectively
   identifies critical genes by determining which nodes’ removal results
   in the maximum number of disconnected sub-networks within the network
   of DEGs^[55]21,[56]22. The identified hub genes were subsequently
   subjected to further investigation, including expression validation,
   single-cell analysis, identification of potential drug candidates, and
   correlation with immune cell infiltration. The study’s workflow is
   illustrated in Fig. [57]1.

Fig. 1.

   [58]Fig. 1
   [59]Open in a new tab

   The workflow of the present study is divided into four stages: dataset
   analysis (blue boxes), the WGCNA experiment (green boxes), the Trader
   application (yellow boxes), and finally, the identification and
   analysis of the hub genes (red boxes).

Results

DEGs identification, network construction, and pathway analysis

   Before analyzing the dataset, several preprocessing steps, including
   principal component analysis (PCA) and normalization methods, were
   conducted to ensure the accuracy of the core analysis. Initially, PCA
   was used to verify the quality of the dataset. The PCA plot indicated
   no outliers among the samples, and both groups (controls and cases)
   were clearly separated based on PC1 (Fig. [60]2A). Additionally,
   quantile normalization was applied to ensure that the expression
   distributions were consistent across all samples. Subsequently, the
   core analysis identified 4,131 significant DEGs, including 2,128
   down-regulated and 2,003 up-regulated genes. Figure [61]2B presents a
   volcano plot highlighting the top DEGs based on log2 fold change
   (log2FC). The complete list of identified DEGs, including their names
   and values, is provided in Table [62]S1 of the supplementary file. In
   the next step, a PPI network with 2,166 nodes and 11,485 edges was
   constructed and analyzed. Figure [63]2C depicts the top 30 DEGs in the
   network based on degree centrality. To elucidate the principal
   biological pathways involving the DEGs, Reactome pathway enrichment
   analysis was performed, with the main terms shown in Fig. [64]2D.

Fig. 2.

   [65]Fig. 2
   [66]Open in a new tab

   Data preprocessing and analysis: (A) The PCA plot illustrates the
   differences and similarities between PDAC and control samples. (B) The
   volcano plot displays the top 10 up- and down-regulated DEGs in the
   analyzed dataset. (C) The constructed PPI network of PDAC DEGs
   highlights the top 30 DEGs based on degree centrality. (D) Reactome
   pathway enrichment analysis results for all PDAC DEGs.

Co-expression analysis

   The results of the co-expression analysis using the WGCNA algorithm,
   including scale independence and mean connectivity plots, cluster
   dendrogram, eigengene dendrogram, adjacency heatmap, network heatmap
   plot, module-trait relationship heatmap, and module membership versus
   gene significance plot for the blue module, are illustrated in a
   multi-panel figure (Fig. [67]3A-F). Based on the scale independence and
   mean connectivity results, a soft threshold of 12 was selected to
   approximate a scale-free topology (Fig. [68]3A). Hierarchical
   clustering and module merging identified seven distinct co-expression
   modules: Royal blue, blue, cyan, black, brown, light green, and
   turquoise (Fig. [69]3B). The clustering dendrogram and eigengene
   adjacency heatmap revealed the division of these modules into two
   clusters (Fig. [70]3C). The network heatmap plot illustrated the
   topological overlap adjacency among genes within the modules
   (Fig. [71]3D). The module-trait relationship plot indicated that genes
   in the blue module had the highest correlation with the disease state,
   leading to their selection for further analysis (Fig. [72]3E). Based on
   the module membership (kME) and gene significance (G.S.) values, 580
   DEGs were identified as the top DEGs in the blue module (Fig. [73]3F).
   The detailed names and values of the clustered DEGs within the
   co-expression modules are provided in Table [74]S2 of the supplementary
   file.

Fig. 3.

   [75]Fig. 3
   [76]Open in a new tab

   Co-expression network analysis (WGCNA): (A) Plots depicting the
   scale-free fit index and mean connectivity analysis for thresholds from
   1 to 20. (B) Cluster dendrogram illustrating DEG clusters (branches)
   and co-expressed modules (colored). (C) Eigengene dendrogram and
   eigengene adjacency heatmap, highlighting low (blue) and high (red)
   adjacencies among modules. (D) Network heatmap plot of selected genes,
   demonstrating module division accuracy. Each column and row correspond
   to a single gene, with progressive yellow indicating higher adjacencies
   among genes within modules.

GO enrichment and pathway analysis

   Gene ontology (GO) enrichment analysis was conducted for the
   co-expressed genes within each module, with the top biological process
   terms presented for each (Fig. [77]4A). Given its strong correlation
   with the disease, the blue module’s genes were subjected to more
   comprehensive enrichment analyses. Reactome pathway analysis revealed
   that genes in the blue module were predominantly associated with
   pathways such as “Signaling by Rho GTPases,” “Signaling by receptor
   tyrosine kinases,” “EPH-Ephrin signaling,” “Extracellular matrix
   organization,” “Hemostasis,” and “Immune system” (Fig. [78]4B). These
   findings were corroborated by the GO terms analysis, which highlighted
   key processes including “Rho protein signal transduction,” “Small
   GTPase-mediated signal transduction,” “Actin cytoskeleton
   organization,” and various immune system-related processes. Enriched GO
   terms for the blue module genes are detailed in Table [79]1.

Fig. 4.

   [80]Fig. 4
   [81]Open in a new tab

   Results of enrichment analysis: (A) Plots showing module membership and
   gene significance values for clustered DEGs within each module, with
   principal biological terms listed at the top of each plot. (B) Results
   of Reactome pathway analysis for DEGs in the blue module.

Table 1.

   Results of gene ontology analysis for the blue module genes.
   Ontology GO terms GO term ID Nr. genes % Associated genes Term P value
   corrected with Bonferroni step down
   BP Negative regulation of ERBB signaling pathway GO:1,901,185 10 31.25
   0.022547045
   Antigen processing and presentation of peptide antigen GO:0048002 16
   23.188406 0.005346598
   Regulation of Rho protein signal transduction GO:0035023 19 18.811882
   0.017618911
   Response to interferon-gamma GO:0034341 26 17.567568 0.001804067
   Regulation of cell-substrate adhesion GO:0010810 39 16.810345
   1.45064E-05
   Rho protein signal transduction GO:0007266 26 16.666666 0.004977302
   Actin cytoskeleton organization GO:0030036 97 12.548512 1.96237E-08
   Response to wounding GO:0009611 73 11.793215 0.000138442
   Small GTPase mediated signal transduction GO:0007264 65 11.711712
   0.000901925
   Blood vessel development GO:0001568 86 11.466666 2.62259E-05
   MF Cell adhesion molecule binding GO:0050839 78 13.04348 5.80E-08
   Cytoskeletal protein binding GO:0008092 108 10.24668 2.39E-05
   Enzyme binding GO:0019899 204 9.255898 1.15E-07
   Identical protein binding GO:0042802 196 9.032258 2.38E-06
   Metalloendopeptidase activity involved in amyloid precursor protein
   catabolic process GO:1,902,945 4 80 0.024879
   Negative regulation of catalytic activity GO:0043086 93 11.07143
   5.49E-06
   Protein domain-specific binding GO:0019904 74 10.10929 0.006617
   Regulation of hydrolase activity GO:0051336 113 10.38603 5.54E-06
   CC Arp2/3 protein complex GO:0005885 7 70 8.23E-05
   MHC protein complex GO:0042611 8 28.571428 0.046645951
   Immunological synapse GO:0001772 13 27.659575 6.80E-04
   Phagocytic vesicle GO:0045335 25 16.891891 5.98E-04
   Cell-substrate junction GO:0030055 78 16.595745 2.87E-14
   Lamellipodium GO:0030027 33 14.537445 7.72E-04
   Ruffle GO:0001726 27 13.432836 0.018817795
   Membrane raft GO:0045121 47 12.912087 1.46E-04
   Collagen-containing extracellular matrix GO:0062023 56 12.043011
   1.33E-04
   Actin cytoskeleton GO:0015629 62 11.63227 1.09E-04
   [82]Open in a new tab

Hub gene identification via both the Trader and WGCNA algorithms

   In this stage, we employed two algorithms with different approaches to
   identify several candidate hub genes for pancreatic adenocarcinoma. The
   first approach utilized the PPI network, where the Trader algorithm
   identified the top 30 DEGs. The second approach was based on
   co-expression network analysis, with WGCNA highlighting the top
   co-expressed DEGs in the disease’s most correlated module. The
   intersection of these two lists revealed nine common DEGs: FYN, MAPK3,
   CDK2, SNRPG, GNAQ, MAP1LC3B, LPCAT4, PAK1, and FBN1 (Table [83]2)
   (Fig. [84]5A). The complete lists of top DEGs identified by both
   strategies are provided in Table [85]S3 of the supplementary file.

Table 2.

   The identified hub genes by both the WGCNA and Trader algorithms.
   Gene symbol Gene description Role in cancer pathogenesis Proposed
   (approved) drug/s Top 10% in the PPI network (degree centrality)
   FYN Tyrosine-protein kinase

   Reduction of FYN blocked cell proliferation, migration, and invasion in
   the MDA-MB-231 cells^[86]23.

   Inhibition of FYN in mouse glioma models delays tumor initiation and
   progression^[87]24

   DASATINIB

   PAZOPANIB
   YES
   MAPK3 MAP kinase-activated protein kinase 3 Targeting MAPK3 by miR-143
   inhibited the proliferation of the breast cancer cells^[88]25

   ETOPOSIDE

   CYCLOPHOSPHAMIDE

   SORAFENIB
   YES
   CDK2 Cyclin-dependent kinase 2 CDK2 and CDK4/6 inhibition neutralizes
   palbociclib resistance in breast cancer^[89]26

   PACLITAXEL

   CARBOPLATIN

   ERIBULIN

   DEXAMETHASONE
   YES
   SNRPG Small nuclear ribonucleoprotein polypeptide G SNRPG
   downregulation results in cell cycle arrest and sensitizing human
   glioblastoma cells to temozolomide^[90]27 - YES
   GNAQ Guanine nucleotide-binding protein G(q) subunit alpha GNAQ
   knockdown results in a blockage in the cell cycle and disturb the
   proliferation of gastric cancer cells^[91]28 EVEROLIMUS YES
   PAK1 Serine/threonine-protein kinase PAK1 inhibition suppresses
   pancreatic cancer through the down-regulation of PD-L1^[92]29 - YES
   LPCAT4 Lysophospholipid acyltransferase LPCAT4 is linked to intestinal
   stem cell proliferation and tumorigenesis^[93]30 - No
   MAP1LC3B Microtubule-associated protein 1 light chain 3B MAP1LC3B
   up-regulation is linked to breast cancer and melanoma
   progression^[94]31,[95]32 - No
   FBN1 Fibrillin-1

   FBN1 knock-down inhibited xenograft (papillary thyroid) tumor formation
   in vivo^[96]33

   FBN1 knockdown inhibits cell migration and invasion in ovarian
   cancer^[97]34
   - No
   [98]Open in a new tab

Fig. 5.

   [99]Fig. 5
   [100]Open in a new tab

   Hub gene identification and expression validation: (A) Left: PPI
   network of all PDAC-related DEGs and the top 30 DEGs identified by
   Trader; Right: Co-expression network of clustered DEGs in the blue
   module; Center: Intersection of top DEGs identified by both methods.
   (B) Validation of the expression profiles of identified hub genes; all
   hub genes exhibited the expected up-regulation in pancreatic
   adenocarcinoma (PAAD) samples.

Hub gene expressional validation, and immune cell infiltration study

   The expression analysis of the identified hub genes, performed using
   the Gene Expression Profiling Interactive Analysis (GEPIA) platform,
   confirmed their up-regulation in pancreatic samples (Fig. [101]5B),
   validating the expression analysis conducted in this study. We then
   examined the protein expression of these hub genes using two databases.
   The Human Protein Atlas (HPA) database, which maps human proteins
   across cells, tissues, and organs using various omics methods, was used
   to evaluate protein expression in PDAC and normal tissues. The
   immunohistochemistry (IHC) data revealed significant up-regulation of
   five hub genes (MAPK3, CDK2, GNAQ, MAP1LC3B, and LPCAT4) in PDAC
   samples. However, IHC data for the remaining hub genes (FYN, FBN1,
   SNRPG, and PAK1) were unavailable, necessitating further investigation.
   Proteomics data from University of Alabama at Birmingham Cancer Data
   Analysis Portal (UALCAN), which included 137 PDAC and 74 normal
   samples, also supported the up-regulation of seven hub genes (FYN,
   MAPK3, CDK2, SNRPG, MAP1LC3B, LPCAT4, and FBN1) in PDAC samples
   (Supplementary Fig. 1). Additionally, Tumor Immune Microenvironment
   Estimation Resource (TIMER) database analysis assessed the correlation
   between hub gene expression and immune cell infiltration. The data
   indicated a positive correlation between several hub genes (MAP1LC3B,
   GNAQ, FYN, and FBN1) and the infiltration of B-cells, CD8 + T cells,
   CD4 + T cells, macrophages, neutrophils, and dendritic cells
   (Fig. [102]6). Using the Drug–Gene Interaction Database (DGIdb), we
   identified several approved drugs targeting the hub genes. Among the
   nine hub genes, FYN, MAPK3, CDK2, and GNAQ were associated with
   multiple drugs, including Dasatinib, Pazopanib, Etoposide,
   Cyclophosphamide, Sorafenib, Paclitaxel, Carboplatin, Eribulin,
   Dexamethasone, and Everolimus (Table [103]2). A literature review also
   highlighted the involvement of nearly all identified hub genes in the
   progression of various cancers (Table [104]2).

Fig. 6.

   [105]Fig. 6
   [106]Open in a new tab

   The plots illustrate the correlation between hub gene expression levels
   and immune cell infiltration, as analyzed using TIMER. Each dot
   represents a sample from the TCGA dataset.

Discussion

   Recent investigations have increasingly focused on targeted therapy,
   leading to significant advances in identifying novel therapeutic
   targets in cancer^[107]35–[108]37. Beyond traditional biological
   approaches, systems biology has recently provided more comprehensive
   and cost-effective strategies for exploring cancer-related biological
   systems, understanding pathogenic mechanisms, and discovering new
   therapeutic targets^[109]38,[110]39.

   Biological systems are typically represented as networks, consisting of
   binary interactions among various entities^[111]40,[112]41. The nodes
   within these networks can include differentially expressed biological
   entities, such as genes, proteins, metabolites, or non-coding RNAs, in
   specific conditions like cancer. Constructing, analyzing, and
   interpreting these networks can deepen our understanding of cancer
   biology and help identify critical factors involved in its
   pathogenesis^[113]42–[114]44. In the realm of biological network
   analysis, numerous algorithms have been developed and
   applied^[115]45–[116]47. Among these, WGCNA stands out as a prominent
   tool for co-expression network analysis of high-throughput expression
   data^[117]48. WGCNA can cluster co-expressed genes into distinct
   modules, identify the most disease-correlated module, and subsequently
   pinpoint key genes within that module^[118]49. To date, this algorithm
   has facilitated the identification of several potential therapeutic
   targets across various disease conditions^[119]49−[120]54.

   Topological analysis of PPI networks is commonly used to identify key
   nodes as potential therapeutic targets^[121]55,[122]56. This strategy
   relies on centrality metrics like degree, betweenness, closeness, and
   eigenvector to pinpoint significant nodes based on their network
   interactions^[123]57,[124]58. However, a major limitation is that some
   crucial nodes with low centrality may be overlooked^[125]59,[126]60.
   Accordingly, more accurate and reliable strategies are needed in
   analyzing biological networks to identify key elements. To address
   this, our team developed the Traderalgorithm, which identifies key
   nodes by detecting the minimal set whose removal results in the maximum
   number of sub-networks. This method provides a more reliable approach
   for identifying potential therapeutic targets^[127]21.

   In this integrative systems biology study, we used the WGCNA and
   Traderalgorithms to analyze a PDAC expression dataset, identifying
   potential therapeutic targets and exploring underlying molecular
   pathways. Enrichment analysis focused on DEGs in the most correlated
   module, the blue module. The results highlight the significant
   involvement of Rho GTPases signaling, receptor tyrosine kinase
   signaling, and immune system pathways in PDAC tumors. Immune-related
   pathways have previously been implicated in PDAC pathogenesis^[128]61.
   Immune cells such as CD8 + T cells, CD4 + T cells, dendritic cells, and
   natural killer cells are active in the PDAC microenvironment, where
   they can inhibit tumor growth and progression^[129]62. However, PDAC is
   characterized by a highly immunosuppressive microenvironment that
   employs various mechanisms to evade immune responses^[130]63,[131]64.
   Key factors in this evasion include dense desmoplastic stroma,
   cancer-associated fibroblasts (CAFs), programmed death-ligand 1
   (PD-L1), and the JAK/STAT signaling pathway^[132]65. Targeting these
   molecules and pathways may enhance the efficacy of immunotherapies for
   pancreatic cancer^[133]66.

   Signaling by Rho GTPases was another enriched term among the clustered
   DEGs in the blue module. This finding is relevant to the
   therapy-resistant nature of PDAC, as Rho GTPase signaling has been
   linked to the promotion of oncogenic microRNAs and pro-survival
   molecules in various cancers. Under normal conditions, Rho GTPases
   regulate cytoskeleton dynamics, cell cycle, survival, adhesion, and
   migration^[134]67,[135]68. In cancer, they are crucial for cell
   invasion, metastasis, and extravasation^[136]68. Targeting Rho GTPase
   signaling might limit these invasive processes. In PDAC, Rho GTPases
   act as central downstream regulators of K-Ras mutations, present in
   over 90% of cases^[137]69. Previous studies have demonstrated the
   association between Rho GTPase signaling and PDAC
   invasiveness^[138]70–[139]73. For example, RIO Kinase 3 (RIOK3) and p21
   Activated Kinase 4 (PAK4) promote pancreatic ductal cell motility and
   invasion through interactions with Rho GTPases^[140]70. Recent research
   also highlighted the role of Rho GTPases in metastatic behavior, with
   silencing cdc42 and Rac1 reducing tumor cell motility under
   stress^[141]74. Despite these insights, only a few Rho GTPase
   inhibitors are currently in clinical trials for cancer
   treatment^[142]75. Our findings support the potential of targeting Rho
   GTPases in PDAC therapy.

   In the subsequent phase of this study, nine hub genes were identified
   by integrating the top DEGs from both WGCNA and Traderanalyses. These
   hub genes—FYN, MAPK3, CDK2, SNRPG, GNAQ, PAK1, LPCAT4, MAP1LC3B, and
   FBN1—represent the intersection of DEG lists from both algorithms.
   These identified hubs warrant further investigation due to their
   potential as therapeutic targets in PDAC. Notably, PAK1, a
   serine/threonine-protein kinase, and MAPK3, a member of the
   mitogen-activated protein kinase (MAPK) family, are prominent effectors
   of Rho GTPase signaling networks. PAK1, in particular, is known to
   interact with various Rho GTPases, including Rac1, Rac2, Rac3, and
   Cdc42, highlighting its central role in these pathways^[143]76–[144]78.
   In addition to its involvement with Rho GTPases and its critical role
   in cytoskeletal dynamics, PAK1 is implicated in various oncogenic
   pathways^[145]79,[146]80. MAPK3 (ERK1), another kinase among the
   identified hub genes, plays a pivotal role in regulating cell
   proliferation, differentiation, and survival. Targeting MAPK3 in
   cancers such as breast cancer^[147]25, prostate cancer^[148]81,
   multiple myeloma^[149]82and glioma^[150]83 has demonstrated promising
   therapeutic outcomes.

   In addition to FYN, MAPK3, CDK2, and PAK1, which function as kinases,
   the identified hub genes include FBN1, a key protein in extracellular
   matrix regulation; MAP1LC3B, involved in autophagic vacuole formation;
   LPCAT4, an acyltransferase; SNRPG, a core component of the spliceosome;
   and GNAQ, a transmembrane signaling transducer. Detailed information on
   these hub genes and their potential as therapeutic targets is provided
   in Table [151]2.

   Further analysis of the hub genes revealed a positive correlation
   between several hub genes and the infiltration of immune cells.
   Tumor-infiltrating immune cells are well-established components of the
   tumor microenvironment and are crucial for tumor development, invasion,
   and metastasis^[152]84,[153]85. Reports indicate that these immune
   cells can both enhance and suppress antitumor immunity within the tumor
   microenvironment^[154]84. Thus, it is critical to assess the status of
   immune cell infiltration in this environment. The results provided
   evidence of the involvement of MAP1LC3B, GNAQ, FYN, and FBN1 in the
   infiltration of immune cells, including B-cells, CD8 + T cells, CD4 + T
   cells, macrophages, neutrophils, and dendritic cells. These findings
   suggest the potential role of these genes in shaping the PDAC immune
   microenvironment.

   Of course, the current study has some limitations. The primary
   limitation is the small sample size of the evaluated dataset.
   Conducting a meta-analysis on multiple PDAC datasets might be a
   potential approach. However, it is important to consider the
   differences among gene expression platforms and the quality of the
   datasets prior to performing a meta-analysis. Secondly, our results are
   solely based on gene expression data. To enhance the practicality of
   our method, it would be beneficial to apply our strategy to
   PDAC-related RNA-Seq and proteomics data. Additionally, further
   molecular experiments are necessary to validate our findings.

   In conclusion, applying a comprehensive protocol to investigate both
   co-expression and PPI networks of PDAC DEGs revealed the therapeutic
   potential of Rho GTPase signaling members and several identified hub
   genes. The novel Trader algorithm, alongside WGCNA, was used to analyze
   the PDAC-related PPI network. This integrative approach, combining both
   strategies and particularly the metaheuristic method, could inspire new
   data analysis methodologies. The study demonstrates that these hub
   genes, involved in multiple biological pathways, may be promising
   therapeutic targets to hinder PDAC progression.

Methods

Dataset analysis and PPI network construction

   The [155]GSE132956 dataset, consisting of 10 PDAC samples and 5 normal
   tissue samples analyzed on the Affymetrix HTA2.0 Array ([156]GPL17586),
   was selected for analysis and retrieved from the Gene Expression
   Omnibus ([157]https://www.ncbi.nlm.nih.gov/geo/). The selection of
   [158]GSE132956 was based on its quality and the reasonable sample size,
   which are important factors for co-expression analysis. PCA was
   utilized before the primary analysis to assess the dataset’s quality
   and detect potential outliers. The Network Analyst online tool
   ([159]http://www.networkanalyst.ca, Version 3.0) was used for quantile
   normalization, removal of probes related to multiple genes, and
   performing the analysis using the Limma package. The identified DEGs
   (false discovery rate (FDR) cutoff < 0.049) were further investigated,
   including WGCNA and PPI network construction. All identified DEGs were
   used as seeds for the PPI network in the STRING database (Version 11.5)
   with a confidence score of 0.7, indicating high confidence^[160]86. The
   confidence score in STRING reflects the reliability of proposed protein
   interactions based on the quality of supporting evidence. The network
   was visualized and analyzed using Cytoscape software (Version
   3.9.1)^[161]87, and the top DEGs with high centrality were identified.

Co-expression network construction and analysis

   The WGCNA algorithm in the R programming environment (Version 4.2.2)
   was used to construct co-expressed networks and classify DEGs into
   different modules. Initially, a matrix containing the intensity values
   of identified DEGs was input into the algorithm. After sample
   clustering, the “pickSoftThreshold” function was used to determine the
   optimal soft threshold power (from 1 to 20), with a value of 12
   selected based on the degree of independence and mean connectivity.
   Subsequent steps included constructing the adjacency matrix, estimating
   the topological overlap matrix (TOM), constructing modules, and
   performing dynamic branch cutting with a merging threshold of 0.15 to
   define co-expressed modules. A TOM dissimilarity-based heatmap was
   created to illustrate gene adjacencies and verify module division
   accuracy. Finally, the “labeledHeatmap” function was used to identify
   the modules most correlated with cancer by comparing sample groups as
   either control or cancer.

Analyzing the PPI network using Trader optimization algorithm

   The Trader algorithm was utilized to identify driver nodes in the
   constructed PPI network associated with pancreatic cancer. This
   algorithm randomly creates several potential answers and divides them
   into groups by chance. Every candidate solution (C.S.) shows some
   proteins as the pathogenesis of pancreatic cancer. However, the
   selected proteins are not presumably effective in inducing cancer.
   Therefore, the algorithm tries to change the C.S.s and generate better
   potential answers based on three operators. The first operator, called
   distributing and formulated in Eq. [162]1, chooses several variables of
   the best of a group and passes them to the other C.S.s of the same
   group.
   graphic file with name 41598_2024_74252_Equ1_HTML.gif 1

   R, i, and m represent a randomly selected subset of the variables, the
   i^th C.S., and the m^th C.S. (the best of the group) of a group,
   respectively.

   The second operator, named retailing and coded in Eq. [163]2, aims to
   induce small changes in a C.S.
   graphic file with name 41598_2024_74252_Equ2_HTML.gif 2

   R, K, and rand show a randomly selected subset of the variables, a
   random floating-point value between − 1 and 1, and a function
   generating a random floating-point value between the determined range,
   respectively. The third operator called Importing-exporting/I.E. and
   formulated in Eq. [164]3 transfers values of the best potential
   solutions of the groups to the variables of a C.S. of interest.
   graphic file with name 41598_2024_74252_Equ3_HTML.gif 3

   Like Eq. [165]1 and Eq. [166]2, R indicates a randomly selected subset
   of the variables. Also, Inline graphic and Inline graphic show the jth
   variables of the importer and exporter C.S.s (the best of their related
   groups), respectively. The fitness/score function was called after
   applying the operators mentioned above. In identifying candidate driver
   nodes of the pancreatic cancer-related PPI network, the score function
   removes the selected proteins/nodes and their edges/interactions from
   the networks and counts the total number of disjoint networks using the
   depth first search (DFS) algorithm. The higher generated subnetworks
   mean the higher worthiness of the selected proteins. Of note, in the
   Trader algorithm, the received changes will be ignored if they have no
   improvement in the C.S. score. The proposed algorithm was implemented
   in MATLAB programming language (MATLAB R2022a).

Hub gene selection

   Candidate hub genes were selected by analyzing both the constructed PPI
   network and co-expression networks. Initially, genes from the most
   disease-correlated module were extracted based on module membership
   (kME > 0.8) and gene significance (G.S. > 0.6) values. Concurrently,
   the edge list of the PPI network was analyzed using Trader to identify
   the top 30 DEGs. Trader detects nodes whose removal results in the
   maximum number of disconnected sub-networks. Finally, the intersection
   of genes identified by both Traderand WGCNA was considered as hub genes
   in the disease condition^[167]21.

Functional enrichment analysis

   To understand the underlying molecular mechanisms and pathways in
   pancreatic cancer, gene ontology (GO) and Reactome pathway enrichment
   analyses were conducted on the genes from the module of interest, as
   well as on all DEGs from other co-expressed modules. The CluGO module
   (Version 2.5.9) in Cytoscape software (Version 3.9.1) was used for the
   enrichment analysis, with a significance threshold set at
   P < 0.05^[168]88.

Hub gene validation

   In this step, the identified hub genes were subjected to various in
   silico experiments to determine their functions and roles in pancreatic
   cancer. GEPIA, which integrates data from The Cancer Genome Atlas
   (TCGA) and the Genotype-Tissue Expression (GTEx) project, was used to
   validate the expression of the hub genes in pancreatic versus normal
   samples. This validation involved 179 paired tumor and normal samples.
   Additionally, to confirm the protein expression of the hub genes in
   PDAC samples, immunohistochemistry (IHC) data from pancreatic
   adenocarcinoma and corresponding normal tissues were obtained from the
   Human Protein Atlas (HPA) database (Version 22.0)^[169]89.

   UALCAN ([170]http://ualcan.path.uab.edu/index.html), which provides
   proteomics data from the Clinical Proteomic Tumor Analysis Consortium
   (CPTAC)^[171]90, was used to examine protein expression patterns in
   tumor and control tissues. TIMER (Version 2.0), a platform for
   analyzing immune cell infiltration in various cancers, was employed to
   assess whether the expression levels of the hub genes correlate with
   immune cell infiltration in pancreatic cancer^[172]91. Additionally,
   DGIdb (Version 4.0) was utilized to identify potential drug candidates
   for targeting the identified hub genes in pancreatic cancer^[173]92.

Electronic supplementary material

   Below is the link to the electronic supplementary material.
   [174]Supplementary Material 1^ (12.5KB, docx)
   [175]Supplementary Material 3^ (8.8MB, png)
   [176]Supplementary Material 3^ (1.5MB, xlsx)

Acknowledgements