Abstract Hepatocellular carcinoma (HCC) is one of the most prevalent malignant tumors globally, significantly affecting liver functions, thus necessitating the identification of biomarkers and effective therapeutics to improve HCC-based disabilities. This study aimed to identify prognostic biomarkers, signaling cascades, and candidate drugs for the treatment of HCC through integrated bioinformatics approaches such as functional enrichment analysis, survival analysis, molecular docking, and simulation. Differential expression and functional enrichment analyses revealed 176 common differentially expressed genes from two microarray datasets, [46]GSE29721 and [47]GSE49515, significantly involved in HCC development and progression. Topological analyses revealed 12 hub genes exhibiting elevated expression in patients with higher tumor stages and grades. Survival analyses indicated that 11 hub genes (CCNB1, AURKA, RACGAP1, CEP55, SMC4, RRM2, PRC1, CKAP2, SMC2, UHRF1, and FANCI) and three transcription factors (E2F1, CREB1, and NFYA) are strongly linked to poor patient survival. Finally, molecular docking and simulation identified seven candidate drugs with stable complexes to their target proteins: tozasertib (−9.8 kcal/mol), tamatinib (−9.6 kcal/mol), ilorasertib (−9.5 kcal/mol), hesperidin (−9.5 kcal/mol), PF−562271 (−9.3 kcal/mol), coumestrol (−8.4 kcal/mol), and clofarabine (−7.7 kcal/mol). These findings suggest that the identified hub genes and TFs could serve as valuable prognostic biomarkers and therapeutic targets for HCC-based disabilities. 1. Introduction Liver cancer has emerged as the fourth-leading cause of cancer-related deaths worldwide and the sixth-most frequently diagnosed malignancy [[48]1]. The incidence of liver cancer in men is about two times higher compared to women [[49]2]. Several risk factors have been implicated in the pathogenesis of this disease. The most prevalent risk factors for liver cancer are chronic hepatitis B virus (HBV) [[50]3], hepatitis C virus (HCV) [[51]4], aflatoxin-contaminated food [[52]5], type 2 diabetes [[53]6], heavy alcohol consumption [[54]7], smoking [[55]8], and obesity [[56]9]. There are three commonly recognized types of liver cancer, such as hepatocellular carcinoma (HCC), characterized by the malignant transformation of hepatocytes, accounting for approximately 80% of the global morbidity rate; secondly, intrahepatic cholangiocarcinoma (ICC), arising from biliary duct epithelium tissue; and finally, a rare type of hepatocellular carcinoma called combined HCC-ICC (cHCC-ICC), contributing to fewer cases worldwide [[57]10,[58]11,[59]12,[60]13]. Hepatocellular carcinoma (HCC) constitutes almost 80% of all primary liver cancer cases, posing a significant global health burden. Therefore, to ensure early diagnosis and the prediction of prognosis for patients with HCC, the identification of liver cancer biomarkers can be crucial for improving outcomes. Due to the absence of early clinical manifestations, approximately three-fourths of primary liver cancer instances are detected and diagnosed in the advanced stage, failing to accommodate radical surgery and other treatment options, leading to poor prognosis and outcome with high recurrence and mortality rate [[61]14]. Despite the identification of various risk factors as contributing factors to HCC, the molecular mechanism of pathogenesis has been poorly understood. Several studies have found alterations in a variety of vulnerable genes, including TP53, CTNNB1, AXIN1, cyclin D1 (CCND1), epidermal growth factor receptor (EGFR), c-myc, and Ras. These genes exhibit aberrant expression and mutations and are, therefore, implicated in the genesis and advancement of HCC [[62]15,[63]16,[64]17]. Down-regulated ASPP1 and ASPP2 genes have been found to play roles in the development of HCC [[65]18]. However, despite various studies aimed at identifying the underlying mechanisms, the current understanding is still insufficient. Therefore, further investigation is of profound importance to unveil the underlying mechanisms of HCC initiation and progression. To date, for early-stage HCC, different interventions such as surgical resection, liver transplantation, radiofrequency ablation, and various drugs such as cisplatin and doxorubicin by transarterial chemoembolization (TACE) have been used, but the efficacy of these treatments is still insufficient due to recurrence, complex risk factors, and drug resistance [[66]19,[67]20]. Several targeted therapies, including sorafenib, lenvatinib, regorafenib, cabozantinib, and pembrolizumab, have demonstrated effectiveness in diverse clinical trials either alone or in combination [[68]21]. However, the efficacy of these drugs is compromised due to the adverse effects they entail. Hence, drug repositioning or repurposing through bioinformatic tools can be a cost-effective and time-efficient approach to drug development. This approach involves exploring approved or investigational drugs that can reverse cancer-related abnormal gene expression and can be reused for HCC, thereby expanding the array of treatment alternatives [[69]22]. Microarray technology has been used for nearly a decade to discover molecular signatures, biomarkers, and regulatory pathways in multiple malignancies, including HCC, aiding in the prediction of progression and prognosis [[70]23]. However, the results obtained using different data processing and technological methods and small sample sizes are often contradictory, inconsistent, and insufficient [[71]24,[72]25]. Meanwhile, integrated bioinformatics analysis can process a large number of datasets to extract valuable biological insights, enabling the discovery of molecular mechanisms underlying pathogenesis, progression, and reliable biomarkers [[73]26]. This study aimed to explore the core DEGs, prospective biomarkers, therapeutic targets, potential drug agents, and enriched pathways in HCC by using an integrated bioinformatics approach. Herein, we focused on screening hub genes by establishing a functional protein–protein interaction (PPI) network alongside regulatory biomarkers (i.e., TFs and miRNA) and potential therapeutic agents via drug repositioning techniques. By elucidating the enriched pathways associated with the core DEGs, the study intended to enhance our understanding of the molecular mechanisms driving HCC development and progression. 2. Materials and Methods 2.1. Differentially Expressed Gene Screening from Microarray Datasets Microarray gene expression profile datasets were searched in the NCBI-GEO database [[74]27]. Then, two datasets with accession numbers [75]GSE29721 and [76]GSE49515 were chosen from two distinguished experiments. The former comprised a total of 20 samples from 11 patients, of which 10 were micro-dissected HCC tissue and the remaining 10 were normal adjacent liver tissue derived from the Chinese National Human Genome Center in Shanghai, China [[77]28]. [78]GSE49515 studied 24 samples of PBMC collected from normal healthy, hepatocellular carcinoma (HCC), pancreatic, and gastric cancer patients to establish aberrant gene expression analysis using Affymetrix gene arrays in the National Cancer Centre, Singapore [[79]29]. We used 10 healthy PBMC samples as controls and 10 PBMC samples of HCC patients as cases to find out differentially expressed genes. Then, datasets were analyzed to compare the expression level of genes between HCC and normal tissues in the GEO2R web platform [[80]27], where differential expression analyses were performed exercising the limma (linear models for microarray analysis) package [[81]30]. Here, the expression matrix was log2-transformed, and Benjamini and Hochberg corrections were applied to control the false positive rate. A nominal p-value < 0.05 and |log2FC| > 1 were applied as the cutoff criteria, and the resultant genes that satisfied both the cutoffs were considered statistically significant DEGs. To determine common DEGs between the datasets, we used the Jvenn web application [[82]31]. 2.2. Gene Ontology and Pathway Enrichment Analysis of CDEGs To figure out the molecular action and cellular action of the differentially expressed genes, functional and pathway enrichment analyses can be delineated. Therefore, the Database for Annotation, Visualization, and Integrated Discovery (DAVID; [83]http://david.ncifcrf.gov) (accessed on 7 January 2024) [[84]32], an open web-based biological information repository, was employed to recognize gene ontology (GO) [[85]33] terms that include the molecular function (MF), biological process (BP), and cellular component (CC) of the CDEGs. MF indicates molecular functions of genes; BP refers to processes about the activities of multiple genes; CC annotates the place within the cell where the genes are active. Pathway-based analysis to explore the mechanisms of complex diseases has introduced a new method called the topological-based approach that integrates both molecular and structural data of the pathway from a biological database to understand how and where genes interact with one another in a pathway [[86]34]. Here, we considered three knowledge bases—the Kyoto Encyclopedia of Genes and Genomes (KEGG) [[87]35], Reactome [[88]36], and BioCarta ([89]http://www.biocarta.com/) (accessed on 7 January 2024) in the Enrichr [[90]37] package, a gene set enrichment analysis tool—that work in topological-based approaches to explore signaling pathways of differentially expressed genes. GO terms and pathways were visualized in a bubble plot using the SRplot online tool [[91]38]. For all enrichment analyses, a p-value < 0.05 was considered statistically significant. 2.3. Identification of Hub Genes Identification of essential proteins by protein–protein interaction network topologies using high-throughput technology can be performed by analyzing the centrality of genes. Centrality measures in a PPI network provide insights into the importance of proteins within the network. Here, we exploited the STRING (Search Tool for the Retrieval of Interacting Genes) [[92]39] database to elucidate putative PPI networks of CDEGS, and the resulting PPI networks were depicted in Cytoscape software version 3.10.1 [[93]40], using the highest confidence score of 0.9. The hub genes were subsequently identified using the Cytoscape plugin cytoHubba ([94]http://apps.cytoscape.org/apps/cytohubba) (accessed on 8 Feburary 2024), which permits 11 specific ranking methods to determine hub genes from the PPI network. These methods are categorized into two types of algorithms: local-based methods and global-based methods [[95]41]. In this study, we specifically employed two local-based methods—degree centrality (DC) and maximal clique centrality (MCC)) and two global-based methods—closeness and edge percolated component (EPC), to enhance the reliability of our findings. Afterwards, DAVID was used to carry out the KEGG and GO analyses for the hub genes. 2.4. Identification of Transcriptional and Post-Transcriptional Regulators Dysregulated gene expression can be a contributing factor in several human disorders. Two major classes of gene expression regulators are transcription factors (TFs) and microRNAs (miRNAs). TFs regulate the transcriptional level of differentially expressed genes (DEGs) by binding in promoter regions, influencing the synthesis of RNA from DNA. In contrast, miRNAs control the post-transcriptional level of gene expression by binding to an area of 3′ untranslated regions of target genes, thereby negatively regulating the expression of the associated gene [[96]42]. We used the JASPAR [[97]43] and TarBase [[98]44] databases to identify regulatory TFs and miRNA, respectively. We availed of the Network Analyst [[99]45] web tool for using both the database and to visualize topological networks. 2.5. Validation of Hub Genes and TFs by mRNA Expression Level UALCAN [[100]46] is an extensive and sophisticated website housing cancer OMICS data designed to facilitate in-depth analysis of cancer transcriptomic data through the Cancer Genome Atlas (TCGA) Program. The UALCAN online database comprises 50 normal tissues and 371 HCC tissues from the TCGA database to compare mRNA expression levels. We utilized the UALCAN database to validate the expression level of the hub genes. 2.6. Clinicopathological Characteristics of Hub Genes The clinicopathological characteristics mainly encompass factors such as survival analysis, tumor stage, and tumor grade. The Kaplan–Meier (KM) plotter online tool [[101]47], which is a vast database comprising over 35 thousand samples from 21 distinct cancers, was employed to conduct the survival analysis of hub genes. This sophisticated and in-depth analysis contributes to the identification of robust prognostic biomarkers [[102]48]. In this study, overall survival (OS) and progression-free survival (PFS) were used to find prognostic biomarkers in HCC. The UALCAN database was used to incorporate characteristic changes in expression level with tumor stages and grades. 2.7. Identification of Novel Candidate Drugs for HCC In our quest to identify potential drugs that would target highly expressed hub genes, we employed drug-gene interactions through online drug repository web tools DSigDB [[103]49] and DGIdb [[104]50]. The DSigDB was accessed through Enrichr to predict candidate drugs. We also undertook interactions between hub genes and chemicals by the Drug–Gene Interaction Database (DGIdb), which combines a bunch of databases to predict druggable genes. 2.8. Molecular Docking and Simulation of Candidate Drugs To strengthen the suggestion of repositioning drugs for HCC, we employed molecular docking and simulation techniques to scrutinize the interactions between drug agents and specific molecular targets. The 3D crystal structure of AURKA (PDB ID: 5DT0), CCNB1 (PDB ID: 6GU2), RAGAP1 (PDB ID: 2OVJ), and RRM2 (PDB ID: 3OLJ) was collected in PDB format from the RCSB protein data bank [[105]51], and subsequently unnecessary chains, water molecules, and co-crystalized ligands were removed using PyMOL software (Version 1.7.4). After that, the energy of the target proteins was reduced using the SwissPDB viewer (Version 4.1.0) [[106]52]. Finally, molecular docking analyses were performed using PyRx software (version 0.8). The 3D structures of all the ligands were retrieved from the PubChem database (except cisplatin, barasertib, and paclitaxel) [[107]53]. Amber potential was applied to detect the most stable conformer with the lowest energy using Gabedit software (version 2.5.0) [[108]54]. Initially, all thirty energy-minimized ligands underwent docking against their respective target macromolecules. Later, using Gaussian 09 W Revision D.01, selected eight ligands were geometrically optimized employing density functional theory (DFT) [[109]55] with the B3LYP method and 6-31G basis set to obtain more reliable binding conformation. The Discovery Studio Visualizer 2021 software was used to calculate and visualize non-bonding interactions. Molecular dynamics simulation is widely used to examine the stability of protein-ligand complexes and to validate the results of molecular docking. In this context, normal mode analysis (NMA) dynamics simulation was conducted for the Cα atoms of the receptor proteins via the iMODS ([110]https://imods.iqfr.csic.es/) (accessed on 13 July 2024) server [[111]56]. 3. Results 3.1. Transcriptomic Molecular Signatures Identification of HCC Differential gene expression analyses of two HCC datasets ([112]GSE29721 and [113]GSE49515) elucidated a total of 1449 and 1719 DEGs, respectively, after setting off the cut-off criteria. In [114]GSE29721, analyzed between normal liver tissue and HCC tissue, 837 were up-regulated and 612 were down-regulated genes, as shown in the volcano plot ([115]Figure 1a). In [116]GSE49515, screened between PBMC samples from healthy patients and HCC patients, 1083 were up-regulated and 658 were down-regulated genes ([117]Figure 1b). A total of 176 common DEGs (CDEGs) between the datasets ([118]Table S1) were estimated and shown using a Venn diagram ([119]Figure 1c). Figure 1. [120]Figure 1 [121]Open in a new tab Distribution of differentially expressed genes (a) volcano plot for [122]GSE29721; and (b) [123]GSE49515; red dots, blue dots, and black dots indicate up-regulated genes, downregulated genes, and not significantly expressed genes in HCC compared to normal tissue, respectively. (p < 0.05, |log2FC| > 1) (c) Venn diagram showing the overlapping CDEGs between the datasets. The up-regulated genes and down-regulated genes exhibited an overlap of 86 and 22 differentially expressed genes, respectively ([124]Figure S1). It was first verified whether the acquired CDEGs were connected to HCC using the DisGeNet database [[125]57]. The top 5 diseases, ranked by overlapping genes, breast carcinoma, malignant neoplasm of breast, neoplasm metastasis, carcinogenesis, and hepatocellular carcinoma were found (p value < 0.05 was considered significant). Therefore, gene-disease interaction analysis reveals involvement of identified CDEGs in various carcinomas, including liver carcinoma. 3.2. Functional and Pathway Enrichment Analysis GO enrichment analysis demonstrated that the top five biological processes (BP) of up-regulated DEGs were cellular response to DNA damage stimulus, DNA repair, DNA damage checkpoint, cellular response to ionizing radiation, and regulation of cyclin-dependent protein serine/threonine kinase activity ([126]Figure 2a). The top five CC (cellular component) included the nucleus, nucleoplasm, cytosol, centrosome, and condensed chromosome, and the top three significantly enriched MF (molecular function) included protein binding, RNA-binding, and pyrophosphatase activity ([127]Figure 2a). Figure 2. [128]Figure 2 [129]Open in a new tab Enrichment analysis of up-regulated CDEGs (a) the top 5 enriched GO terms of the BP, CC, and MF categories of up-regulated DEGs; (b) the molecular pathways enriched by up-regulated DEGs in HCC patients. In the pathway enrichment analysis, we used three different databases: KEGG, Reactome, and BioCarta, to reveal which pathways are altered by CDEGs. Up-regulated DEGs were significantly enriched in the 6 KEGG, 52 Reactome, and 5 BioCarta pathways. KEGG analysis showed that DEGs were primarily enriched in the p53 signaling pathway, cellular senescence, cell cycle, purine metabolism, and D-glutamine and D-glutamate metabolism ([130]Figure 2b). The top five pathways included by Reactome were transport and synthesis of PAPS, cell cycle and mitotic, resolution of sister chromatid cohesion, Golgi cisternae pericentriolar stack reorganization, and cell cycle ([131]Figure 2b). The roles of BRCA1, BRCA2, and ATR in cancer susceptibility, cyclins and cell cycle regulation, BRCA1-dependent Ub ligase activity, polyadenylation of mRNA, and Sonic Hedgehog receptor Ptc1 regulating cell cycle ([132]Figure 2b) were the most enriched pathways in BioCarta. Gene ontology and pathway enrichment analysis of down-regulated genes and hub genes are presented in [133]Figures S2 and S3, respectively. 3.3. Protein–Protein Interactions (PPIs) Analysis A protein–protein interaction network predicting physical and functional associations between proteins was established using the STRING database, hiding the disconnected genes ([134]Figure 3a). Using the four different ranking methods—MCC, degree, closeness, and EPC—of the cytoHubba algorithm, we estimated a total of 19 different top-ranked genes ([135]Figure S4), among which 12 genes were selected as hub genes from the intersection of the topological methods ([136]Figure 3b). These hub genes are found to play roles in multiple aspects of HCC, including tumorigenesis, progression, and metastasis ([137]Table 1). Figure 3. [138]Figure 3 [139]Open in a new tab Identification of hub genes through protein–protein interaction network (a) PPI network of CDEGs. The nodes in hexagonal shapes represent identified hub genes; (b) the Venn diagram shows the intersection of 12 hub genes from four cytoHubba ranking methods. Table 1. A summary of 12 hub proteins identified from protein–protein interaction analysis, describing their functions and affliction in HCC. Gene Description Role in HCC Ref. CCNB1 G2/mitotic-specific cyclin-B1 Promote HCC; suggested biomarker of HCC. [[140]58] SMC4 Structural maintenance of chromosomes protein 4 Contribute to the development and prognosis of tumors and prognostic markers of HCC. [[141]59] RACGAP1 Rac GTPase-activating protein 1 Promotes proliferation of hepatocellular carcinoma cells; identified as a prognosis marker for early HCC detection. [[142]60] AURKA Aurora kinase A Impelled proliferation, migration, and invasion of HCC cells. [[143]61] PRC1 Protein regulator of cytokinesis 1 Identified as a prognostic marker for early HCC detection. [[144]60] SMC3 Structural maintenance of chromosomes protein 3 Upregulation of the SMC3 gene has been found in HCC patients and proposed as a potential therapeutic target. [[145]62] CEP55 Centrosomal protein of 55 kDa Overexpressed and correlated with progression and poor prognosis of HCC patients; stimulates JAK2–STAT3–MMPs axis. [[146]63] RRM2 Ribonucleoside-diphosphate reductase subunit M2 Suggested prognostic biomarkers and a therapeutic target for HCC. [[147]64] CKAP2 Cytoskeleton-associated protein 2 Suggested novel diagnostic biomarker of HCC. [[148]65] SMC2 Structural maintenance of chromosomes protein 2 Overexpressed and linked to tumorigenesis and poor prognosis in HCC. [[149]62] UHRF1 E3 ubiquitin-protein ligase UHRF1 Involved in poor prognosis by promoting cell proliferation and metastasis in HCC. [[150]66] FANCI Fanconi anemia group I protein Suggested as a diagnostic, prognostic marker, and therapeutic target. [[151]67] [152]Open in a new tab 3.4. Gene Expression Regulators of CDEGs Transcription factors (TFs) and microRNAs (miRNAs) are regulators of gene expression and biological processes, thereby influencing events associated with these processes [[153]68,[154]69]. We examined the interaction networks of TFs and miRNAs with CDEGs and hubs separately and selected the top 10 based on a topological parameter called degree. Among them, we identified seven TFs—FOXC1, GATA2, NFIC, YY1, E2F1, NFYA, and CREB1 ([155]Figure S5)—and seven miRNAs—hsa-mir-1-3p, hsa-mir-124-3p, hsa-mir-16-5p, hsa-mir-34a-5p, hsa-mir-129-2-3p, hsa-mir-103a-3p, and hsa-mir-147a ([156]Figure S6)—as mutual regulatory components for both the CDEGs and hub genes. 3.5. mRNA Expression of Hub Genes and TFs in Patients with HCC The mRNA expression levels of the 12 hub genes were verified from the UALCAN online database. The mRNA expression level of all twelve hub genes (CCNB1, AURKA, RACGAP1, CEP55, SMC4, RRM2, PRC1, CKAP2, SMC2, UHRF1, FANCI, and SMC3) in primary liver cancer tissues was highly up-regulated compared to normal tissues ([157]Figure 4). Figure 4. [158]Figure 4 [159]Open in a new tab Expression levels of 12 hub genes in primary liver tumors (n = 371) compared to normal samples (n = 50) from TCGA data. We identified seven regulatory TFs, namely FOXC1, GATA2, NFIC, YY1, E2F1, NFYA, and CREB1. Given the statistically significant result, only E2F1, NFYA, and CREB1 were overexpressed in liver tumor tissues, in contrast to their adjacent normal tissues, and were associated with worse survival for patients with liver tumors, according to the Kaplan–Meier plotter database ([160]Figure 5). Figure 5. [161]Figure 5 [162]Open in a new tab Depiction of expression patterns of three transcription factors: (a) E2F1, (b) NF-YA, and (c) CREB1, highlighting the upregulation of these transcription factors in liver tumors (red) compared to adjacent normal tissues (blue). Sub-figures (d–f) illustrate survival analyses for patients with high expression levels of the TFs E2F1, NF-YA, and CREB1, indicating worse overall survival in liver cancer patients with higher expression levels. 3.6. Clinicopathological Characteristics of Hub Genes To assess the prognostic significance of a specific gene, a cohort of 364 patient samples was divided into two groups: a low-expression cohort and a high-expression cohort, using the median expression level of the proposed biomarker as the cutoff point. A hazard ratio with a 95% confidence interval and a LogRank p value (<0.05 regarded as statistically significant) were used to create Kaplan–Meier survival graphs. Kaplan–Meier curves of the 12 hub genes revealed that patients with higher expression levels of these genes experienced poorer overall survival (OS) and recurrence-free survival (RFS). However, both the OS (LogRank p = 0.33) and RFS (LogRank p = 0.23) for SM3 were statistically insignificant ([163]Figure 6 and [164]Figure S7, respectively). According to the UALCAN database, for all the hub genes, the greater the degree of expression, the higher the grade ([165]Figure 7) and stage ([166]Figure S8) in HCC patients. Therefore, higher grades and stages of HCC lead to a decrease in the differentiation capability in HCC tissues. Figure 6. [167]Figure 6 [168]Open in a new tab Overall survival (OS) analysis of 12 potential hub genes. p < 0.05 indicates a statistically significant difference in mortality between groups. Sub-figures (a–l) demonstrate that patients in the high-expression group experience significantly worse overall survival compared to the low-expression group, except for SMC3 (c), which is not statistically significant (log-rank p = 0.33). HR: hazard ratio of the two groups. Figure 7. [169]Figure 7 [170]Open in a new tab Expression of twelve hub genes in hepatocellular carcinoma patients with different tumor grades using TCGA samples. All boxplots illustrate that patients with higher tumor grades exhibit elevated gene expression in different colors. 3.7. Identification of Novel Candidate Drugs To distill key insights from the analyses of gene-drug interactions, DGIdb and DSigDB were meticulously employed to identify potential therapeutics that may reverse the expression patterns of key genes. DGIdb revealed a comprehensive list of 144 drugs targeting four specific genes (CCNB1, AURKA, RACGAP1, and RRM2), with approximately 13.88% of the compounds being approved medications. DSigDB provided a total of 287 distinct drugs, gathered from different sources, applying a statistically significant p-value < 0.05. The drugs exhibited interactions with all the hub genes. According to DGIdb, drugs or genes with lower interaction scores are implicated in numerous overlapping interactions and will have a higher rank in the search set [[171]50]. Therefore, in this study, we considered the top 10 drugs with the lowest interaction scores from DGIdb, as well as the commonly identified 23 drugs between two databases, for further analyses ([172]Figure S9). 3.8. Molecular Docking Analysis Molecular docking is widely used to anticipate the binding strength and orientation of ligands within the active site of a receptor [[173]70,[174]71]. Here, MLN-8054 was used as the standard drug for AURKA, against which the binding affinity of other compounds was evaluated and compared [[175]72,[176]73]. Notably, tozasertib (−9.8 kcal/mol), tamatinib (−9.6 kcal/mol), ilorasertib (−9.5 kcal/mol), hesperidin (−9.5 kcal/mol), and PF-562271 (−9.3 kcal/mol) exhibited higher binding affinities than MLN-8054 (−9.0 kcal/mol). Clofarabine (−7.7 kcal/mol) against RRM2, and coumestrol (−8.4 kcal/mol) against CCNB1 demonstrated the highest binding energy among the drugs. Binding affinities and non-bonding interactions of ligands with the amino acid residues of the target protein are presented in [177]Table S3. 3.9. Validation of the Findings Using Additional Microarray Dataset ([178]GSE6764) To further validate our findings, we analyzed an additional dataset ([179]GSE6764), including HCC-diagnosed and normal liver tissue samples. We chose this dataset for two reasons: first, it uses a similar expression profiling platform to the primary liver dataset [180]GSE29721; second, it contains samples from patients at different stages, thereby allowing us to compare our primary results with those from different disease stages. However, to maintain consistency with the previous analysis, we screened DEGs across three distinct stages in the new dataset, obtained CDEGs by comparing these results with previous findings, and conducted the remaining analyses following the same method as before. We found 52, 76, and 114 CDEGs in the early stage, advanced stage, and very advanced stage, respectively, with two previous datasets ([181]Table S3). Interestingly, among the previously identified 12 hub genes, 7 genes found in the early stage, i.e., AURKA, CCNB1, CKAP2, FANCI, PRC1, RACGAP1, and RRM2, were present in all three stages. CEP55, SMC4, and UHRF1 were newly identified in the advanced stage and SMC2 in the very advanced stage. However, SMC3 did not appear across any of the stages. All the hub genes were up-regulated in all stages, and their log fold change values significantly increased with the advancement in the stages, which further validated our previous findings regarding the clinicopathological characteristics of hub genes. Based on validation results, it is suggested that these seven hub genes can be used for diagnostic and prognostic purposes even at early-stage HCC. Further, we compared the functional enrichment results of the three stages with previous findings; there we found similarities to a large extent ([182]Table S3). Therefore, these additional analysis results validate our framework by obtaining consistent results in general. 4. Discussion Hepatocellular carcinoma (HCC) is the most prevalent liver tumor globally, with high rates of recurrence and metastasis. However, the early diagnostic tools and underlying pathogenesis of HCC are not fully understood, leading to a poor prognosis. Recent studies have found that aberrant gene expression and mutation are implicated in the tumorigenesis and advancement of HCC [[183]17]. Several studies have suggested that transcriptomic alterations in peripheral blood mononuclear cells (PBMCs) are associated with multiple malignancies such as hepatocellular carcinoma [[184]74], renal cell carcinoma, and pancreatic cancer [[185]75]. PBMCs, which include various blood cells such as lymphocytes and monocytes, are abundant and easy to collect. PBMCs are found to contribute to tumorigenesis and poor prognosis in HCC patients by immune evasion [[186]76]. A study found that various subsets of PBMCs, such as neutrophils, monocytes, and M1 macrophages highly infiltrating the tumor microenvironment, decrease the survival of patients [[187]77]. Furthermore, the integration of datasets from different biological sources allows the identification of commonly dysregulated genes across tissues and the discovery of similar disease pathways, thereby strengthening our understanding of the molecular mechanisms underlying complex diseases. Therefore, we considered the study of transcriptomic changes in PBMC samples along with the liver tumor tissues of HCC patients to identify crucial biomarkers commonly expressed in both and also to enhance the validity and credibility of our findings. This approach also could potentially lead to the utilization of a less invasive method for prognostic and diagnostic evaluation. In the current study, we utilized two GEO microarray datasets to screen differentially expressed genes between liver cancer tissues and non-cancerous tissues. To the best of our knowledge, this is the first study where PBMC samples and liver tissue of HCC patients have been used in a single study to identify crucial genes that are coordinately expressed in both samples. This study identified a total of 176 DEGs between the datasets, including 22 downregulated genes and 86 up-regulated genes. To validate our findings, we checked the expression pattern of all up- and down-regulated genes, and the results were consistent with our initial findings. Subsequently, we validated our study using a gold benchmark database such as DisGeNet. Functional and pathway enrichment analyses were performed to explore the biological functions of DEGs and pathways associated with HCC. GO analysis of BP showed that up-regulated genes were mainly enriched in cellular response to DNA damage stimulus, DNA repair, DNA damage checkpoint, and regulation of cyclin-dependent protein serine/threonine kinase activity. Previous studies have reported that enhancement of the DNA repair pathway and mutation of the DNA damage repair pathway have been associated with tumorigenesis, development, and worse survival in hepatocellular carcinoma (HCC) [[188]78,[189]79]. Cyclin-dependent protein kinases (CDKs) control the regulation and expression of various components essential for cell proliferation. So, up-regulation of cyclin-dependent protein serine/threonine kinase activity may prominently implicate in the pathogenesis and progression of HCC; therefore, inhibitors of cyclin-dependent protein kinases can be promising drug targets for HCC, given their successful targeting in other types of cancer [[190]80]. Additionally, the MF (molecular function) of up-regulated genes was significantly enriched in protein binding, RNA-binding, and pyrophosphatase activity. Studies have suggested that RNA-binding proteins (RBPs) function to form RNP (ribonucleoprotein) units or protein-protein complexes; thus, deregulated genes of RBPs are implicated in the development and progression of hepatocellular carcinoma and can serve as therapeutic targets [[191]81,[192]82]. KEGG analysis of up-regulated CDEGs was primarily enriched in the p53 signaling pathway, cellular senescence, cell cycle, and purine metabolism. The p53 gene governs the regulation of diverse biological processes, ranging from cell cycle arrest and apoptosis to senescence and energy metabolism to thwart tumorigenesis [[193]83,[194]84]. So, mutations in the tumor suppressor p53 signaling gene contribute to the proliferation, metastasis, and sensitivity to radiotherapy in HCC. The top pathways included by Reactome were cell cycle, mitotic, and resolution of sister chromatid cohesion. Prior studies have indicated that abnormalities in the cell cycle process and mitotic cell cycle are drivers in the development or advancement of HCC [[195]85,[196]86]. BRCA1, BRCA2, and ATR genes, along with their associated pathways, cyclins, and cell cycle regulation, as well as BRCA1-dependent Ub ligase activity, may contribute to the pathogenesis and progression of HCC, according to BioCarta [[197]87]. The 12 up-regulated hub genes were found to be closely related to cell cycle regulation, and their dysregulated expression can lead to aberrant protein functions, ultimately resulting in uncontrolled cell proliferation, which is one of the hallmarks of cancer [[198]88]. Functional enrichment analysis revealed some crucial processes related to cell cycle such as p53 signaling pathway (CCNB1 and RRM2), cell cycle (CCNB1 and RRM2), mitotic and G1/S transition (CCNB1 and RRM2), cell cycle checkpoints (CCNB1), G2/M transition of mitotic cell cycle (CCNB1 and AURKA), DNA repair (FANCI, UHRF1, and SMC3), and mitotic cytokinesis (RACGAP1, CKAP2, and CEP55) [[199]Table S2]. Therefore, the cell cycle disruptions with these hub genes may contribute to the pathogenesis. Hub genes regulating processes that are related to cell cycle disruption are given in [200]Table S2. The study identified CCNB1, AURKA, RACGAP1, CEP55, SMC4, RRM2, PRC1, CKAP2, SMC2, UHRF1, SMC3, and FANCI as the hub genes. We found that all the hub genes were significantly overexpressed in hepatocellular carcinoma. Several studies have demonstrated overexpression of CCNB1, AURKA, CEP55, SMC2, RRM2, and PRC1, which is consistent with our study [[201]62,[202]63,[203]89]. In various studies, these hub genes were suggested as prognostic biomarkers and to be involved in the development, progression, migration, and poor prognosis of HCC patients [[204]58,[205]59,[206]60,[207]61,[208]66,[209]67]. Moreover, high expression levels of these genes, except for SMC3, were associated with worse OS and RFS. Additionally, higher expression levels of these hub genes were found to be associated with higher tumor grade and individual tumor stage, all indicating a poor prognosis in HCC patients. Therefore, given these considerations, this study suggests that these eleven hub genes (CCNB1, AURKA, RACGAP1, CEP55, SMC4, RRM2, PRC1, CKAP2, SMC2, UHRF1, and FANCI) can be potential prognostic biomarkers and therapeutic targets in HCC. We revealed regulatory biomolecules of identified hub genes and CDEGs with high degree values. Among the seven transcription factors (TFs), E2F1, CREB1, and NFYA were found to be overexpressed and associated with worse survival simultaneously in HCC tissue. E2F1, an important transcription factor that regulates the transcription of particular genes essential for cell cycle progression, has been found to be involved in HCC development, progression, and invasion by manipulating the expression of several genes, including MREG and DDX11 [[210]90,[211]91]. While inducing tumorigenesis, E2F1 activates the PI3K/AKT/mTOR pathway [[212]92]. A previous study reported the overexpression of E2F1 in liver tumor tissues, which aligns with our findings [[213]90]. The metastatic and proliferative characteristics of HCC cells are promoted via CREB1, driven by the HBx-CTTN coordination [[214]93]. Furthermore, we found elevated expression of CREB1 with poor survival outcomes in HCC, which was also supported by other studies [[215]94]. NF-YA is one of three isoforms of NF-Y, a ubiquitous trimetric transcription factor. As NF-YA regulates cell-cycle genes, any changes in expression thereby contribute to tumorigenesis [[216]95]. NF-YA has been found to be up-regulated in various cancers, such as cervical cancer and HCC [[217]96,[218]97]. Additionally, it is reported that, through the MEK/ERK pathway, NF-YA promotes the proliferation and invasion of HCC cells [[219]95]. Finally, we utilized drug repurposing strategies to find small molecular compounds capable of reversing the expression patterns of key genes. Among 33 identified drugs, this study presents seven drugs based on good binding affinity against their respective target proteins. AURKA, a member of the serine/threonine kinase family, plays a pivotal role in cell proliferation, particularly in the G2/M phase, as well as in processes such as DNA damage checkpoint recovery, centrosome maturation, and spindle formation [[220]98,[221]99]. AURKA has been found to be involved in tumorigenesis, affecting cell proliferation, apoptosis, and metastasis in hepatocellular carcinoma [[222]100]. Thus, several drugs targeting AURKA have been produced and are in various phases of clinical trial. This study focused on drugs that target this gene. Tozasertib, an inhibitor of Aurora A, B, and C, is the first AKI (aurora kinase inhibitor) that has clinically been used for the treatment of CML and ALL [[223]101]. Tozasertib exhibited the highest binding affinity against AURKA. Hesperadin, a small-molecule inhibitor with activity against Aurora A and/or B, is being developed as the first-generation AKI [[224]102]. Ilorasertib is an aurora kinase inhibitor and has been used for phase I and phase II-advanced solid malignancies and phase I-advanced hematologic malignancies [[225]103]. PF-562271, a FAK/Pyk2 tyrosine kinase inhibitor, has shown effectiveness in inhibiting tumor growth and recurrence when used in combination with sunitinib in rats [[226]104]. Tamatinib is an active metabolite of the prodrug fostamatinib that has been approved for chronic immune thrombocytopenia and investigated for its effectiveness against sorafenib-resistant hepatic cell lines [[227]105,[228]106]. No studies have been found investigating the activity of tamatinib and ilorasertib against HCC. All the molecules targeting AURKA bind at a common site, as depicted in [229]Figure S10. Coumestrol, a phytoestrogen, has been notified as a chemo-preventive in breast cancer [[230]107]. However, there are currently no studies reporting coumestrol’s activity against CCNB1. Elevated expression of human ribonucleotide reductase is observed to play roles in diseases such as cancer, rendering this enzyme an appealing target for drug discovery [[231]108,[232]109]. According to molecular docking analysis, clofarabine was found to bind to the same site as triapine ([233]Figure S10), with a binding affinity of −7.7 kcal/mol against RRM2 [[234]110]. Research indicates that hydrogen bonding plays a critical role in the specificity of ligand binding, with hydrogen bonds formed within distances of 1.5 to 2.5 Å contributing to enhanced binding characteristics to different extents [[235]111]. Significant conventional hydrogen bonds with ASP274 LYS162 have been found in almost all drugs ([236]Figure S11). Notably, clofarabine showed hydrogen bonds with all the amino acid residues that justify strong binding interactions with proteins. Non-covalent interactions confer stability to the ligand at its target site and regulate its efficacy [[237]112]. Deformability measures the flexibility of each amino acid residue during distortion. Distortions of all amino acids in the three proteins remained within permissible limits ([238]Figure S12). These unequivocally signify the existence of thermodynamically favorable conformations for each of these interactions. This study presents findings based on bioinformatics analysis; therefore, experimental validation in biological systems is essential to enhance credibility. Biomarkers should be highly specific and sensitive for use in diagnosis; thus, prospective studies with large populations are required, utilizing various techniques such as Western blot and qRT-PCR. In vitro and in vivo experiments should also be conducted to confirm the potential of repurposed drugs in clinical use. However, our study suggests potential noninvasive biomarkers that could predict the prognosis of HCC patients, as well as promising treatments for HCC. 5. Conclusions The purpose of the current study was to identify prognostic biomarkers and potential therapeutic targets and reposition drugs to achieve better treatment strategies in HCC-related disabled patients by leveraging integrated bioinformatics analyses. The insights gained from this study may be of assistance in understanding the molecular mechanisms deeply driving tumorigenesis and the advancement of HCC. The hub genes and TFs uncovered in our study might hold promise as robust prognostic biomarkers and therapeutic targets, while the small molecular compounds identified may offer potential as targeted agents against HCC. However, further in vitro and in vivo validation should be undertaken to assess the safety and efficacy of the proposed drug candidates. Supplementary Materials The following supporting information can be downloaded at: [239]https://www.mdpi.com/article/10.3390/biology13120966/s1, Table S1: Common differentially expressed genes (CDEGs) between [240]GSE29721 and [241]GSE49515; Figure S1: Up-regulated genes shared by two datasets a. Up-regulated genes b. Down-regulated genes; Figure S2: Gene ontology and pathway analysis of down regulated DEGs a. Gene ontologies of down-regulated genes, b. Enriched molecular pathways by down-regulated DEGs in HCC patients; Figure S3: GO and KEGG pathway enrichment analyses of hub genes; Table S2: Hub genes regulating processes that are related to cell cycle disruptions; Figure S4: Hub proteins identified by four different cytoHubba algorithms; a. MCC b. Degree c. Closeness d. EPC; Figure S5: Illustration of regulatory molecules in HCC. The larger the nodes in squared shape, the higher the degree of interactions and their significance. and the edges indicate the interconnection between genes and regulatory molecules a. TFs-CDEGs interaction b. TFs- hub genes interaction; Figure S6: Analysis of CDEGs/hub genes-miRNAs interaction networks reveals the post-transcriptional regulators of the CDEGs and hub genes in HCC. This figure illustrates the interactions of 12 hub genes and 176 CDEGs with their regulatory elements: a. miRNA-CDEGs interaction and b. miRNAs-hub genes interaction. Square-shaped nodes represent the miRNAs identified in this study; Figure S7: Recurrence-free survival (RFS) of 12 hub genes. p value < 0.05 was considered significant; Figure S8: A depiction of the expression level of hub genes in hepatocellular carcinoma patients using TCGA samples illustrate that the higher the tumor stage, the greater the expression of all hub genes. (Due to the small sample size of stage 4 (n = 6), bias in the result could arise, therefore, it was not included in the analysis); Figure S9: Summary of the identified 33 drug candidates a. 33 prospective candidate drugs with their corresponding targets according to both databases. b. Categorization of the identified candidate drugs according to advancement in therapeutic pipeline c. Drug target with their corresponding percentage of all drugs; Figure S10: Docked complex of a. AURKA (PDB ID: 5DT0) with i. PF-562271 ii. Ilorasertib iii. Hesperidin iv. Tozasertib v. Tamatinib vi. MLN-8054 b. CCNB1 (PDB ID: 6GU2) with Coumestrol c. RRM2 (PDB ID: 3OLJ) with clofarabine; Figure S11: Nonbonding interactions of a. MLN8054 b. Tozasertib c. Ilorasertib d. Tamatinib e. Coumestrol f. Clofarabine with the amino acid residues of target protein; Table S3. Nonbonding interactions of drug molecules with their targeted receptor proteins; Figure S12: Deformability of receptor proteins bounded with drug in the active site, highlighting the specific amino acid most susceptible to breakdown due to interaction with drug; a. RRM2 b. CCNB1 c. AURKA; Table S4: Comparative results between primary datasets ([242]GSE49515 and [243]GSE29721) and validation dataset ([244]GSE6764) [245]biology-13-00966-s001.zip^ (3.6MB, zip) Author Contributions Conceptualization, M.A.M. and. A.A.; methodology, M.O.R. and M.A.H.; validation, M.A.M., A.A. and M.N.A.; formal analysis, visualization, and data curation, M.O.R., A.D., N.N., M.A.H. and J.-E.-T.; writing—original draft preparation, M.O.R., A.S.A.-M., N.A. and S.A.A.; writing—review and editing, A.A., M.N.A. and M.A.M.; supervision, M.N.A. and M.A.M.; funding acquisition, A.S.A.-M., N.A. and S.A.A. All authors have read and agreed to the published version of the manuscript. Institutional Review Board Statement Not applicable. Informed Consent Statement Not applicable. Data Availability Statement The original data presented in the study are openly available in the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO) open access database ([246]http://www.ncbi.nlm.nih.gov/geo/) (accessed on 5 Feburary 2024) with accession numbers [247]GSE29721 [[248]25] and [249]GSE49515 [[250]26]. Conflicts of Interest The authors declare no conflicts of interest. Funding Statement The authors extend their appreciation to the King Salman Center for Disability Research for funding this work through research group no KSRG-2023-456. Footnotes Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content. References