Abstract

The COVID-19 pandemic has become a significant global public health issue. While it is largely associated with respiratory complications, recent reports indicate that patients also experience neurological symptoms and other health issues. The objective of this study is to examine the network of protein-protein interactions (PPI) between SARS-CoV-2 proteins and human host proteins, pinpoint the central genes within this network implicated in disease pathology, and assess their viability as targets for drug development. The study adopts a network-based approach to construct a network of 29 SARS-CoV-2 proteins interacting with 2896 host proteins, with 176 host genes identified as interacting with all the viral proteins. Gene ontology and pathway analysis of these host proteins revealed their role in biological processes such as translation and mRNA splicing, and in ribosomal pathways. We further identified EEF2, RPS3, RPL9, RPS16, and RPL11 as the top 5 most connected hub genes in the disease-causing network, with significant interactions among each other. These hub genes were found to be involved in ribosomal pathways and cytoplasmic translation. Further, a disease-gene interaction network was prepared to investigate the role of the hub genes in other disorders and to understand comorbidity in COVID-19 patients. We also identified 13 drug molecules that interact with all the hub genes, and estradiol emerged as the top candidate drug for COVID-19 patients. Our study provides valuable insights from the protein-protein interaction network of SARS-CoV-2 proteins with host proteins, highlights the molecular basis of the manifestation of COVID-19, and proposes a drug for repurposing.
As the pandemic continues to evolve, it is anticipated that investigating SARS-CoV-2 proteins will remain a critical area of focus for researchers globally, particularly in addressing potential challenges posed by specific SARS-CoV-2 variants in the future.

Keywords: COVID-19, SARS-CoV-2 proteins, Comorbidity, Host interacting genes, Ribosomal pathways, Drug repurposing

Highlights
• A network-based approach was used to create a network comprising 29 SARS-CoV-2 proteins interacting with 2896 human host proteins.
• The network reveals 176 host genes that interact with all viral proteins and are involved in translation, splicing and ribosomal pathways.
• The top 5 most connected hub genes in the disease-associated network were determined to be EEF2, RPS3, RPL9, RPS16, and RPL11.
• 292 drug molecules were discovered to interact with hub genes, and 13 molecules were found to interact with all hub genes.
• Estradiol was identified as interacting with all the hub genes in the drug-gene interaction databases.

1. Introduction

Since its emergence in late 2019, the COVID-19 pandemic has led to unparalleled disruptions in societies and economies worldwide [[37]1]. As of early 2023, the virus continues to pose a significant threat, with new cases and deaths still being reported worldwide. The SARS-CoV-2 genome exhibits a high degree of diversity, possibly influenced by environmental pressures and regional health indices [[38]2,[39]3]. The emergence of new variants has added further complexity to the situation, leading to renewed concerns about the effectiveness of existing vaccines and antivirals, and about the potential for further waves of infection [[40][4], [41][5], [42][6], [43][7], [44][8]]. While vaccination efforts have made significant progress in some parts of the world, the slow distribution of vaccines and the rise of vaccine hesitancy in other areas remain major challenges.
The pandemic has sparked a surge in research endeavors focused on comprehending the virus and devising efficacious treatments and vaccines [[45]9,[46]10]. Studies have focused extensively on the proteins of the virus, revealing that SARS-CoV-2, an enveloped, single-stranded RNA virus, encodes 29 proteins [[47]11]. These include structural proteins such as the spike, envelope, membrane, and nucleocapsid proteins, as well as nonstructural proteins like RNA-dependent RNA polymerase (RdRp), papain-like protease (PLpro), and main protease (Mpro). These proteins are pivotal in facilitating SARS-CoV-2 replication within the human body, leading to cell infection, illness induction, and the onset of diverse manifestations throughout the body [[48][12], [49][13], [50][14], [51][15]]. A systematic network-theoretical approach to studying host-pathogen protein interactions offers valuable insights into the mechanisms that underlie a wide range of infectious diseases [[52][16], [53][17], [54][18]]. Both structural and nonstructural proteins play crucial roles in the virus lifecycle. Among them, the spike protein (S) is indispensable for facilitating viral entry into host cells [[55]19], while the envelope protein (E) aids in viral assembly and release [[56]20]. The membrane protein (M) plays a role in viral assembly [[57]21], and the nucleocapsid protein (N) binds to viral RNA and is involved in viral replication [[58]22]. Nonstructural proteins like RdRp, PLpro, and Mpro are crucial for viral replication and are potential targets for antiviral drugs [[59][8], [60][9], [61][10],[62]12,[63]23]. Moreover, recent studies have shown that the nonstructural proteins of SARS-CoV-2, including PLpro and Mpro, can play an important role in immune evasion strategies [[64]24]. SARS-CoV-2 proteins have been shown to interact with a large number of host proteins, which indicates the importance of host genes in the virus-host interaction [[65]25].
In this study, we identified 176 host genes, out of 2896 total host genes, that interact with all 29 viral proteins, indicating their critical role in SARS-CoV-2 infections. The gene ontology analysis revealed that these genes are involved in several biological processes, including translation, mRNA splicing, and ribosome biogenesis. These findings suggest that these genes may serve as potential therapeutic targets for the treatment of COVID-19. Additionally, we investigated the potential of drug repurposing by identifying drugs that interact with the hub genes and examining their potential therapeutic effects.

2. Materials and methods

2.1. Generation of SARS-CoV-2 protein and host proteins interaction network

The compilation of host proteins demonstrating interaction with SARS-CoV-2 proteins was sourced from the studies conducted by Gordon et al. [[66]25,[67]26]. To construct the interaction network between viral proteins and host proteins, the STRING plugin [[68]27] within the Cytoscape tool [[69]28] was utilized. This plugin leverages various data sources, including text-mining data, gene fusion, co-expression, neighborhood analysis, and experimental data, to generate the protein-protein interaction (PPI) network.

2.2. Generation of PPI network of host proteins and identification of hub genes

The host proteins interacting with all viral proteins were compiled, and a protein-protein interaction (PPI) network was constructed using the Cytoscape tool. Subsequently, the network analyzer plugin within Cytoscape was employed to compute various network parameters. Hub genes within the network were identified based on their topological significance.
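As a minimal illustration of what "topological significance" involves, the sketch below computes three of the measures defined in the remainder of this section (degree of connectivity, closeness centrality, and topological coefficient) on a toy undirected, unweighted graph and ranks nodes by degree. It is not the Cytoscape network analyzer implementation used in the study: the function names, the toy edge list, and the assumption that every edge has weight 1 are all hypothetical choices made for illustration.

```python
from collections import deque


def build_adjacency(edges):
    """Build an undirected, unweighted adjacency map from (a, b) pairs."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def degree(adj, node):
    """Degree of connectivity k: number of edges at the node (all weights 1)."""
    return len(adj[node])


def closeness(adj, node):
    """1 / average shortest-path length to reachable nodes; 0 if none."""
    dist = {node: 0}
    queue = deque([node])
    while queue:  # breadth-first search yields shortest paths when unweighted
        current = queue.popleft()
        for neighbor in adj[current]:
            if neighbor not in dist:
                dist[neighbor] = dist[current] + 1
                queue.append(neighbor)
    lengths = [d for n, d in dist.items() if n != node]
    return len(lengths) / sum(lengths) if lengths else 0.0


def topological_coefficient(adj, node):
    """avg(j(node, p)) / k over all nodes p sharing a neighbor with node."""
    neighbors = adj[node]
    k = len(neighbors)
    if k < 2:  # zero or one neighbor: coefficient defined as zero
        return 0.0
    partners = set()
    for neighbor in neighbors:
        partners |= adj[neighbor]
    partners.discard(node)
    if not partners:
        return 0.0
    scores = []
    for p in partners:
        shared = len(neighbors & adj[p])
        if p in neighbors:  # add 1 when node and p are directly linked
            shared += 1
        scores.append(shared)
    return (sum(scores) / len(scores)) / k


def rank_hubs(adj, top_n=5):
    """Return the top_n node names by degree (ties broken by name)."""
    return sorted(adj, key=lambda n: (-degree(adj, n), n))[:top_n]


# Toy edge list (hypothetical node names, not data from the study).
TOY_EDGES = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("D", "E")]
adj = build_adjacency(TOY_EDGES)
hubs = rank_hubs(adj, top_n=2)
print(hubs)
```

On a real network the same functions would be applied to the edge list exported from the PPI network; ranking by degree alone is a simplification, since the study also considers betweenness centrality when selecting hubs.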
Calculated network topological properties such as degree of connectivity (k), betweenness centrality, closeness centrality, and topological coefficient values were used to pinpoint highly connected nodes. The degree of connectivity (k) represents the number of interactions established by a node within a network. It is expressed as the (weighted) total number of edges connected to a particular node [Equation [70](1)].

$$k(a) = \sum_{b \in K_a} w(a,b) \tag{1}$$

In Equation [71](1), $K_a$ represents the node set containing all the neighbors of node a, and w(a,b) denotes the weight of the edge connecting node a with node b. Betweenness centrality ($C_b$) quantifies the extent to which a node serves as an intermediary along the shortest paths between other nodes in the network. A node with higher betweenness centrality holds more influence over the flow of information within the network. It is expressed as Equation [72](2):

$$C_b(u) = \sum_{k \neq u \neq f} \frac{p(k,u,f)}{p(k,f)} \tag{2}$$

In Equation [73](2), p(k,u,f) represents the number of shortest paths from node k to node f that pass through node u, while p(k,f) denotes the total number of shortest paths between node k and node f. Closeness centrality ($C_c$) gauges the efficiency with which information propagates from a given node to other nodes in the network. The value of closeness centrality ranges from 0 to 1, where isolated genes exhibit a closeness centrality value of zero [Equation [74](3)].

$$C_c(z) = \frac{1}{\mathrm{avg}\left(L(z,m)\right)} \tag{3}$$

In Equation [75](3), z represents the node for which the closeness value is being calculated, and L(z,m) denotes the length of the shortest path between two nodes z and m. It has been observed that genes with a high degree of connectivity also tend to have a high closeness centrality score. The topological coefficient ($T_f$) reflects the propensity of nodes in the network to share common neighbors.
Nodes with zero or one neighbor are assigned a topological coefficient of zero. For a node f with $k_f$ neighbors, the topological coefficient is calculated as [Equation [76](4)]:

$$T_f = \frac{\mathrm{avg}\left(j(f,p)\right)}{k_f} \tag{4}$$

In Equation [77](4), j(f,p) represents the number of shared neighbors between nodes f and p, incremented by 1 if there exists an edge between nodes f and p; the average is taken over the nodes p that share at least one neighbor with f.

2.3. Preparation of disease-gene interaction network

Following the identification of hub genes within the network, databases containing information on disease-gene interactions, such as GeneORGANizer, DisGeNET, and the MalaCards database [[78][29], [79][30], [80][31]], were surveyed. This screening aimed to pinpoint the disorders linked with the identified hub genes. These databases facilitate the analysis of the association between genes and the organs they affect. A collection of approximately 2 million disease-gene interactions was extracted from these databases. Subsequently, the disease-gene interactions specifically relevant to the identified hub genes were filtered and retrieved for analysis.

2.4. Gene ontology and pathways enrichment analysis

The Database for Annotation, Visualization, and Integrated Discovery (DAVID) [[81]32] was employed to conduct a comprehensive enrichment analysis of RNA binding proteins. DAVID utilizes the Gene Ontology (GO) database and the Kyoto Encyclopedia of Genes and Genomes (KEGG) database [[82]33] for functional and pathway enrichment analysis. Functional enrichment analysis encompasses examination at the biological-process, cellular-component, and molecular-function levels. Pathways and functions with a P-value less than 0.05 were deemed significantly enriched and were further investigated in the study.

2.5.
Identification of drug compounds as regulators

To identify drug compounds that act as regulators of the identified hub genes, databases containing drug-gene interaction information, such as PubChem [[83]34], DrugBank [[84]35], and the Comparative Toxicogenomics Database (CTD) [[85]36], were screened thoroughly. In addition to the databases mentioned above, the Enrichr database [[86]37] was also utilized to identify drugs interacting with the selected hub genes. Enrichr is a web-based tool grounded on Gene Set Enrichment Analysis (GSEA), which consolidates information concerning the function of groups of genes. Enrichr, in its backend, scans multiple drug-gene interaction databases alongside the GEO database and presents the relevant significant interactions. Further, the NCBI GEO Profiling database [[87]38] was used to study the effect of the selected drug on the expression of the identified hub genes.

3. Results

3.1. Protein-protein interaction network of SARS-CoV-2 proteins targeted host proteins

The list of interactions between SARS-CoV-2 proteins and the human genes they target was retrieved from the Gordon et al. studies. From the list, we identified that 29 SARS-CoV-2 proteins showed interactions with 2896 host proteins. The protein-protein interaction network of viral proteins with host proteins was prepared using the Cytoscape tool [[88]Fig. 1] [[89]Supplementary Table 1]. From the prepared network, many host proteins were identified as having interactions with multiple viral proteins. A total of 176 host genes were identified as interacting with all 29 viral proteins, indicating their importance during SARS-CoV-2 infection.

Fig. 1. Protein-protein interaction network of SARS-CoV-2 proteins (red nodes) with the host proteins (blue nodes) to identify the reach of viral proteins in the host's body.

3.2.
Protein-protein interaction network of host gene and identification of hub-genes

After identifying the 176 host proteins interacting with all the viral proteins, we used the STRING plugin of the Cytoscape tool to prepare a PPI network to study the interactions among these virus-targeted proteins [[92]Fig. 2] [[93]Supplementary Table 2]. The prepared network showed high-density interactions among the proteins, suggesting a high degree of dependency between the proteins in their functional roles. The gene ontology analysis of these host proteins reveals their role in biological processes such as translation, cytoplasmic translation, mRNA splicing via spliceosome, rRNA processing, ribosomal small subunit biogenesis, RNA splicing, mRNA splicing, and alternative mRNA splicing. Cellular components were enriched in ribosome, cytosolic ribosome, cytosolic small ribosomal subunit, and nucleoplasm, whereas the molecular functions were enriched in RNA binding, structural constituent of ribosome, mRNA binding, protein binding, cadherin binding, DNA binding and ubiquitin ligase inhibitor activity [[94]Fig. 3] [[95]Supplementary Table 3]. Pathway analysis of the genes revealed their roles in ribosomal pathways, spliceosome pathways and Parkinson disease [[96]Table 1] [[97]Supplementary Table 3].

Fig. 2. PPI network of host proteins showing direct interactions with the SARS-CoV-2 proteins.

Fig. 3. Gene ontology analysis of the identified 176 host proteins showing direct interaction with viral proteins.

Table 1. KEGG pathway analysis of the host genes showing interaction with the viral proteins.
| Category | Term | Count | P value |
|---|---|---|---|
| KEGG_PATHWAY | hsa03010:Ribosome | 47 | 1.40E-51 |
| KEGG_PATHWAY | hsa03040:Spliceosome | 17 | 4.60E-11 |
| KEGG_PATHWAY | hsa05012:Parkinson disease | 10 | 0.007 |

Further, we calculated the topological parameters of the network for the identification of the hub genes [[103]Supplementary Table 4]. Hub genes are those genes having the highest number of direct interactions with the other genes in the network, suggesting their importance in regulating the behavior of a major part of the disease-causing network. To reinforce our earlier assertion regarding the high-density interaction network, we computed the eccentricity values of the network. In a biological network, the eccentricity of a node can be understood as its accessibility to be functionally reached by all other nodes in the network. Nodes with a higher eccentricity value compared to the average eccentricity value of the network can exert more influence on other nodes in the network and, conversely, can also be more easily influenced. Our observation revealed that more than 88% of the nodes in the network possessed a higher eccentricity value than the average eccentricity value, indicating a strong interconnection among the nodes in the network. Among the nodes, EEF2 (k = 78), RPS3 (k = 72), RPL9 (k = 71), RPS16 (k = 70), and RPL11 (k = 69) emerged as the top 5 most connected genes in the network, exhibiting a high degree of connectivity (k) and betweenness centrality value. Gene ontology analysis of the identified hub genes reveals their role in biological processes such as translation, cytoplasmic translation and rRNA processing. Cellular components were enriched in cytosolic ribosome, ribosome, membrane, rRNA binding, focal adhesion, extracellular exosome, cytosol, cytoplasm, small ribosomal subunit, polysomal ribosome, rRNA processing, whereas the molecular functions were enriched in structural constituent of ribosomes, RNA binding, and rRNA binding [[104]Fig.
4] [[105]Supplementary Table 5]. Pathway analysis reveals the role of the hub genes in ribosomal pathways [[106]Table 2].

Fig. 4. Gene ontology analysis of the hub genes identified from the network.

Table 2. KEGG pathway analysis of the identified hub genes from the host PPI network.

| Category | Term | Count | P value | Genes | FDR |
|---|---|---|---|---|---|
| KEGG_PATHWAY | hsa03010:Ribosome | 4 | 3.21E-05 | RPS16, RPL11, RPS3, RPL9 | 1.92E-04 |

3.3. Disease & hub-genes interaction network

Following the identification of the most influential genes in the network, a disease-gene interaction network was constructed to establish connections between the hub genes, COVID-19, and other disorders. This disease-gene interaction network was created to gain insights into the comorbidities observed in COVID-19 patients. The network reveals that 4 of the 5 hub genes were associated with multiple disorders such as respiratory difficulties, dysarthria, congestive heart failure, melanoma, anaemia, Parkinson's disease, cerebellar atrophy, truncal ataxia and more. The network further illustrates that numerous disorders share common genes. For instance, genes such as RPL11 (k = 79), EEF2 (k = 9), RPL6 (k = 3), and RPS3 (k = 2) are associated with multiple disorders [[110]Fig. 5].

Fig. 5. Disease-gene interaction network: the network represents the association of the identified hub genes with other disorders, creating an association between COVID-19 and other disorders.

3.4. Drug repurposing

The Enrichr database was used to examine the expression of the hub genes in COVID-19 patients, and interestingly all five hub genes showed downregulated expression in COVID-19 patients [[113]Fig. 6A]. To identify potential drugs targeting these hub genes, several drug-gene interaction databases such as PubChem, DrugBank and CTD were screened.
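The screening step reduces to a set operation over drug-gene interaction records: collect the hub genes each drug touches, then separate drugs that hit at least one hub gene from drugs that hit all of them. The sketch below shows this under stated assumptions: the record format (a list of `(drug, gene)` pairs), the function name, and the placeholder drug names are hypothetical and do not reflect the export format of PubChem, DrugBank or CTD, nor the data used in the study.

```python
from collections import defaultdict

# Hub genes named in the text.
HUB_GENES = frozenset({"EEF2", "RPS3", "RPL9", "RPS16", "RPL11"})


def drugs_by_hub_coverage(interactions, hub_genes=HUB_GENES):
    """Split drugs into those hitting any hub gene and those hitting all.

    interactions: iterable of (drug, gene) pairs (assumed record format).
    Returns (any_hub, all_hubs) as two sets of drug names.
    """
    targets = defaultdict(set)
    for drug, gene in interactions:
        if gene in hub_genes:  # ignore interactions with non-hub genes
            targets[drug].add(gene)
    any_hub = set(targets)
    all_hubs = {drug for drug, genes in targets.items() if genes == hub_genes}
    return any_hub, all_hubs


# Toy records with placeholder drug names (not data from the study).
TOY_INTERACTIONS = [
    ("drug_1", "EEF2"), ("drug_1", "RPS3"), ("drug_1", "RPL9"),
    ("drug_1", "RPS16"), ("drug_1", "RPL11"),
    ("drug_2", "EEF2"), ("drug_2", "TP53"),
    ("drug_3", "TP53"),
]
any_hub, all_hubs = drugs_by_hub_coverage(TOY_INTERACTIONS)
print(sorted(any_hub), sorted(all_hubs))
```

In practice the records from the different databases would first need gene symbols and drug names normalized to a common vocabulary before being merged; that step is omitted here.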
A total of 292 drug molecules were identified as interacting with the hub genes. Out of these 292 molecules, we further identified 13 molecules interacting with all the hub genes [[114]Fig. 6C]. We also screened the Enrichr database for potential drug candidates. The GSEA of the drug perturbations from GEO database records of upregulated genes revealed estradiol as the most significantly enriched candidate [[115]Fig. 6B]. Interestingly, while screening the drug-gene interaction databases we also identified estradiol as interacting with all the hub genes, as represented in [116]Fig. 6B.

Fig. 6. (A) Enrichr heatmap representing the downregulation of all the identified hub genes in COVID-19 conditions. (B) Heatmap from the Enrichr database identifying the drugs showing interactions with hub genes. (C) Drug-gene interaction network representing the drugs interacting with the hub genes.

Next, we scanned the GEO profiles related to estradiol against all the hub genes and identified that estradiol increases the expression of all the hub genes (EEF2, RPS3, RPL9, RPS16, and RPL11) [[119]Fig. 7A–E], suggesting the potential of estradiol in the treatment of COVID-19 patients.

Fig. 7. In-silico validation of the effects of estradiol on the expression of the hub genes EEF2 (A), RPS3 (B), RPL9 (C), RPS16 (D) and RPL11 (E).

4. Discussion

SARS-CoV-2, a single-stranded RNA virus, possesses a relatively large genome [[122]11]. This genome encompasses all the instructions required for the virus to replicate and generate new viral particles. The virus also carries several non-structural proteins that aid in viral replication and evasion of the host immune response [[123]12,[124]23].
Many thousands of studies have been conducted worldwide on the SARS-CoV-2 viral proteins, in which researchers have focused on understanding how they function, how they interact with host cells, and how they can be targeted by drugs or vaccines [[125]23]. One of the most critical aspects of COVID-19 research is identifying the interactions between viral proteins and host proteins [[126]25]. By studying these interactions, researchers can better understand the mechanisms by which the virus infects cells and identify potential targets for therapeutic intervention. In this study we used network analysis to identify the host proteins that interact with all 29 (structural and non-structural) SARS-CoV-2 proteins and identified 176 genes that play a critical role in COVID-19 pathogenesis. Moreover, the study identified the hub genes EEF2, RPS3, RPL9, RPS16, and RPL11. Hub genes are those with the highest number of direct interactions with other genes in the network. In addition, these hub genes also show interactions with each of the 29 proteins of SARS-CoV-2 ([127]Supplementary Table 1). The extensive connectivity of these hub genes across the viral protein repertoire highlights their central role in orchestrating molecular responses within the host-virus interplay. EEF2, or eukaryotic elongation factor 2, is responsible for the translocation of the ribosome during protein synthesis [[128]39]. Ribosomal Protein S3 (RPS3), Ribosomal Protein L9 (RPL9), Ribosomal Protein S16 (RPS16), and Ribosomal Protein L11 (RPL11), in turn, are ribosomal proteins that are involved in the assembly of ribosomes and translation initiation. These proteins help to stabilize the binding of mRNA to the ribosome for translation [[129]40]. We further analysed the expression of these hub genes through Enrichr and found that all five hub genes were downregulated in COVID-19 patients.
The downregulation of these genes may have various effects on the body. For example, the downregulation of EEF2 in COVID-19 patients may impair protein synthesis and affect various physiological functions [[130]41]. EEF2 has also been linked to the regulation of autophagy, a cellular process that removes damaged proteins and organelles [[131]42]. Downregulation of EEF2 may consequently impact the autophagy process and potentially contribute to the pathogenesis of COVID-19. RPS3, RPL9, RPS16, and RPL11 are ribosomal proteins that are essential for the formation of ribosomes and protein synthesis [[132]43]. The downregulation of these genes may impair the ribosome biogenesis process and affect protein synthesis [[133]44]. This may have various effects on the body, including a decrease in the production of immune-related proteins and an impairment in the response to viral infections [[134]45]. Additionally, these hub genes have been associated with various biological processes, including translation, cytoplasmic translation, mRNA splicing, and rRNA processing. The downregulation of these genes may thus affect these processes and contribute to the pathogenesis of COVID-19. Overall, the downregulation of the EEF2, RPS3, RPL9, RPS16, and RPL11 genes in COVID-19 patients may have various effects on the body, including impairments in protein synthesis, autophagy, and immune responses [[135]46]. These hub genes are potential targets for drug repurposing, which involves using existing drugs to treat new diseases and is an important approach in the search for effective COVID-19 treatments. In this study, we identified 292 drug molecules that interact with the EEF2, RPS3, RPL9, RPS16, and RPL11 hub genes, which were found to be downregulated in COVID-19 patients. Furthermore, we identified 13 potential drug molecules that interact with all the hub genes. Among the possible therapeutic candidates, estradiol was shown to interact with all the hub genes.
Estradiol, a steroid hormone, was identified as a drug that interacts with all five hub genes (EEF2, RPS3, RPL9, RPS16, and RPL11) involved in protein synthesis and immune response in COVID-19 patients. The significance of this study lies in the potential therapeutic value of estradiol in the treatment of COVID-19. Research has shown that estradiol can regulate the expression of genes involved in protein synthesis. Estradiol has been found to increase the expression of EEF2, a gene involved in elongation during protein synthesis, and of ribosomal protein genes, which are involved in ribosome assembly and translation initiation [[136]47,[137]48]. Furthermore, estradiol has been shown to have immunomodulatory effects. Estradiol can regulate the production of cytokines and chemokines, crucial components of the immune response to infections [[138]49]. The ability of estradiol to modulate the immune response and regulate protein synthesis makes it a promising drug candidate for the treatment of COVID-19. Several studies have investigated the potential therapeutic value of estradiol in COVID-19. A study published in the Journal of Women's Health reported that postmenopausal women taking estradiol had a reduced risk of hospitalization and death from COVID-19 [[139]50]. Another study published in the journal Menopause found that estradiol treatment was associated with a lower risk of severe COVID-19 outcomes in women [[140]51]. It is crucial to acknowledge, however, that not all studies have identified a significant association between estradiol and COVID-19 outcomes. Investigations into the potential therapeutic benefits of estradiol in COVID-19 are still in their preliminary phases, and additional research is warranted to fully understand its effects and potential clinical applications.
Moreover, our analysis revealed that estradiol upregulates the expression of all five hub genes involved in protein synthesis and immune response in COVID-19 patients. These findings suggest that estradiol may have potential therapeutic effects against COVID-19.

5. Conclusion

In conclusion, our study offers valuable insights into the gene-protein interaction networks of SARS-CoV-2 proteins and host proteins. Our findings suggest that the identified hub genes may provide possible targets for therapeutic intervention in the treatment of COVID-19. Furthermore, our analysis of drug-gene interactions identified potential drugs, specifically estradiol. The ability of estradiol to interact with all five hub genes involved in protein synthesis and immune response in COVID-19 patients makes it a promising drug candidate for the treatment of COVID-19. The ability of estradiol to regulate protein synthesis and modulate the immune response suggests that it may have significant therapeutic value in COVID-19. However, further research is essential to fully understand the effects and possible clinical applications of the drug molecules identified through in-silico analysis. While this computational approach helps pinpoint potential drugs from a large pool, it is crucial to note that the actual effects of these candidates require validation through clinical analysis. To address this, we are in the process of designing a systematic pipeline for the future validation of our findings through clinical studies. Overall, our study highlights the importance of genes, protein-protein interaction networks, and drug repurposing in the search for effective COVID-19 treatments.

Ethical statement

The present study does not require any ethical clearance.

Data availability

The data that support the findings of this study are within the manuscript and in the supplementary file.
CRediT authorship contribution statement Wajihul Hasan Khan: Writing – original draft, Methodology, Data curation, Conceptualization. Razi Ahmad: Writing – original draft, Data curation, Conceptualization. Ragib Alam: Software, Formal analysis. Nida Khan: Software, Formal analysis, Data curation. Irfan A. Rather: Funding acquisition, Formal analysis, Data curation. Mohmmad Younus Wani: Writing – review & editing, Investigation, Formal analysis. R.K. Brojen Singh: Writing – review & editing, Supervision, Resources. Aijaz Ahmad: Writing – review & editing, Validation, Supervision, Resources, Conceptualization. Declaration of competing interest The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper. Acknowledgment