Abstract

Purpose

   The graph coloring approach has emerged as a valuable problem-solving
   tool for both theoretical and practical problems across various
   scientific disciplines, including biology. In this study, we
   demonstrate the effectiveness of graph coloring in computational
   network biology, more precisely in analyzing protein–protein
   interaction (PPI) networks to gain insights into viral infections and
   their consequences for human health. Accordingly, we propose a generic
   model that can highlight important hub proteins of virus-associated
   disease manifestations, changes in disease-associated biological
   pathways, potential drug targets, and respective drugs. We test our
   model on infection by SARS-CoV-2, the highly transmissible virus
   responsible for the COVID-19 pandemic. The pandemic claimed a
   significant number of human lives, causing severe respiratory illnesses
   and exhibiting various symptoms ranging from fever and cough to
   gastrointestinal, cardiac, renal, neurological, and other
   manifestations.

Methods

   To investigate the underlying mechanisms of SARS-CoV-2
   infection-induced dysregulation of human pathobiology, we construct a
   two-level PPI network and employ a differential evolution-based graph
   coloring (DEGCP) algorithm to identify critical hub proteins that might
   serve as potential targets for resolving the associated issues.
   Initially, we concentrate on the direct human interactors of SARS-CoV-2
   proteins to construct the first-level PPI network and then apply the
   DEGCP algorithm to identify essential hub proteins within this network.
   We then build a second-level PPI network by incorporating the
   next-level human interactors of the first-level hub proteins and use
   the DEGCP algorithm to predict the second level of hub proteins.

Results

   We first identify the potentially crucial hub proteins associated with
   SARS-CoV-2 infection at different levels. Through comprehensive
   analysis, we then investigate, for these identified hub proteins,
   their cellular localization, interactions with other viral families,
   involvement in biological pathways and processes, functional
   attributes, gene regulation capabilities as transcription factors, and
   associations with disease-associated symptoms. Our findings highlight
   the significance of these hub proteins and their intricate connections
   with disease pathophysiology. Furthermore, we predict potential drug
   targets among the hub proteins and identify specific drugs that hold
   promise in preventing or treating SARS-CoV-2 infection and its
   consequences.

Conclusion

   Our generic model demonstrates the effectiveness of the DEGCP algorithm
   in analyzing biological PPI networks, provides valuable insights into
   disease biology, and offers a basis for developing novel therapeutic
   strategies for other viral infections that may cause future pandemics.

Supplementary Information

   The online version contains supplementary material available at
   10.1186/s12859-024-05690-0.

   Keywords: Differential evolution, Protein–protein interaction networks,
   Drug target, Generic model, Hub proteins, KEGG pathway, Respiratory
   disorder

Background

   COVID-19, a global pandemic that emerged in late 2019, is caused by the
   highly transmissible severe acute respiratory syndrome coronavirus 2
   (SARS-CoV-2). SARS-CoV-2 is a single-stranded positive-sense RNA
   (+ssRNA) virus of the Coronaviridae family and Orthocoronavirinae
   subfamily. It belongs to the Betacoronavirus genus, one of the four
   classified coronavirus genera: Alphacoronavirus, Betacoronavirus,
   Gammacoronavirus, and Deltacoronavirus [1]. Within its genomic makeup,
   SARS-CoV-2 encodes sixteen non-structural proteins (Nsp1-16), four
   structural proteins (S, E, M, and N), and some accessory proteins
   (ORF) [2]. SARS-CoV-2-infected individuals commonly experience severe
   respiratory symptoms, together with fever, cough, and loss of taste
   and smell, at the initial stage of infection. However, the impact of
   the virus extends beyond the respiratory system, resulting in
   additional manifestations such as nausea, intestinal obstruction, lung
   injury, and various complications affecting organs such as the kidney,
   gut, brain, heart, and other body parts [3, 4].

   To date, this global threat has taken almost seven million human lives
   and infected more than seven hundred million people worldwide
   (footnote 1). During the pandemic period, numerous studies underscored
   the critical role of protein–protein interactions (PPIs) between
   SARS-CoV-2 proteins and human host factors in the initiation and
   progression of the infection, leading to the development of
   pathogenesis and associated human pathophysiological responses [5–9].
   PPIs are the physical connections between two or more proteins. The
   collection of various protein–protein interactions forms a biological
   network called a protein–protein interaction network (PPIN). Many
   researchers predicted these PPIs between SARS-CoV-2 and human proteins
   to point out the underlying mechanism of SARS-CoV-2 protein-mediated
   disease propagation [5, 6, 10, 11]. These studies also predicted the
   affected pathways, target host proteins, associated disease
   progression, underlying cause of COVID-19 pathobiology, anti-viral drug
   repurposing, and preventive therapeutic measures, but each in
   isolation [6–8, 12]. The actual host cellular targets of the SARS-CoV-2
   virus, the precise pathways of virus transmission in humans, and the
   relative importance of each of those proteins and pathways are yet to
   be fully understood. Also, the long-term impacts of SARS-CoV-2
   infection in humans are still not fully explored. Furthermore, no
   generic model has yet been proposed that can reveal the underlying
   mechanism, transmission, effects, disease manifestations, and
   effective therapeutic strategies in a single framework for all viral
   infections. This suggests a pressing need to better understand viral
   infections and the consequent dysregulation of human biological
   homeostasis in a comprehensive manner, so as to offer new treatment or
   preventive strategies for COVID-19-like pandemics that may occur in
   the future. Establishing a biological network from virus-human
   protein–protein interactions, extrapolating that network to the next
   level of human–human protein–protein interactions, identifying the
   crucial hub proteins at different levels in those networks, and
   analyzing those hub proteins bioinformatically to identify associated
   disease manifestations might have the potential to fulfill this
   purpose.

   In discrete mathematics and computer science, a network is commonly
   represented as a graph. A graph consists of a set of vertices, also
   called nodes, and a set of links or connections between the nodes,
   called edges. Depending on the direction of the edges or the weights
   associated with them, a graph can be directed, undirected, or
   weighted. Likewise, based on the functionalities and characteristics of
   the network, any biological network can be mapped mathematically to a
   directed, undirected, or weighted graph. Hence, a biological
   protein–protein interaction network can be mathematically represented
   as an undirected graph in which the proteins are the vertices and the
   interactions between the proteins are the edges. Similarly, hub
   protein identification in a PPI network can be mapped to the problem
   of vertex identification in a graph. This can be done in various ways,
   most of which have been successfully applied to numerous real-life
   combinatorial optimization problems that are NP-hard or NP-complete in
   nature [13, 14]. One very promising method for identifying the
   vertices of a graph is graph coloring, which is itself a well-known
   NP-complete combinatorial optimization problem [15]. Graph coloring is
   a fundamental concept in graph theory, involving the assignment of
   colors to the vertices (nodes) of a graph so that no two neighboring
   vertices, connected by an edge, share the same color. The least number
   of colors required to generate such a coloring is known as the
   chromatic number of the graph. However, determining the chromatic
   number or finding an optimal coloring is NP-hard (the corresponding
   decision problem is NP-complete), which makes it computationally
   challenging and time-consuming [16]. Consequently, meta-heuristic
   algorithms, including evolutionary algorithms, and approximation
   methods are commonly employed to find theoretical and practical
   solutions efficiently [17].
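
   The definitions above can be illustrated with a minimal sketch. It
   assumes a toy undirected PPI graph stored as an adjacency map and uses
   a plain greedy (sequential) coloring with a largest-degree-first
   ordering. The protein names, the helper names, and the ordering
   heuristic are illustrative assumptions, not the algorithm or data used
   in this study.

```python
from typing import Dict, Hashable, Iterable, List, Set, Tuple

Graph = Dict[Hashable, Set[Hashable]]


def build_undirected_graph(edges: Iterable[Tuple[Hashable, Hashable]]) -> Graph:
    """Build an adjacency map from (protein, protein) interaction pairs."""
    graph: Graph = {}
    for u, v in edges:
        if u == v:
            continue  # skip self-interactions: a self-loop cannot be properly colored
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def greedy_coloring(graph: Graph, order: List[Hashable]) -> Dict[Hashable, int]:
    """Give each vertex, in the given order, the smallest color (1, 2, ...)
    not already used by one of its colored neighbors."""
    color: Dict[Hashable, int] = {}
    for v in order:
        taken = {color[n] for n in graph[v] if n in color}
        c = 1
        while c in taken:
            c += 1
        color[v] = c
    return color


def is_proper(graph: Graph, color: Dict[Hashable, int]) -> bool:
    """Check that no edge joins two vertices of the same color."""
    return all(color[u] != color[v] for u in graph for v in graph[u])


# Hypothetical toy network; the names are placeholders, not data from the study.
edges = [("P1", "P2"), ("P2", "P3"), ("P3", "P1"), ("P3", "P4"), ("P4", "P5")]
ppi = build_undirected_graph(edges)
# Largest-degree-first ordering, a common heuristic (ties broken by name).
order = sorted(ppi, key=lambda v: (-len(ppi[v]), v))
coloring = greedy_coloring(ppi, order)
num_colors = max(coloring.values())
```

   For this toy graph the greedy pass uses three colors, which matches
   the chromatic number because the graph contains a triangle. In
   general, a greedy coloring only gives an upper bound on the chromatic
   number, and the bound depends on the vertex ordering.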

   Graph coloring has wide applications in solving various optimization
   and allocation problems in real-world scenarios. It has been
   successfully applied in diverse fields, including computational and
   network biology [18, 19]. Specifically, in biology, graph coloring has
   proven helpful in modeling gene regulation networks, protein–protein
   interaction networks, disease-gene associations, drug target
   identification, functional annotation, disease subtyping, pathway
   analysis, and numerous other biological processes [20–24]. It is a
   straightforward and intuitive method that does not require complex
   algorithms or extensive computational resources and consumes little
   memory [22]. The approach scales to networks of varying sizes, making
   it suitable for both small-scale and large-scale protein–protein
   interaction networks: as the network size increases, the graph
   coloring algorithm can handle the increased computational complexity
   without sacrificing accuracy [22, 25]. In addition, it is a flexible
   method that can be adapted to incorporate additional information or
   constraints, such as functional annotations or experimental data, to
   enhance the accuracy of hub protein identification [26].

   In this study, we explore the application of an evolutionary graph
   coloring algorithm, namely the DEGCP algorithm, in the context of
   generic mathematical modelling of a biological protein–protein
   interaction network. The main objective is to propose a generic
   mathematical model for analysing a complex biological PPI network to
   uncover pivotal mediators associated with viral diseases and their
   effects on human pathophysiology. By utilizing DEGCP, we aim to
   analyze the complex PPI network efficiently and identify crucial hub
   proteins that play essential roles in the disease's progression and
   manifestation. This investigation holds the potential to contribute
   valuable insights into the underlying biological mechanisms of viral
   diseases and may pave the way for novel therapeutic strategies to
   combat the disease and related conditions that may pose a global
   threat in the future. The proposed model (Fig. 1) consists of
   identifying pivotal hub proteins, their interactions with various DNA
   and RNA viral families, gene ontology and pathway analyses of those hub
   proteins, identification of transcription factors, association of the
   hub proteins with various disease symptoms, and identification of
   their druggable targets and respective drugs. To validate our proposed
   model, we used experimental data associated with COVID-19, a pandemic
   that recently spread all over the world. The experimental outcomes
   suggest that the proposed model can also be applied to other viral
   infections.

Fig. 1.

   Schematic view of establishing SARS-CoV-2-human protein–protein
   interaction network and application of DEGCP algorithm to unravel the
   deeper insight of SARS-CoV-2 infection

Related work

   Over the last few decades, various studies have analyzed and predicted
   PPIs to understand different biological functions and to identify the
   crucial target proteins in protein–protein interaction networks
   [27–29]. An effective graph coloring-based integrative statistical
   algorithm has been designed for essential protein prediction [22].
   Hybrid methods combining graph coloring and artificial neural networks
   (ANN) have been proposed for finding the target proteins in infectious
   diseases [25, 26]. A graph theoretic approach was also proposed to
   find the infected pathways in viral disease [30]. PPIs between virus
   and host proteins were studied to highlight the underlying mechanism
   of SARS-CoV-2 protein-mediated disease propagation [5]. A recent study
   exhaustively examined the molecular pathways, including pathway-based
   therapeutic targets, for COVID-19 [6]. A comprehensive analysis using
   an unsupervised machine learning method highlighted COVID-19-related
   affected pathways [7]. A bioinformatics analysis used receiver
   operating characteristic (ROC) curve analysis to understand the
   molecular mechanism underlying the advancement of SARS-CoV-2
   infection [8]. Network topological analysis was performed to identify
   the potential target hub genes and affected pathways of COVID-19 [9].
   SARS-CoV-2-induced pathways and corresponding drug repurposing
   strategies were identified by artificial neural network analysis using
   the random walk with restart (RWR) method [12]. Host genomic
   biomarkers influencing SARS-CoV-2 infections were identified
   computationally using statistical R packages [31]. The underlying
   causes of COVID-19 pathobiology, together with the prediction of
   potential therapeutic targets and effective drugs, were examined
   extensively through virus-host protein interaction network studies
   [32–35]. Other authors attempted to develop combinatorial treatment
   strategies targeting both host factors and viral enzymes through
   comprehensive mapping of interactions between SARS-CoV-2 proteins and
   human proteins [36, 37]. An extensive computational investigation of
   the interactome of SARS-CoV-2 and human proteins was conducted to
   identify possible virus-affected processes and potential protein
   binding sites [10]. The common pathways and molecular biomarkers in
   COVID-19 that can cause pulmonary fibrosis and lung cancer were
   identified through PPI network analysis [11].

Motivation and contribution

   Despite extensive research, our understanding of the underlying
   mechanism and host targets of viral infections including SARS-CoV-2,
   the effects of viral infections on host biological pathways, the
   mechanism by which disease pathology develops, and its long-term
   impacts on hosts is still limited. The development of effective
   treatment options to prevent the disease and resolve associated
   manifestations is still an open challenge. Moreover, no single generic
   model exists to date that can address the issues associated with
   SARS-CoV-2 as well as other viral infections. These limitations
   motivated us to propose a new model that can deepen the understanding
   of the mechanism of such viral infections and the consequent
   pathophysiology. Understanding viral-host interactions and their
   consequences is of utmost importance in comprehending the disease's
   complex pathophysiology. Thus, in this study, we delve into the
   protein–protein interaction network of SARS-CoV-2 and human host
   proteins to identify crucial mediators that might shed light on the
   mechanisms underlying COVID-19 infection and its associated impact on
   human health. By elucidating these molecular interactions, we aim to
   contribute to the development of targeted therapeutic strategies and
   improve patient outcomes in this ongoing global health crisis, as well
   as in any other pandemic that may occur in the future. We believe our
   present research is a worthwhile effort in this context.

   Our proposed model, rooted in an evolutionary graph coloring algorithm,
   presents numerous advantages over traditional degree centrality
   criteria in discerning hub proteins within PPI networks. In contrast to
   degree centrality, which predominantly assesses the number of direct
   interactions a protein has, our model integrates the topological
   context of the entire network. It explores protein interactions
   comprehensively, encompassing both direct and indirect connections. The
   employed graph coloring methodology adheres to the principle that
   adjacent nodes (proteins) should not share the same color, signifying
   the absence of direct interactions. However, proteins with the same
   color may possess indirect connections through shared intermediates,
   indicating potential functional relationships. This approach recognizes
   that proteins involved in a pathway may contribute to the pathway’s
   activity without necessitating direct physical interactions. Degree
   centrality, reliant on the count of direct interactions, may be
   sensitive to the overall network size. In realistic situations, the
   size of a network or graph keeps changing, so such networks or graphs
   are dynamic in nature. Any changes in network structure, such
   as node addition or removal, can significantly influence degree
   centrality and alter the identification of hub proteins. In contrast,
   our model demonstrates greater robustness to structural changes,
   adapting to alterations in network connectivity through the assigned
   colors reflecting local context connectivity, and thus is well-suited
   for dynamic graphs or dynamic network structures.
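
   The observation that same-colored proteins are never adjacent but may
   still be linked through shared intermediates can be made concrete with
   a small sketch. The function name, the three-protein chain, and its
   coloring are hypothetical and serve only as an illustration; they are
   not part of the proposed model.

```python
from itertools import combinations
from typing import Dict, List, Set, Tuple


def same_color_pairs_with_shared_neighbor(
    graph: Dict[str, Set[str]], coloring: Dict[str, int]
) -> List[Tuple[str, str, List[str]]]:
    """List same-colored vertex pairs together with their common neighbors."""
    pairs = []
    for u, v in combinations(sorted(graph), 2):
        shared = graph[u] & graph[v]
        if coloring[u] == coloring[v] and shared:
            pairs.append((u, v, sorted(shared)))
    return pairs


# Hypothetical three-protein chain P1 - P2 - P3 with a proper 2-coloring.
ppi = {"P1": {"P2"}, "P2": {"P1", "P3"}, "P3": {"P2"}}
coloring = {"P1": 1, "P2": 2, "P3": 1}
pairs = same_color_pairs_with_shared_neighbor(ppi, coloring)
```

   Here P1 and P3 share a color, do not interact directly, and are both
   connected to the intermediate P2.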

   While our model does not entirely dismiss the significance of degree
   centrality, it incorporates it alongside additional weightings. The hub
   protein identification process involves two primary steps. Firstly, the
   PPI network graph undergoes chromatic coloring with a minimal number of
   colors using a combination of differential evolution and sequential
   coloring algorithms. The differential evolution algorithm optimizes the
   ordering of the nodes so as to minimize the number of colors, and the
   sequential algorithm assigns valid colors based on the degree and
   adjacency of the nodes. Subsequently, weights are assigned to human
   proteins based on color class, interactions with viral proteins, direct
   interactions with other proteins in the graph, and interactions with
   other human proteins outside the network. The incorporation of these
   diverse criteria, including but not limited to degree centrality,
   contributes to a more comprehensive hub protein identification process.
   The calculated Z-score, considering all aforementioned weightings, aids
   in establishing essential hub proteins based on a defined threshold.
   This multi-criteria approach enhances the probability of identifying
   crucial hub proteins compared to methods solely reliant on degree
   centrality criteria.
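
   The two steps described above can be sketched as follows. This is a
   simplified illustration under stated assumptions and not the DEGCP
   implementation: the ordering is searched with a generic DE/rand/1/bin
   scheme over real-valued random keys, the parameter values (`pop_size`,
   `generations`, `f`, `cr`) are arbitrary, the graph is assumed to be
   non-empty, the raw scores are taken as given inputs rather than
   derived from the weightings, and the population standard deviation is
   assumed for the Z-score.

```python
import random
import statistics
from typing import Dict, List, Set

Graph = Dict[str, Set[str]]


def greedy_coloring(graph: Graph, order: List[str]) -> Dict[str, int]:
    """Sequential coloring: smallest color not used by a colored neighbor."""
    color: Dict[str, int] = {}
    for v in order:
        taken = {color[n] for n in graph[v] if n in color}
        c = 1
        while c in taken:
            c += 1
        color[v] = c
    return color


def decode(keys: List[float], nodes: List[str]) -> List[str]:
    """Random-key decoding: visit nodes in increasing order of their key."""
    return [nodes[i] for i in sorted(range(len(nodes)), key=lambda i: keys[i])]


def de_order_search(graph: Graph, pop_size: int = 12, generations: int = 40,
                    f: float = 0.5, cr: float = 0.9, seed: int = 0) -> Dict[str, int]:
    """DE/rand/1/bin over real-valued keys; fitness = number of colors used.
    Assumes a non-empty graph and pop_size >= 4."""
    rng = random.Random(seed)
    nodes = sorted(graph)
    n = len(nodes)

    def fitness(keys: List[float]) -> int:
        return max(greedy_coloring(graph, decode(keys, nodes)).values())

    pop = [[rng.random() for _ in range(n)] for _ in range(pop_size)]
    fit = [fitness(ind) for ind in pop]
    for _ in range(generations):
        for i in range(pop_size):
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(n)  # guarantees at least one mutated gene
            trial = [
                pop[a][j] + f * (pop[b][j] - pop[c][j])
                if (rng.random() < cr or j == j_rand) else pop[i][j]
                for j in range(n)
            ]
            trial_fit = fitness(trial)
            if trial_fit <= fit[i]:  # one-to-one selection, keeps the better ordering
                pop[i], fit[i] = trial, trial_fit
    best = min(range(pop_size), key=lambda i: fit[i])
    return greedy_coloring(graph, decode(pop[best], nodes))


def select_hubs(raw_scores: Dict[str, float], theta: float) -> List[str]:
    """Keep proteins whose Z-score (raw - mean) / std is at least theta."""
    mu = statistics.mean(raw_scores.values())
    sigma = statistics.pstdev(raw_scores.values())  # population std (assumption)
    if sigma == 0:
        return []
    return sorted(p for p, x in raw_scores.items() if (x - mu) / sigma >= theta)


# Hypothetical 6-cycle and raw scores; placeholders, not data from the study.
cycle = ["A", "B", "C", "D", "E", "F"]
ppi: Graph = {v: set() for v in cycle}
for i, v in enumerate(cycle):
    w = cycle[(i + 1) % len(cycle)]
    ppi[v].add(w)
    ppi[w].add(v)
coloring = de_order_search(ppi)
hubs = select_hubs({"P1": 10, "P2": 2, "P3": 3, "P4": 1}, theta=1.5)
```

   Encoding an ordering as real-valued keys lets a continuous optimizer
   such as differential evolution search over permutations without a
   dedicated permutation operator. Whatever ordering is found, the
   sequential step always returns a valid coloring, so the search only
   affects how many colors are used.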

   Our proposed model demonstrates the applicability of graph coloring in
   computational network biology. We propose a generic mathematical model
   using graph coloring that can extract the potential human hub proteins
   associated with different viral infections. We have tested our model
   on the SARS-CoV-2 infection that occurred in the recent past. It shows
   the importance of two levels of protein–protein interaction networks
   for understanding the underlying mechanism and highlighting the
   associated mediators of SARS-CoV-2 infection-induced disease
   manifestations. Furthermore, the experimental findings highlight the
   importance of hub proteins in COVID-19. The model also conjectures
   that the biological pathways associated with the hub proteins are the
   probable underlying mechanisms of SARS-CoV-2-mediated human
   pathophysiology. The bioinformatic analysis-based validation of our
   results also identifies some essential transcription factors that
   might play an important role in altering biological signaling
   pathways. Finally, the proposed model underscores the interconnection
   between SARS-CoV-2 infection and the long-term effects of the disease
   and accordingly highlights some of the probable drug targets and
   respective drugs that might be beneficial in preventing the disease
   and resolving associated disorders. The experimental results also show
   that the proposed model can be applied to other COVID-19-like
   pandemics that may occur in the future.

Materials and methods

Mathematical formulation

   The following notations are used in the mathematical formulation of
   the proposed model.

   $V_p$: set of virus proteins.
   $H_p$: set of human proteins.
   $G = (V_G, E_G)$: an undirected graph $G$.
   $V_G$: set of vertices.
   $E_G$: set of edges whose elements are of the form $(v_i, v_j)$, where
   $v_i, v_j \in V_G$.
   $deg(v_i)$: degree of $v_i$.
   $\chi(G)$: chromatic number of a graph $G$.
   $C_i = \{C_1, C_2, C_3, \cdots, C_n\}$: set of color classes.
   $C(v_i)$: color of $v_i$.
   $V_{IC}(x)$: number of virus protein interactors of $x$.
   $TOTH_{IC}(x)$: total number of human protein interactors of $x$.
   $INTERH_{IC}(x)$: number of inter human protein interactors of $x$.
   $INTRAH_{IC}(x)$: number of intra human protein interactors of $x$.
   L-1, L-2 graphs: $G = (V_G, E_G)$, where $V_G = \{v_i : v_i \in H_p\}$
   and $E_G = \{(v_i, v_j)\}$, where $v_i, v_j \in H_p$, $i \neq j$, and
   $v_i$ is interacting with $v_j$.
   $HUB_p$: set of human hub proteins.
   $Z(x)$: Z-score of $x$.
   $X(x)$: raw score of $x$.
   $\mu$: mean.
   $\sigma$: standard deviation.

   For a given graph $G = (V_G, E_G)$, the graph coloring problem can be
   mathematically defined as follows:

   $$\min \sum_{k=1}^{m} c_k$$

   $$\text{s.t.} \quad \sum_{k=1}^{m} u_{vk} = 1 \tag{1}$$

   $$u_{vk} + u_{wk} \le c_k \quad \forall\, (v, w) \in E_G,\ k \in \{1, 2, \cdots, m\} \tag{2}$$

   $$u_{vk}, c_k \in \{0, 1\} \quad \forall\, v \in V_G,\ k \in \{1, 2, \cdots, m\} \tag{3}$$

   where $m$ is an upper bound on the chromatic number $\chi(G)$,
   $c_k = 1$ for every used color $k$, and $u_{vk} = 1$ if $C(v) = k$,
   $\forall v \in V_G$. Equation (1) assigns exactly one color to a
   vertex $v$, and Equation (2) prevents two adjacent vertices from being
   assigned the same color.
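
   As an illustration of Equations (1)–(3), the following sketch encodes
   a given color assignment as the binary variables u_vk and c_k and
   checks the constraints directly. The function names and the toy graph
   are hypothetical. The sketch only verifies feasibility and evaluates
   the objective for a candidate coloring; it does not solve the
   optimization problem.

```python
from typing import Dict, List, Tuple

U = Dict[str, Dict[int, int]]
C = Dict[int, int]


def encode(coloring: Dict[str, int], m: int) -> Tuple[U, C]:
    """Build the binary variables of the formulation from a color assignment.

    u[v][k] = 1 iff vertex v has color k; c[k] = 1 iff color k is used (k = 1..m).
    """
    u = {v: {k: int(col == k) for k in range(1, m + 1)} for v, col in coloring.items()}
    c = {k: int(any(u[v][k] for v in u)) for k in range(1, m + 1)}
    return u, c


def is_feasible(u: U, c: C, edges: List[Tuple[str, str]], m: int) -> bool:
    """Check Eq. (1) for every vertex and Eq. (2) for every edge and color."""
    ks = range(1, m + 1)
    one_color = all(sum(u[v][k] for k in ks) == 1 for v in u)                 # Eq. (1)
    no_clash = all(u[v][k] + u[w][k] <= c[k] for v, w in edges for k in ks)   # Eq. (2)
    return one_color and no_clash


def objective(c: C) -> int:
    """Number of colors used, i.e. the sum of c_k."""
    return sum(c.values())


# Hypothetical toy graph: a triangle a-b-c with a pendant vertex d.
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
m = 4  # any upper bound on the chromatic number, here the number of vertices

u_good, c_good = encode({"a": 1, "b": 2, "c": 3, "d": 1}, m)
u_bad, c_bad = encode({"a": 1, "b": 1, "c": 2, "d": 1}, m)

good_ok = is_feasible(u_good, c_good, edges, m)
bad_ok = is_feasible(u_bad, c_bad, edges, m)
colors_used = objective(c_good)
```

   The second assignment gives the adjacent vertices a and b the same
   color, so u_a1 + u_b1 = 2 exceeds c_1 = 1 and Equation (2) is
   violated.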

   The human hub proteins are extracted at different levels based on the
   color values of the vertices of the graph and the weightages given to
   different types of interactions. The first set of hub proteins, at the
   first level, is calculated as

   $$HUB_p(L\text{-}1) = \{ v_i : v_i \in H_p,\ Z(v_i) \ge \theta \} \tag{4}$$

   where

   $$Z(v_i) = \frac{X(v_i) - \mu}{\sigma} \tag{5}$$

   $$X(v_i) = X_1(v_i) + X_2(v_i) + X_3(v_i) \tag{6}$$
   \[
   X_1(v_i) =
   \begin{cases}
   w_1 & \text{if } C(v_i) \leq \frac{n}{100} \cdot \chi(G) \text{ and } V_{\mathit{IC}}(v_i) \geq \phi \\
   \frac{w_1}{2} & \text{if } C(v_i) \leq \frac{n}{100} \cdot \chi(G) \text{ and } V_{\mathit{IC}}(v_i) < \phi \\
   \frac{w_1}{4} & \text{if } C(v_i) > \frac{n}{100} \cdot \chi(G) \text{ and } V_{\mathit{IC}}(v_i) \geq \phi \\
   -w_1 & \text{if } C(v_i) > \frac{n}{100} \cdot \chi(G) \text{ and } V_{\mathit{IC}}(v_i) < \phi
   \end{cases}
   \tag{7}
   \]
   \[
   X_2(v_i) =
   \begin{cases}
   w_2 & \text{if } C(v_i) \leq \frac{n}{100} \cdot \chi(G) \text{ and } \mathit{INTRAH}_{\mathit{IC}}(v_i) \geq \psi \\
   \frac{w_2}{2} & \text{if } C(v_i) \leq \frac{n}{100} \cdot \chi(G) \text{ and } \mathit{INTRAH}_{\mathit{IC}}(v_i) < \psi \\
   \frac{w_2}{4} & \text{if } C(v_i) > \frac{n}{100} \cdot \chi(G) \text{ and } \mathit{INTRAH}_{\mathit{IC}}(v_i) \geq \psi \\
   -w_2 & \text{if } C(v_i) > \frac{n}{100} \cdot \chi(G) \text{ and } \mathit{INTRAH}_{\mathit{IC}}(v_i) < \psi
   \end{cases}
   \tag{8}
   \]
   \[
   X_3(v_i) =
   \begin{cases}
   w_3 & \text{if } C(v_i) \leq \frac{n}{100} \cdot \chi(G) \text{ and } \mathit{INTERH}_{\mathit{IC}}(v_i) \geq \eta \\
   \frac{w_3}{2} & \text{if } C(v_i) \leq \frac{n}{100} \cdot \chi(G) \text{ and } \mathit{INTERH}_{\mathit{IC}}(v_i) < \eta \\
   \frac{w_3}{4} & \text{if } C(v_i) > \frac{n}{100} \cdot \chi(G) \text{ and } \mathit{INTERH}_{\mathit{IC}}(v_i) \geq \eta \\
   -w_3 & \text{if } C(v_i) > \frac{n}{100} \cdot \chi(G) \text{ and } \mathit{INTERH}_{\mathit{IC}}(v_i) < \eta
   \end{cases}
   \tag{9}
   \]

   where $n$ is the percentage of color classes to be chosen, $X_1(v_i)$
   is the score of the node calculated from the color classes and the
   number of virus interactor proteins, $X_2(v_i)$ is the score of the
   node calculated from the color classes and the number of intra-human
   interactor proteins, $X_3(v_i)$ is the score of the node calculated
   from the color classes and the number of inter-human interactor
   proteins, and $w_1, w_2, w_3, \theta, \phi, \psi, \eta$ are the
   different weight and threshold values.
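
   The scoring in Eqs. (6)–(9) can be sketched in a few lines of Python.
   This is a minimal illustration, not the authors' implementation. It
   assumes that color classes are numbered from 1, that μ and σ are the
   population mean and standard deviation of X over all nodes, and that a
   node is kept as a hub when Z(v_i) ≥ θ (by analogy with Eq. (10)). The
   function names and the input layout are hypothetical.

```python
from statistics import mean, pstdev

def tier_score(color, count, weight, cutoff, threshold):
    """Piecewise score shared by Eqs. (7)-(9)."""
    early = color <= cutoff        # C(v_i) <= (n/100) * chi(G)
    strong = count >= threshold    # interaction count meets its threshold
    if early and strong:
        return weight
    if early:
        return weight / 2
    if strong:
        return weight / 4
    return -weight

def level1_hubs(nodes, n_percent, chi, weights, thresholds, theta):
    """nodes maps name -> (color, V_IC, INTRAH_IC, INTERH_IC)."""
    cutoff = n_percent / 100 * chi
    x = {}
    for name, (color, *counts) in nodes.items():
        # Eq. (6): X = X1 + X2 + X3
        x[name] = sum(
            tier_score(color, c, w, cutoff, t)
            for c, w, t in zip(counts, weights, thresholds)
        )
    # Assumption: population mean and standard deviation over all nodes
    mu, sigma = mean(x.values()), pstdev(x.values())
    if sigma == 0:
        return set()  # all scores are equal, so no node stands out
    return {name for name, s in x.items() if (s - mu) / sigma >= theta}
```

   With hypothetical toy inputs, a call such as
   `level1_hubs(nodes, 25, 60, (5, 1, 1), (5, 10, 25), 1)` keeps only the
   nodes whose combined score lies at least one standard deviation above
   the mean.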

   The second set of hub proteins, at the second level, is calculated as
   \[
   \mathit{HUB}_p(L-2) = \{\, v_i : v_i \in H_p,\ Z(v_i) \geq \lambda \,\} \tag{10}
   \]

   where
   \[
   Z(v_i) = \frac{X_4(v_i) - \mu}{\sigma} \tag{11}
   \]
   \[
   X_4(v_i) =
   \begin{cases}
   w_4 & \text{if } C(v_i) \leq \frac{m}{100} \cdot \chi(G) \text{ and } \mathit{INTERH}_{\mathit{IC}}(v_i) \geq \zeta \\
   \frac{w_4}{2} & \text{if } C(v_i) \leq \frac{m}{100} \cdot \chi(G) \text{ and } \mathit{INTERH}_{\mathit{IC}}(v_i) < \zeta \\
   \frac{w_4}{4} & \text{if } C(v_i) > \frac{m}{100} \cdot \chi(G) \text{ and } \mathit{INTERH}_{\mathit{IC}}(v_i) \geq \zeta \\
   -w_4 & \text{if } C(v_i) > \frac{m}{100} \cdot \chi(G) \text{ and } \mathit{INTERH}_{\mathit{IC}}(v_i) < \zeta
   \end{cases}
   \tag{12}
   \]

   where $m$ is the percentage of color classes to be considered,
   $X_4(v_i)$ is the score of the node calculated from the color classes
   and the number of inter-human interactor proteins, and
   $w_4, \lambda, \zeta$ are the different weight and threshold values.

Development of DEGCP algorithm

   The DEGCP algorithm (Algorithm 1) was developed from the existing
   Modified Discrete Differential Evolution (MDDE) algorithm [38]. Two
   modifications were made to MDDE. First, we generated the initial
   populations with some guidance toward promising vectors: the graph’s
   vertices were sorted by degree in descending order, and the positions
   of vertices of equal degree were then interchanged randomly to
   generate different populations. Second, instead of a two-point
   crossover, the ordered crossover technique was incorporated into the
   DEGCP algorithm.
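
   The two modifications can be illustrated with a short Python sketch.
   It is a hedged reading of the description above, not the published
   Algorithm 1: `guided_permutation` sorts vertices by descending degree
   and shuffles only within groups of equal degree, and
   `ordered_crossover` is one simple order-crossover variant that copies
   a slice from the first parent and fills the remaining positions left
   to right in the second parent's order. Both function names, and the
   choice of this particular variant, are assumptions.

```python
import random
from itertools import groupby

def guided_permutation(degree, rng):
    """Vertices sorted by descending degree, ties shuffled randomly."""
    ordered = sorted(degree, key=lambda v: -degree[v])
    result = []
    for _, group in groupby(ordered, key=lambda v: degree[v]):
        block = list(group)
        rng.shuffle(block)  # interchange vertices of equal degree
        result.extend(block)
    return result

def ordered_crossover(p1, p2, rng):
    """Order crossover on two permutations of the same vertex set."""
    size = len(p1)
    a, b = sorted(rng.sample(range(size + 1), 2))  # two distinct cut points
    child = [None] * size
    child[a:b] = p1[a:b]                     # keep a slice of parent 1
    kept = set(p1[a:b])
    fill = [v for v in p2 if v not in kept]  # the rest, in parent-2 order
    for i in list(range(0, a)) + list(range(b, size)):
        child[i] = fill.pop(0)
    return child
```

   Each child remains a valid permutation of the vertices, so it can be
   decoded into a coloring without a repair step.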

Algorithm 1.

   DEGCP Algorithm

Algorithm 2.

   Sequential Coloring Algorithm
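
   The body of Algorithm 2 is not reproduced in this text. A common form
   of sequential coloring, shown here only as an assumed illustration,
   visits the vertices in a given order and gives each one the smallest
   color not already used by its colored neighbors. The function names
   and the adjacency-dictionary input are hypothetical.

```python
def sequential_coloring(order, adjacency):
    """Give each vertex, in the given order, the smallest free color (1-based)."""
    color = {}
    for v in order:
        used = {color[u] for u in adjacency[v] if u in color}
        c = 1
        while c in used:
            c += 1
        color[v] = c
    return color

def is_proper(color, adjacency):
    """True if no edge joins two vertices of the same color."""
    return all(color[v] != color[u] for v in adjacency for u in adjacency[v])
```

   The number of colors used depends on the vertex order, which is why an
   evolutionary search over permutations can improve the result.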

First level human–human PPI network construction

   Thirty SARS-CoV-2 proteins (E, M, N, S, Nsp1–Nsp16, ORF10, ORF14,
   ORF3A, ORF3B, ORF6, ORF7A, ORF7B, ORF8, ORF9B, ORF9C), including the
   spike, envelope, and accessory proteins, and their direct human
   interactors were extracted from the BioGRID repository [39, 40]. The
   human interactors included in our network comprised all the
   experimentally identified physical interactors of the virus proteins
   specified in the repository. Based on this criterion, 5225 unique
   human interactor proteins were selected for the first-level network
   construction. The intra-interactions amongst these 5225 human proteins
   were extracted from the BioGRID human–human protein interactome
   repository using the same criterion of experimentally identified
   physical interactions. Accordingly, 209,945 unique interactions were
   identified amongst the 5225 human proteins; these counts were used as
   the size (number of edges) and order (number of vertices),
   respectively, of an undirected graph. This first-level PPI network was
   named PPIN1, and the undirected graph was named L-1.
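
   As a rough sketch of this construction step, the Python below builds
   an undirected graph from a list of interaction pairs restricted to a
   chosen protein set and reports its order and size. It assumes the
   interactions have already been exported as pairs of protein
   identifiers. Dropping self-interactions is an assumption made here so
   that the graph can be colored, and the function names are
   hypothetical.

```python
def build_undirected_graph(interactions, proteins):
    """Adjacency sets for interactions with both endpoints in `proteins`."""
    allowed = set(proteins)
    adjacency = {p: set() for p in allowed}
    for a, b in interactions:
        if a == b or a not in allowed or b not in allowed:
            continue  # skip self-loops and proteins outside the set
        adjacency[a].add(b)  # sets collapse duplicate records
        adjacency[b].add(a)
    return adjacency

def order_and_size(adjacency):
    """Order = number of vertices; size = number of unique undirected edges."""
    order = len(adjacency)
    size = sum(len(nbrs) for nbrs in adjacency.values()) // 2
    return order, size
```

   Because neighbors are stored in sets, the pairs (A, B) and (B, A) count
   as a single undirected edge.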

Implementation of DEGCP algorithm and degree centrality criteria on L-1 graph

   The DEGCP algorithm was used to color the L-1 (5225, 209,945) graph
   with the parameters $P = 50$, $\omega = 0.6$, $P_c = 0.75$, and
   $\mathit{Max\_Iteration} = 2000$. Consequently, a total of 60 disjoint
   color classes were identified. Then, we executed Eqs. (4)–(9) with
   different threshold values to extract the first-level potential human
   hub proteins. After several trials, we chose $n = 25$ and the other
   threshold values as $\phi = 5$, $\psi = 10$, $\eta = 25$, $w_1 = 5$,
   $w_2 = 1$, $w_3 = 1$, and $\theta = 1$ to obtain a reasonable number
   of potential human hub proteins in the first level.

Second level human–human PPI network construction

   The second-level PPI network was constructed using the second-level
   human protein interactors of the 1082 L-1 hub proteins. Accordingly, a
   total of 10,371 unique human interactors of the L-1 hub proteins were
   extracted from the BioGRID repository based on the same criterion of
   experimentally identified physical interactions. We then constructed a
   PPI network (PPIN2) and an undirected graph L-2 based on 268,738
   unique inter-interactions amongst 11,453 (10,371 + 1082) human
   proteins.

Implementation of DEGCP algorithm and degree centrality criteria on L-2 graph

   The DEGCP algorithm was similarly used to color the L-2 (11,453,
   268,738) graph with the parameters $P = 50$, $\omega = 0.6$,
   $P_c = 0.75$, and $\mathit{Max\_Iteration} = 2000$. Consequently, a
   total of 80 disjoint color classes were identified. After that, we
   executed Eqs. (10)–(12) with $m = 25$ and different threshold values
   to extract the second-level potential human hub proteins. After
   several trials, we set the threshold values $\zeta = 100$, $w_4 = 5$,
   and $\lambda = 1$ to obtain the potential human hub proteins in the
   second level.
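
   As a quick check of the color-class cutoff that appears in Eqs. (7)–(9)
   and (12), the reported settings can be substituted directly. This
   assumes that χ(G) is taken as the number of color classes returned by
   DEGCP for each graph (60 for L-1 and 80 for L-2); the subscripts on G
   are labels introduced here only for illustration.

```latex
\[
\frac{n}{100}\cdot\chi(G_{L\text{-}1}) = \frac{25}{100}\cdot 60 = 15,
\qquad
\frac{m}{100}\cdot\chi(G_{L\text{-}2}) = \frac{25}{100}\cdot 80 = 20 .
\]
```

   Under that reading, a node can receive the two highest scores only if
   it lies in the first 15 color classes of L-1 or the first 20 color
   classes of L-2.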

Gene ontology analysis of the identified human hub proteins

   Gene ontology (GO) analysis of the identified L-1 and L-2 hub proteins
   was performed using the DAVID Functional Annotation Tool [41, 42]. The
   Entrez Gene IDs of the hub proteins were uploaded, and functional
   annotation was performed using DAVID. Biological Process (BP),
   Cellular Component (CC), and Molecular Function (MF) terms were
   extracted for the respective hub proteins. A modified Fisher's exact
   p-value, known as the EASE score, was used for gene enrichment
   analysis. For stringent enrichment, the EASE score threshold was set
   to 0.01 and the minimum gene count to 5.
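
   For readers unfamiliar with the EASE score, the sketch below shows the
   idea as DAVID documents it: a one-sided Fisher's exact p-value computed
   after removing one gene from the list hits, which makes the test more
   conservative. This is an illustrative reimplementation using only the
   Python standard library, not DAVID's code, and the function names are
   hypothetical.

```python
from math import comb

def hypergeom_upper_tail(k, list_size, category_size, population):
    """P(X >= k) when list_size genes are drawn from population without replacement."""
    top = min(list_size, category_size)
    total = comb(population, list_size)
    hits = sum(
        comb(category_size, i) * comb(population - category_size, list_size - i)
        for i in range(max(k, 0), top + 1)
    )
    return hits / total

def ease_score(list_hits, list_size, category_size, population):
    """One-sided Fisher's exact p-value with one list hit removed."""
    return hypergeom_upper_tail(list_hits - 1, list_size, category_size, population)
```

   A term would then pass the filter described above when its EASE score
   is at most 0.01 and its gene count is at least 5.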

KEGG pathway analysis of the identified human hub proteins

   The involvement of hub proteins in biological pathways was identified
   through KEGG pathway analysis using the DAVID Functional Annotation
   Tool [41, 42]. The Entrez Gene IDs of the hub proteins were uploaded,
   and functional annotation was done by selecting the Pathways option.
   Only KEGG-PATHWAY was selected for the pathway analysis. As in the GO
   analysis, the EASE score threshold was set to 0.01, but the minimum
   gene count was set to 15 to identify the most crucial pathways.

Results

Implementation of the DEGCP algorithm on undirected graph instances exhibits
optimal/best-known results

   To assess the applicability of the DEGCP algorithm to biological
   protein–protein interaction networks, we first evaluated its
   performance on multiple undirected graph instances. Specifically, the
   algorithm was tested on conventional DIMACS benchmark instances, which
   were originally suggested for graph coloring problems and are widely
   used. The proposed DEGCP algorithm was run fifty times independently
   on a selection of small, medium, and large graphs of varying
   complexity. For the respective graph instances, listed with their
   edges and vertices (columns 1–3, Table 1), the results produced by the
   proposed algorithm (column 6, Table 1) matched the optimal (column 4,
   Table 1) or best-known results (column 5, Table 1). Furthermore, DEGCP
   reached the optimal or best-known result with a 100% success rate for
   17 graph instances (first 17 instances of column 8, Table 1), and the
   success rates for the other instances were within an acceptable range
   (60–80% for the 18th–20th instances of column 8, Table 1). Notably,
   the proposed algorithm produced the optimal results within an
   acceptable time limit (column 7, Table 1), which further suggests its
   efficiency and applicability to graphs of larger order and size.
   Altogether, the results demonstrate the effectiveness of the proposed
   algorithm on undirected graphs. Subsequently, we applied the DEGCP
   algorithm to the SARS-CoV-2–human protein–protein interaction network
   to gain deeper insight into viral infection and its consequences for
   human pathobiology.

Table 1.

   Results obtained by the DEGCP algorithm, with time and success rate

   DIMACS instance  Edges   Vertices  Known optimal χ(G)  Best-known χ(G)  DEGCP result  DEGCP time (s)  DEGCP success rate
   homer            1629    561       13                  13               13            0.12            50/50
   fpsol2.i.1       11,654  496       65                  65               65            0.59            50/50
   inithx.i.1       18,707  864       54                  54               54            1.64            50/50
   zeroin.i.1       4100    211       49                  49               49            0.11            50/50
   le450_25a        8260    450       25                  25               25            24.81           50/50
   le450_25b        8263    450       25                  25               25            8.28            50/50
   myciel7          2360    191       8                   8                8             0.02            50/50
   1-Insertions_6   6337    607       ?                   7                7             0.16            50/50
   2-Insertions_5   3936    597       ?                   6                6             0.12            50/50
   3-Insertions_5   9695    1406      ?                   6                6             0.66            50/50
   4-Insertions_4   1795    475       ?                   5                5             0.71            50/50
   1-FullIns_5      3247    282       ?                   6                6             0.03            50/50
   2-FullIns_5      12,201  852       ?                   7                7             0.27            50/50
   3-FullIns_5      33,751  2030      ?                   8                8             1.20            50/50
   4-FullIns_5      77,305  4146      ?                   9                9             5.44            50/50
   5-FullIns_4      11,395  1085      ?                   9                9             0.35            50/50
   wap05a           43,081  905       ?                   50               50            118.31          50/50
   school1          19,095  385       ?                   14               14            3145.21         30/50
   school1_nsh      14,612  352       ?                   14               14            2459.35         35/50
   will199GPIA      7065    701       ?                   7                7             924.26          40/50

   ?: optimal χ(G) not known; the best-known result is given instead.

PPI network establishment using human interactors of SARS-CoV-2 and
graph-coloring algorithm implementation predicts crucial first-level hub
proteins

   To identify crucial human protein facilitators through which SARS-CoV-2
   drives biological changes in infected individuals, we first constructed
   a protein–protein interaction network (PPIN). To create this PPIN,
   human interactors of thirty SARS-CoV-2 proteins were extracted from the
   BioGRID database (Fig. 2A). The PPIN was then constructed using the
   5225 human interactors of the thirty SARS-CoV-2 proteins and their
   209,945 associated interactions from the BioGRID human protein
   interactome database and visualized as a connected network (Fig. 2B).
   In this network, the human interactors of SARS-CoV-2 were drawn as
   differently colored nodes following the chromatic coloring approach,
   which uses a minimal (not necessarily optimal) number of colors so
   that adjacent nodes receive different colors. The edges represent
   direct interactions among these proteins and carry no indication of
   regulatory relationships, functional similarity, directionality of
   regulation or signaling, or upstream–downstream ordering. Thereafter,
   to find the important hub proteins among those 5225 interactors, a
   graph coloring algorithm was applied to the PPIN and the first 25% of
   color classes were extracted. Additionally, weightage for the degree
   centrality of the nodes was added to extract potential hub proteins
   based on three criteria: interaction ≥ 5 with SARS-CoV-2 proteins,
   intra-interaction ≥ 10 within the 5225 proteins, and inter-interaction
   ≥ 25 with next-level human interactors. The first criterion adds extra
   weightage for a higher interaction probability with viral proteins,
   the second adds the additional impact of the viral interaction on the
   first-level interactors, and the third adds the higher likelihood of
   an impact on infected patients. This yielded 1082 potential hub
   proteins (Additional file 2: Table: Sheet S1), which were designated
   as L-1 hub proteins and visualized as a network (Fig. 2C).

Fig. 2.

   First-level protein–protein interaction network and filtration of hub
   proteins using the graph coloring algorithm. A Occurrence of human
   interactors of individual SARS-CoV-2 proteins. B PPI network of the
   5225 total interactors, visualized using the chromatic graph coloring
   approach. C Highlighted PPI network of the 1082 L-1 hub proteins,
   visualized using the chromatic graph coloring approach. D Subcellular
   localization analysis of the 1082 L-1 hub proteins. E Interaction
   frequency of the 1082 L-1 hub proteins with 27 different virus
   families. F Categorization of the 27 interacting virus families based
   on genome content

   Evaluation of the cellular localization of the L-1 hub proteins showed
   that 21% were plasma membrane, 21% were ER membrane, and 14% were Golgi
   membrane-localized proteins, whereas 35% were cytosolic proteins
   (Fig. 2D). This suggests that most of the essential direct interactors
   of SARS-CoV-2 were either membrane or cytosolic proteins. It also
   supports their higher likelihood of interacting with SARS-CoV-2
   proteins, which indirectly validates our graph coloring and weighted
   degree centrality-based hub protein identification method. Mapping the
   L-1 hub proteins against the literature and the HVIDB database [43]
   revealed that 80.4% (870 out of 1082) of them were previously reported
   as possible interactors of viruses other than SARS-CoV-2 (Additional
   file 2: Sheet S3), which belong to 27 different viral families
   (Fig. 2E). Notably, the hub proteins interacting with viral proteins
   from the same family may or may not fall into the same category. This
   is because our protein–protein interaction network model is
   undirected and its color codes are not assigned based on functional
   similarity. As SARS-CoV-2 belongs to an RNA virus family, we
   categorized the interacting virus families by genome content: 33.33%
   of them were DNA virus families, whereas 66.67% were RNA virus
   families (Fig. 2F). It is noteworthy that the majority of the
   identified hub proteins, associated with both RNA and DNA viruses, may
   contribute to the manifestations of viral diseases. However, one-third
   of the DNA-virus family interactors may have distinctive roles in the
   context of COVID-19 responses. The complexity and distinct
   manifestations of COVID-19 compared to other viral diseases prompt the
   consideration that these hub proteins might play unique roles in the
   development of the complex disease associated with SARS-CoV-2.
   Alternatively, these hub proteins may act as common mediators for both
   DNA and RNA viral infections. This further suggests that our method
   identified essential hub proteins with a higher probability of
   interacting with RNA viruses.

PPI network construction of the second-level interactors of the first-level
hub proteins and application of the graph coloring algorithm predict the
important second-level hub proteins

   To characterize the second level of important human proteins through
   which L-1 hub proteins influence the biological alterations in infected
   individuals, a second protein–protein interaction network (PPIN2) was
   constructed. To generate PPIN2, the second-level interactors of the
   1082 L-1 hub proteins were extracted from the BioGRID database
   (Fig. [110]3A). Subsequently, PPIN2 was created using the 1082 L-1 hub
   proteins and their 268,738 unique second-level inter-interactions with
   the remaining 10,371 human interactors from the BioGRID human protein
   interactome database, and presented as a network highlighting the L-1
   proteins (Fig. [111]3B). As in the PPIN1 network, nodes were colored
   using the chromatic coloring approach, and the edges denote direct
   interactions among the nodes, without any indication of regulatory
   relationships, functional similarities, direction of regulation or
   signaling, or upstream-downstream demarcation. The chromatic coloring
   of nodes was applied using criteria similar to those for PPIN1, and the
   first 25% of color classes were extracted as the significant nodes.
   Furthermore, weightage was added to the degree centrality of the nodes
   to extract the most likely hub proteins based on the criterion
   intra-interaction ≥ 100 within a total of 11,453 interactors. This
   criterion was added to select the second-level hub proteins with a
   higher possibility of impacting infected patients. It yielded 1922
   potential hub proteins (Additional file [112]2: Sheet S4) that were
   designated as L-2 hub proteins and highlighted in the total network
   (Fig. [113]3C).
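
   The coloring-based selection described above can be illustrated as
   follows. The sketch substitutes a simple largest-degree-first greedy
   coloring for the differential evolution-based DEGCP algorithm used in
   this study, and it assumes that "the first 25% of color classes" means
   the lowest color indices, rounded up. Both choices, and all function
   names, are assumptions made for illustration.

```python
import math
from collections import defaultdict


def build_adjacency(edges):
    """Undirected adjacency sets from (protein, protein) pairs; self-loops are dropped."""
    adj = defaultdict(set)
    for a, b in edges:
        if a != b:
            adj[a].add(b)
            adj[b].add(a)
    return adj


def greedy_coloring(adj):
    """Largest-degree-first greedy coloring (a stand-in for DEGCP, not DEGCP itself)."""
    order = sorted(adj, key=lambda n: (-len(adj[n]), n))
    colors = {}
    for node in order:
        used = {colors[nb] for nb in adj[node] if nb in colors}
        color = 0
        while color in used:  # smallest color not used by a colored neighbor
            color += 1
        colors[node] = color
    return colors


def select_hubs(edges, top_fraction=0.25, min_degree=100):
    """Nodes in the first `top_fraction` of color classes with degree >= `min_degree`."""
    adj = build_adjacency(edges)
    colors = greedy_coloring(adj)
    if not colors:
        return set()
    n_classes = max(colors.values()) + 1
    keep = math.ceil(top_fraction * n_classes)  # ASSUMPTION: round up
    return {
        n for n, c in colors.items()
        if c < keep and len(adj[n]) >= min_degree
    }


# Toy network (hypothetical names); the degree threshold is lowered to fit its size.
toy_edges = [("H", "A"), ("H", "B"), ("H", "C"), ("H", "D"), ("A", "B"), ("C", "D")]
toy_colors = greedy_coloring(build_adjacency(toy_edges))
toy_hubs = select_hubs(toy_edges, top_fraction=0.25, min_degree=3)
print(toy_colors, sorted(toy_hubs))
```

   In the toy network the highest-degree node is colored first and
   therefore lands in the first color class, which is why a degree-ordered
   coloring tends to place well-connected proteins in the early classes.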

Fig. 3.

   Second-level human–human protein–protein interaction network and
   filtration of L-2 hub proteins using the graph coloring algorithm. A
   Frequency of human interactors of individual L-1 hub proteins. B PPI
   network of 1082 L-1 hub proteins and 10,371 total interactors,
   highlighting L-1 hub proteins. C Highlighted PPI network and
   visualization of 1922 L-2 hub proteins. D Subcellular localization
   analysis of 1922 L-2 hub proteins. E Interaction frequency of 1922 L-2
   hub proteins with 32 different virus families. F Categorization of the
   32 interacting virus families based on DNA or RNA genome content

   Cellular localization analysis of L-2 hub proteins showed that 44% were
   nucleus-localized proteins and 50% were cytosolic proteins, with no or
   minimal plasma membrane, ER, or Golgi membrane proteins (Fig. [116]3D).
   This suggests that the majority of second-level interactors might be
   directly or indirectly involved in gene regulation, which is also
   expected of second-level interactors. Hence, it also validates the
   potential of our method to extract the important hub proteins. Mapping
   of the L-2 hub proteins against the HVIDB database [[117]43] for
   interaction information with other viruses revealed that 59% (1134 out
   of 1922) of them were previously reported as possible interactors of
   non-SARS-CoV-2 viruses (Additional file [118]2: Sheet S6). Grouping
   these viruses by family showed that they belonged to 32 different viral
   families (Fig. [119]3E). Similar to the L-1 hub proteins, the
   identified L-2 hub proteins interacting with viral proteins from the
   same family may or may not fall into the same category because of our
   proposed protein–protein interaction networking structure.
   Additionally, categorization of these 32 virus families by genome type
   showed that 34.38% were DNA virus families, whereas 65.62% were RNA
   virus families (Fig. [120]3F). As explained earlier, the interactors of
   the DNA virus families (about one-third of the families) may have
   distinctive roles in the context of COVID-19 responses and might play
   unique roles in the development of the complex disease associated with
   SARS-CoV-2. This further suggests that our method was able to identify
   important hub proteins that have a higher probability of interacting
   with RNA viruses like SARS-CoV-2.
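
   The reported proportions can be cross-checked with simple arithmetic.
   The sketch below recomputes the HVIDB mapping percentages from the
   counts given in the text, taking the 1922 L-2 hub proteins as the L-2
   denominator, and infers how many virus families the reported DNA and
   RNA shares correspond to. The family counts are derived here from the
   percentages and are not stated in the text.

```python
def percent(part, whole):
    """Share of `part` in `whole`, expressed as a percentage."""
    return 100.0 * part / whole


def implied_count(pct, total):
    """Nearest whole number of items that `pct` percent of `total` represents."""
    return round(pct / 100.0 * total)


# Hub proteins previously reported as interactors of other viruses.
l1_share = percent(870, 1082)    # L-1 hub proteins
l2_share = percent(1134, 1922)   # L-2 hub proteins

# Family counts inferred from the reported DNA/RNA percentages.
l1_dna = implied_count(33.33, 27)
l1_rna = implied_count(66.67, 27)
l2_dna = implied_count(34.38, 32)
l2_rna = implied_count(65.62, 32)

print(f"L-1 share: {l1_share:.1f}%, L-2 share: {l2_share:.1f}%")
print(f"L-1 families: {l1_dna} DNA + {l1_rna} RNA")
print(f"L-2 families: {l2_dna} DNA + {l2_rna} RNA")
```

   In both cases the inferred DNA and RNA family counts add up to the
   stated totals of 27 and 32 families.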

Collective PPI network-graph coloring model identifies important biological
and functional consequences of SARS-CoV-2 infection

   To demonstrate the efficiency of our proposed model in identifying
   important hub proteins linked to SARS-CoV-2 infection and to establish
   their biological implications, the biological and functional
   consequences associated with L-1 and L-2 hub proteins, and their
   potential to alter human gene expression in favor of the infection,
   were explored. To this end, transcription factor database [[121]44]
   mapping and DAVID functional annotation tool-based analysis of L-1 and
   L-2 hub proteins were performed. Cellular localization analysis of L-1
   and L-2 hub proteins (Fig. [122]2D, Additional file [123]2: Sheet S2,
   Fig. [124]3D, Additional file [125]2: Sheet S5) showed that the
   majority of L-1 hub proteins were membrane or cytosolic proteins,
   whereas most of the L-2 hub proteins were nuclear and cytosolic.
   Extending those findings, transcription factor (TF) database mapping on
   this combined PPI network of L-1 and L-2 proteins demonstrated that
   none of the L-1 hub proteins were TFs, whereas 206 out of 1922 L-2 hub
   proteins were TFs (Fig. [126]4A, Additional file [127]2: Sheet S9).
   Collectively, this result substantiates the possibility that SARS-CoV-2
   proteins interact with key first-level interactors (L-1 hub proteins),
   which in turn interact with the vital second-level proteins (L-2 hub
   proteins), which have the potential to enter the nucleus and alter gene
   expression to establish the disease manifestations. Furthermore, it
   signifies the potential of our proposed model to identify the crucial
   proteins through which SARS-CoV-2 infection develops into COVID-19
   disease.
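
   The TF database mapping amounts to intersecting the hub protein list
   with a list of known transcription factors. A minimal sketch is given
   below, assuming both lists are available as gene symbols; the function
   name and the example symbols are illustrative and are not taken from
   the study's supplementary sheets.

```python
def map_transcription_factors(hub_proteins, tf_symbols):
    """Return (sorted TF hits, number of hits, number of distinct hub proteins).

    Symbols are compared case-insensitively after trimming whitespace.
    """
    hubs = {s.strip().upper() for s in hub_proteins if s.strip()}
    tfs = {s.strip().upper() for s in tf_symbols if s.strip()}
    hits = hubs & tfs
    return sorted(hits), len(hits), len(hubs)


# Illustrative inputs only (hypothetical hub list and TF list).
example_hubs = ["TP53", "stat3 ", "ACTB"]
example_tfs = ["TP53", "STAT3", "MYC"]
hits, n_hits, n_hubs = map_transcription_factors(example_hubs, example_tfs)
print(f"{n_hits} of {n_hubs} hub proteins are listed as TFs: {hits}")
```

   Applied separately to the L-1 and L-2 hub lists, such a mapping would
   give the per-level TF counts reported above.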

Fig. 4.

   Biological and functional consequences of the L-1 and L-2 hub proteins
   identified by the PPI network-graph coloring model. A Second-level PPI
   network with L-1 and L-2 hub proteins highlighted, and transcription
   factors (TF) identified among those hub proteins after TF database
   mapping. B The 50 biological processes associated with L-1 and L-2
   proteins and the gene count in each process. C KEGG pathway enrichment
   analysis of L-1 and L-2 hub proteins, showing the 55 represented
   pathways and the gene count in each pathway

   To further investigate the overall involvement of L-1 and L-2 hub
   proteins in the disease manifestation of COVID-19, DAVID functional
   analysis of their involvement in biological processes was performed
   (Fig. [130]4B). The result revealed that a good number of the L-1 and
   L-2 hub proteins identified by our proposed model were associated with
   important biological processes such as positive and negative regulation
   of transcription through RNA-Pol-II or the DNA template, regulation of
   mRNA splicing and processing, and protein translation, which might be
   the ways in which host gene expression is altered in favor of
   SARS-CoV-2 infection and disease manifestation. Also, a good number of
   proteins were involved in the regulation of apoptosis, cell division
   and the cell cycle, cell proliferation and cell migration, and
   chromatin organization, which might be the ways in which SARS-CoV-2
   alters host cell fate towards death to induce injury in different
   organs, mainly at the first site of infection, the lung. Some of the
   identified L-1 and L-2 proteins were engaged in protein
   phosphorylation, protein transport to different organellar locations,
   and degradation through the ubiquitin pathway, which might be the ways
   through which SARS-CoV-2 alters the hosts’ cellular signaling to create
   favorable conditions for itself. Our result suggested that another
   group of proteins was involved in endocytosis and, more importantly, in
   the response to hypoxic conditions, which might be important for
   supporting viral survival and activity in a low-oxygen environment, a
   common manifestation in SARS-CoV-2-infected hosts (Fig. [131]4B,
   Additional file [132]2: Sheet S7).

   Furthermore, to delineate the involvement of L-1 and L-2 hub proteins
   in the biological pathways through which SARS-CoV-2 alters the
   homeostasis of host cellular pathways, KEGG pathway enrichment analysis
   was performed (Fig. [133]4C). The result showed that a large number of
   proteins were involved in important cellular signaling pathways such as
   the MAPK, cAMP, cGMP-PKG, FoxO, and ErbB signaling pathways, which
   might be responsible for altering host cellular functions to support
   infection and disease progression (Fig. [134]4C). A good number of hub
   proteins were involved in the HIF-1 signaling pathway, which alters
   pathways related to hypoxic conditions, suggesting that through this
   alteration SARS-CoV-2 might support its progression under low-oxygen
   conditions. Likewise, proteins involved in the p53 signaling pathway
   were possible mediators through which the virus changes host cell death
   or cell proliferation in the infected organs. Similarly, a large number
   of hub proteins were involved in different cancer-related signaling
   pathways, including lung cancer, suggesting that they are responsible
   for changing a large number of cellular pathways in favor of infection
   establishment and disease manifestations (Fig. [135]4C, Additional file
   [136]2: Sheet S8).
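
   Pathway enrichment of this kind is commonly assessed with an
   over-representation test. The sketch below computes a plain one-sided
   hypergeometric p-value for a single pathway; it is a generic
   illustration with hypothetical parameter names, not the procedure used
   here. DAVID itself reports a modified Fisher's exact statistic (the
   EASE score), so values from this sketch would not match DAVID's output
   exactly.

```python
from math import comb


def enrichment_p_value(hits, list_size, pathway_size, background_size):
    """One-sided hypergeometric p-value P(X >= hits).

    X is the number of pathway genes found when `list_size` genes are drawn
    without replacement from a background of `background_size` genes, of
    which `pathway_size` belong to the pathway.
    """
    upper = min(list_size, pathway_size)
    total = comb(background_size, list_size)
    tail = sum(
        comb(pathway_size, k) * comb(background_size - pathway_size, list_size - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


# Toy numbers: 10 background genes, 4 in the pathway, 3 in the hub list, 2 overlapping.
p_toy = enrichment_p_value(hits=2, list_size=3, pathway_size=4, background_size=10)
print(f"p = {p_toy:.4f}")
```

   For the toy numbers the tail sum is (6 x 6 + 4 x 1) / 120 = 1/3. In a
   real analysis the p-values of all tested pathways would additionally
   need correction for multiple testing.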

The proposed model underscores the connections between SARS-CoV-2 infection
and its pathophysiology

   To investigate the biological connection between SARS-CoV-2 infection
   and the development of its pathophysiology, the association of the L-1
   and L-2 hub proteins with NIH-defined COVID-19 manifestations was
   evaluated. To this end, well-established COVID-19 symptoms and
   pathophysiology as per the NIH report were compiled from a search of
   the published literature [[137]45, [138]46]. Four major symptom
   categories were identified as the most established manifestations of
   COVID-19: respiratory, cardiovascular, hematologic, and
   neuropsychiatric disorders. Based on the literature search (References