Abstract
Background
Hodgkin Lymphoma (HL) is an aggressive type of lymphoma that has a high
incidence in young adults and elderly patients. Identification of
reliable diagnostic markers and efficient therapeutic targets is
especially important for the diagnosis and treatment of HL. Although
many HL-related molecules have been identified, our understanding of
the molecular mechanisms underlying the disease is still far from
complete due to its complex and heterogeneous characteristics. In this
situation, exploring the molecular mechanisms underlying HL via systems
biology approaches is a promising option. In this study, we try to
elucidate the molecular mechanisms related to the disease and identify
potential pharmaceutical targets from a network-based perspective.
Results
We constructed a series of network models and, by analyzing them,
attempted to identify biomarkers and elucidate the molecular mechanisms
underlying HL. Initially, we built three different but related protein
networks, i.e., the background network, the HL-basic network and the
HL-specific network. By analyzing these three networks, we investigated
the connection characteristics of the HL-related proteins.
Subsequently, we explored the miRNA regulation of the HL-specific
network and analyzed three kinds of simple regulation patterns, i.e.,
co-regulation of protein pairs, as well as the direct and indirect
regulation of protein triples. Finally, we constructed a simplified
protein network combined with the regulation of proteins by miRNAs to
better understand the relation between HL-related proteins and miRNAs.
Conclusions
We find that the HL-related proteins are more likely to connect with
each other than other proteins are. Moreover, the HL-specific network
can be further divided into five sub-networks, and 49 proteins that
form the backbone of the HL-specific network make up and connect these
five sub-networks. Thus, they may be closely associated with HL. In
addition, we find that the co-regulation of protein pairs is the main
pattern by which miRNAs regulate the HL-specific protein network. Based
on the regulation of the protein network by miRNAs, we have identified
5 core miRNAs as potential biomarkers for the diagnosis of HL. Finally,
several protein pathways have been identified as closely associated
with HL, which provides deeper insight into the underlying mechanism of
HL.
Electronic supplementary material
The online version of this article (10.1186/s12859-019-3041-9) contains
supplementary material, which is available to authorized users.
Keywords: Protein interaction network, Network analysis, miRNA
regulation, Hodgkin Lymphoma
Background
Cancer is thought to be a complex and highly heterogeneous disease that
involves multiple causes and factors. Moreover, cancer is associated
with the alteration of molecular interactions rather than the
abnormality of a single gene [[33]1]. In particular, dysregulation of
multiple pathways governing fundamental cell processes contributes to
cancer development and progression. These characteristics call for
systems biology approaches, specifically network-based approaches, to
study the underlying mechanism of cancer [[34]2]. As protein-protein
interactions (PPIs) form the basis of cellular processes, the
dysfunction of some interactions causes many diseases, including cancer
[[35]3]. Thus, the construction and analysis of PPI networks can not
only provide a global view of biological events, but also decipher the
molecular basis of cancer from the perspective of network dynamics
[[36]4]. In addition, systematic analysis of PPI networks provides a
wealth of information that may be useful for identifying therapeutic
targets [[37]5, [38]6] and potential biomarkers for the diagnosis and
prognosis of cancer [[39]7, [40]8].
As an important class of post-transcriptional regulators, microRNAs
(miRNAs) can regulate many crucial cellular processes, such as
differentiation, growth, proliferation, and apoptosis. Abnormal miRNA
expression also leads to various diseases, especially cancer. It is
well known that miRNAs play a crucial role in the formation and
development of cancer by functioning as tumor suppressors or oncogenes
[[41]9]. Moreover, miRNAs have also been considered as important
molecules for cancer diagnosis [[42]10] and as therapeutic targets
[[43]11, [44]12].
miRNAs can negatively modulate target genes and consequently perform
fine-scale adjustment of protein output by influencing the stability of
the encoding mRNAs [[45]13]. In addition, miRNAs can also regulate
functionally related proteins and exert specific effects on the
formation of protein complexes [[46]14, [47]15] and biological pathways
[[48]16]. Therefore, to understand the function of miRNAs and their
role in diseases more clearly, miRNA biology should be investigated in
the context of protein interaction networks rather than isolated target
genes [[49]17].
Although it is still not fully understood how miRNAs regulate protein
interaction networks, some characteristics of miRNA-mediated protein
interaction networks have been investigated by integrating information
about miRNA targets with protein interaction data [[50]18, [51]19]. For
instance, a statistical analysis was conducted to compare the
topological characteristics of miRNA-mediated proteins with those of
proteins randomly selected from the protein interaction network. The
results demonstrated that miRNA-mediated proteins tend to interact more
frequently with other proteins. Moreover, proteins mediated by the same
miRNA have a high tendency to interact with each other. These
characteristics imply that miRNAs might exert their regulatory effects
on protein complexes and pathways through the protein interaction
network. Therefore, by analyzing the miRNA-mediated protein interaction
network, we can not only understand the function of miRNAs more
comprehensively [[52]20], but also identify the miRNAs associated with
diseases more accurately [[53]21, [54]22].
Hodgkin Lymphoma (HL) is a tumor arising from the lymphatic system and
its hallmark is the emergence of Hodgkin and Reed-Sternberg cells
[[55]23]. Although the exact cause of HL has not yet been clarified,
some risk factors have been considered to be related to the occurrence
of HL. Because HL is an aggressive malignancy that can quickly spread
through the body, identification of reliable diagnostic markers and
efficient therapeutic targets is especially important for the diagnosis
and treatment of HL. Using high-throughput techniques, many HL-related
molecules have been identified, such as proteins uniquely expressed in
HL-derived cell lines [[56]24] and miRNAs differentially expressed
between normal individuals and patients with HL [[57]25], which makes
it feasible to construct a specific network for HL. The analysis of
such a network can provide valuable insight into the underlying
mechanism of HL and help identify key proteins and miRNAs for HL. For
example, a regulatory network consisting of genes, miRNAs and
transcription factors was constructed from the available data, and
several important pathways in HL were identified based on the resulting
regulatory network [[58]26]. However, that study focused only on the
regulation of isolated target genes and transcription factors by
miRNAs. The protein interaction network specific to HL, and the miRNA
regulation of that network, remain unclear.
In this study, we first manually collected the HL-associated proteins
and miRNAs. Subsequently, we extracted the experimentally verified
protein-protein interactions from five protein interaction databases.
Based on the collected data, we constructed a protein interaction
network specific to HL using a three-step strategy. By analyzing this
network, we identified the core proteins that are crucial for
maintaining the network structure. These proteins can be considered as
candidate diagnostic and therapeutic markers for HL. Finally, we
obtained experimentally validated miRNA-target interactions from
miRWalk and miRTarBase. By integrating the HL-specific protein network
with the miRNA-target interactions, we investigated the miRNA
regulation of the HL-specific protein network. On the basis of this
network-level analysis, we obtained a comprehensive insight into the
role that HL-associated proteins and miRNAs play in the pathogenesis of
HL. These results provide valuable information for studying the
mechanism and treatment of HL.
Results
Analysis of three related PPI networks
PPI background network
In order to provide a network-level view of the HL-specific proteins,
we constructed a background network that includes as many proteins as
possible. The constructed background network has 17,076 proteins as
nodes and 146,295 protein interactions as edges. Subsequently, we
calculated the degree distribution of the background network (shown in
Additional file [59]1: Figure S1). As displayed in the figure, the
degree distribution clearly follows a power law, which indicates that
the background network is a typical scale-free network [[60]27]. This
result is also in agreement with a previous study [[61]28].
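The degree distribution described above can be illustrated with a small sketch. It is not the code used in this study: the edge list and protein names are hypothetical, and the log-log least-squares slope is only a crude check for power-law decay, assumed here for illustration.

```python
from collections import Counter
import math


def degree_distribution(edges):
    """Return {k: P(k)} for an undirected edge list.

    Self-loops are skipped and duplicate edges are counted once.
    """
    neighbors = {}
    for u, v in edges:
        if u == v:
            continue
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    counts = Counter(len(nbrs) for nbrs in neighbors.values())
    n = len(neighbors)
    return {k: c / n for k, c in sorted(counts.items())}


def loglog_slope(dist):
    """Least-squares slope of log P(k) against log k (crude power-law check)."""
    xs = [math.log(k) for k in dist]
    ys = [math.log(p) for p in dist.values()]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx


# Toy network with hypothetical protein names.
toy_edges = [("A", "B"), ("A", "C"), ("A", "D"), ("A", "E"), ("B", "C")]
dist = degree_distribution(toy_edges)
slope = loglog_slope(dist)
```

For a scale-free network the slope is negative; a dedicated power-law fitting procedure would be preferable to this slope for any real analysis.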
The power-law decay of the degree distribution implies that the
background network contains hub proteins that interact with many other
proteins. In this study, we identified the hub proteins by calculating
the relative connectivity of subgraphs [[62]29]. According to a
previous study [[63]30], the links between hub proteins in a network
are systematically suppressed. Therefore, subgraphs consisting only of
hub proteins have a smaller relative connectivity than subgraphs that
also contain non-hub proteins. Because it takes this unique topological
property of hub proteins into account, this identification method
should be more precise than simply applying a degree threshold.
The relative connectivity of subgraphs was computed as a function of
the number of nodes and is shown in Fig. [64]1. The relative
connectivity decreases continually while the number of nodes is less
than 20 and then fluctuates as more nodes are added. When the number of
nodes exceeds 132, the relative connectivity becomes stable and reaches
that of the entire network. Therefore, we define the top 132 proteins
in the degree ranking as the hub proteins of the background network.
The Uniprot ID and name of each hub protein are listed in Additional
file [65]1: Table S1.
Fig. 1.
Relative subgraph connectivity as a function of the number of nodes in
the background network. The panel in this figure shows the change of
relative connectivity in the node range between 1 and 200
The degree distribution of HL-specific proteins in the background
network is shown in Fig. [67]2. From this distribution we find that 85%
of HL-specific proteins have a degree of less than 100. According to
the definition of hub proteins in the background network, only 10
HL-specific proteins are hub proteins. Based on the
guilt-by-association principle, we assume that the HL-specific proteins
may be closely connected in the background network, with the 10 hub
proteins playing an important role in connecting the other HL-specific
proteins. Therefore, we extracted from the background network a small
network consisting only of HL-specific proteins. This small network is
referred to as the HL-basic network.
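Extracting a network that consists only of the proteins of interest amounts to taking an induced subgraph of the background network. The sketch below illustrates this under assumed inputs; the function name, edge list and protein identifiers are hypothetical and are not taken from this study.

```python
def induced_subnetwork(background_edges, proteins):
    """Keep only interactions whose two partners are both in `proteins`.

    Returns the retained edges (as sorted tuples) and the proteins that
    end up without any interacting partner in the sub-network.
    """
    keep = set(proteins)
    sub_edges = {
        tuple(sorted((u, v)))
        for u, v in background_edges
        if u in keep and v in keep and u != v
    }
    connected = {p for edge in sub_edges for p in edge}
    isolated = keep - connected
    return sub_edges, isolated


# Toy background network with hypothetical identifiers.
background = [("P1", "P2"), ("P2", "P3"), ("P3", "P4"), ("P4", "P5")]
disease_proteins = ["P1", "P2", "P4"]
sub_edges, isolated = induced_subnetwork(background, disease_proteins)
```

In this toy case P4 is isolated because its only partners, P3 and P5, are not in the protein list.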
Fig. 2.
Degree distribution of HL-specific proteins in the background network.
The black line in the figure marks the lowest degree among the hub
proteins in the background network. HL-specific proteins whose degree
exceeds this value are regarded as hub proteins in the background
network. Ten HL-specific proteins are hub proteins in the background
network
HL-basic PPI network
In the HL-basic network, there are only 144 nodes and 180 edges. The
nodes represent the HL-specific proteins and each edge is an
interaction between two HL-specific proteins. The HL-basic network is
displayed in Fig. [69]3. Based on the connections between nodes, the
144 nodes fall into two distinct groups. In one group, 84 of the 144
nodes are connected to form a sub-network, which includes 9 of the hub
proteins of the background network. The nodes in the other group have
no interacting partners in the HL-basic network. Moreover, according to
the maximum modularity score, the sub-network can be further divided
into eight modules, and the 9 hub proteins are distributed among
different modules, which are displayed in different colors in
Fig. [70]3.
Fig. 3.
The sub-network consisting of 84 HL-specific proteins. The sub-network
can be divided into eight modules, whose nodes are shown in different
colors. The 9 hub proteins are located in different modules and are
shown as bigger circles. Their Uniprot IDs are also shown in the
corresponding colors
The clustering coefficient is a measure of node aggregation in a
network. We calculated the global clustering coefficient of the
sub-network to evaluate how closely the HL-specific proteins are
connected. The global clustering coefficient is 0.17. To test whether
the HL-associated proteins are more closely connected than expected, we
generated 10,000 random networks with the same number of nodes as the
sub-network. We then calculated the global clustering coefficients of
the random networks and compared them with that of the HL-basic
network. The comparison is shown in Fig. [72]4. The global clustering
coefficient of the HL-basic network lies within the range of those of
the 10,000 random networks. This result indicates that the
HL-associated proteins are not more densely connected than randomly
selected proteins. Given the local hypothesis that proteins involved in
the same disease tend to interact with each other [[73]1], this implies
that the list of HL-specific proteins collected in this study is not
entirely comprehensive. The 60 isolated nodes in the HL-basic network
support this conclusion.
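The comparison with random networks can be sketched as follows. This is an illustration rather than the procedure used in this study: the global clustering coefficient is computed as the transitivity (closed triples over all connected triples), the random networks are produced by degree-preserving double edge swaps (an assumption suggested by the Fig. 4 caption), and the result is summarized with a simple empirical p-value. All names and the toy network are hypothetical.

```python
import random
from itertools import combinations


def build_adjacency(edges):
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def global_clustering(edges):
    """Transitivity: closed triples / connected triples (0.0 if none)."""
    adj = build_adjacency(edges)
    closed = 0   # each triangle is counted once per centre node, i.e. 3 times
    triples = 0  # paths of length two, counted by their centre node
    for nbrs in adj.values():
        k = len(nbrs)
        triples += k * (k - 1) // 2
        closed += sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return closed / triples if triples else 0.0


def degree_preserving_shuffle(edges, n_swaps, rng):
    """Randomise a network by double edge swaps; node degrees are unchanged."""
    edge_list = [tuple(e) for e in edges]
    edge_set = {frozenset(e) for e in edge_list}
    done = attempts = 0
    while done < n_swaps and attempts < 100 * n_swaps:
        attempts += 1
        i, j = rng.sample(range(len(edge_list)), 2)
        (a, b), (c, d) = edge_list[i], edge_list[j]
        if rng.random() < 0.5:
            c, d = d, c
        # Proposed rewiring: a-b, c-d  ->  a-d, c-b
        if len({a, b, c, d}) < 4:
            continue
        if frozenset((a, d)) in edge_set or frozenset((c, b)) in edge_set:
            continue
        edge_set -= {frozenset((a, b)), frozenset((c, d))}
        edge_set |= {frozenset((a, d)), frozenset((c, b))}
        edge_list[i], edge_list[j] = (a, d), (c, b)
        done += 1
    return edge_list


def empirical_p_value(edges, n_random=1000, swaps_per_edge=10, seed=0):
    """Share of random networks whose clustering is at least the observed one."""
    rng = random.Random(seed)
    observed = global_clustering(edges)
    hits = 0
    for _ in range(n_random):
        shuffled = degree_preserving_shuffle(edges, swaps_per_edge * len(edges), rng)
        if global_clustering(shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_random + 1)


# Toy network: two triangles joined by one edge.
toy_edges = [(1, 2), (2, 3), (1, 3), (3, 4), (4, 5), (5, 6), (4, 6)]
observed, p_value = empirical_p_value(toy_edges, n_random=50)
```

A large empirical p-value corresponds to the situation described above, where the observed coefficient falls within the range of the random networks.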
Fig. 4.
Comparison of the global clustering coefficients of the HL-basic
network and the HL-expanded network with those of their corresponding
random networks. The box plot displays the distribution of the global
clustering coefficients of 10,000 random networks that have the same
number of nodes, with equal degrees, as the HL-basic network and the
HL-expanded network, respectively. The black rectangle represents the
global clustering coefficient of the HL-basic network or the
HL-expanded network
HL-expanded network
Based on the above results, we consider the HL-basic network to be
incomplete. To construct a more comprehensive HL-related network, we
took the 144 HL-specific proteins as seed proteins and selected the
neighbors directly connected to them in the background network. The
newly selected proteins and the interactions involving them were
integrated to build a network called the HL-expanded network. The
resulting network comprises 541 nodes and 5057 connections.
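The seed-and-neighbor expansion can be sketched as below. The identifiers and the function name are hypothetical, and one detail is an assumption: the sketch keeps every background interaction among the selected nodes, including interactions between two neighbors, which the text does not state explicitly.

```python
def expand_network(background_edges, seeds):
    """Add the direct neighbours of the seed proteins to the seed set.

    Assumption: all background interactions among the resulting nodes are
    kept, including neighbour-neighbour interactions.
    """
    seeds = set(seeds)
    nodes = set(seeds)
    for u, v in background_edges:
        if u in seeds:
            nodes.add(v)
        if v in seeds:
            nodes.add(u)
    edges = {
        tuple(sorted((u, v)))
        for u, v in background_edges
        if u in nodes and v in nodes and u != v
    }
    return nodes, edges


# Toy background network: S1 is a seed, N1/N2 its neighbours, X1/X2 unrelated.
background = [("S1", "N1"), ("S1", "N2"), ("N1", "N2"), ("N2", "X1"), ("X1", "X2")]
nodes, edges = expand_network(background, ["S1"])
```

Here X1 is excluded because it is two steps away from the seed, so only first-order neighbors enter the expanded network.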
Compared with the HL-basic network, the HL-expanded network contains
more hub proteins: a total of 61 of the hub proteins identified in the
background network. These hub proteins cause the nodes in the
HL-expanded network to connect densely to each other. As before, we
generated 10,000 random networks whose nodes have the same degree
distribution as those of the HL-expanded network and compared the
global clustering coefficient of the HL-expanded network with those of
the random networks. The global clustering coefficient of the
HL-expanded network is 0.135, which is higher than the average value of
the 10,000 random networks (0.124), as shown in Fig. [75]4. A
statistical analysis using the Kolmogorov-Smirnov test
(p-value = 2.2 × 10^−16) also supports the observation that the global
clustering coefficient of the HL-expanded network differs significantly
from those of the random networks. This indicates that, as expected,
the HL-specific proteins are densely connected.
In addition, as components of the background network, the nodes in the
HL-expanded network also connect with nodes outside the HL-expanded
network. To evaluate the extent of connection between the nodes inside
and outside the HL-expanded network, we calculated a Z-score based on
the degree values in the HL-expanded network and in the background
network. If the Z-score of a node is larger than 0, the node has more
interactions with nodes within the HL-expanded network. Otherwise, the
node is more connected with nodes of the background network outside the
HL-expanded network.
Figure [76]5 displays the Z-scores of all nodes in the HL-expanded
network against their degree values. The figure shows that the Z-scores
are broadly correlated with degree. Moreover, the Z-scores of all nodes
in the HL-expanded network are larger than 0, meaning that all nodes in
the HL-expanded network tend to connect with intra-network nodes and
form a network that is relatively isolated from the background network.
Fig. 5.
Z-score distribution of 541 nodes in HL-expanded network along with
their degree values
In summary, the HL-expanded network is a relatively compact network in
which the 144 HL-associated proteins are tightly linked together. Based
on the local hypothesis that proteins involved in the same disease tend
to interact with each other, the HL-expanded network can be considered
as the HL-specific network, and all proteins in this network are
regarded as related to HL.
The above results show that the HL-expanded network has a higher
clustering coefficient than the random networks, which suggests that
the HL-expanded network may be a small-world network. Hence we adopted
the S^△ index proposed by Humphries and Gurney [[78]31] to quantify the
small-worldness of this network. The calculated S^△ value is 4.55,
which is greater than 1 and means that the HL-expanded network is a
small-world network. Because small-world networks tend to contain
cliques, we further performed a clustering analysis of the HL-expanded
network.
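For reference, the small-worldness index of Humphries and Gurney is commonly written as below. The notation is ours and is not taken from this study: C^△ denotes the global clustering coefficient (transitivity) and L the mean shortest path length of the network, while the subscript "rand" marks the corresponding values of a random graph with the same number of nodes and edges.

```latex
S^{\triangle} \;=\; \frac{C^{\triangle} / C^{\triangle}_{\mathrm{rand}}}{L / L_{\mathrm{rand}}},
\qquad
C^{\triangle} \;=\; \frac{3 \times \text{number of triangles}}{\text{number of connected triples}}
```

A network is regarded as small-world when S^△ > 1, that is, when it is much more clustered than the random graph while having a comparable mean path length.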
The results of the cluster analysis show that the HL-expanded network
can be divided into five sub-networks, in which the nodes have a high
tendency to connect with each other. The HL-expanded network and its
sub-networks are shown in Fig. [79]6. Subsequently, we conducted
functional enrichment analysis and KEGG pathway analysis on each of the
five sub-networks.
Fig. 6.
The HL-expanded network and its constituent sub-networks. Based on the
dense connections of nodes, the HL-expanded network is divided into 5
sub-networks. The nodes within each sub-network are shown in a
different color. The larger nodes are the hub proteins in each
sub-network and are labeled with their Uniprot IDs
The enrichment results are listed in Additional file [82]1: Table S2.
The table shows that the five sub-networks are involved in different
functions and pathways. For example, the proteins in sub-network 2
mainly participate in the process of cell-cell adhesion, which may be
related to the migration of lymphoma cells. The proteins in this
sub-network are also involved in the Epstein-Barr virus infection
pathway, and Epstein-Barr virus infection has been confirmed to be an
important cause of HL [[83]32]. In addition, the proteins in
sub-network 3 are associated with kinase activity and signaling
pathways, particularly the NF-kappa B signaling pathway. Aberrant
NF-kappa B activity has been recognized as a critical pathogenic factor
in lymphoma [[84]33]. Moreover, the pathway enrichment results also
show that the proteins in sub-network 4 participate in human
T-lymphotropic virus I (HTLV-I) infection and colorectal cancer. This
is consistent with the fact that HTLV-I infection is the cause of adult
T-cell lymphoma [[85]34] and that colorectal cancer is a common
secondary cancer in HL survivors [[86]35]. These results support the
rationality of the constructed HL-expanded network.
Because the clusters in a network are generally formed around
high-connectivity hub proteins [[87]36], we can further simplify the
network to the connections among hub proteins. By extracting the hub
proteins of the five sub-networks and their mutual connections, we
built the simplest form of the HL-expanded network. This simplified
network consists of the 49 nodes shown in Fig. [88]6, which directly
connect to 470 of the remaining 492 nodes in the HL-expanded network
and play an important role in maintaining its structure. Therefore,
these nodes can be considered to make up the backbone of the
HL-expanded network, and the proteins they represent are considered the
key proteins for HL. The Uniprot IDs of the 49 proteins, together with
their names and possible functions in HL, are listed in Additional file
[89]1: Table S3.
Among the 49 key proteins, 18 are manually collected HL-related
proteins, and 4 others, [90]P54529, [91]P04637, [92]Q13287 and
[93]P12931, have also been shown to be related to the development of
HL. This result supports the identification of key proteins based on
the network context. The remaining 27 proteins are candidates that can
be studied further by experimental methods. All 49 key proteins can
also be regarded as potential targets for the treatment of HL.
Prediction of miRNA targets
In addition to the related proteins, many studies have confirmed that
miRNAs are closely associated with HL. Some specific miRNAs can be used
to differentiate HL lymph nodes from reactive lymph nodes and HRS cells
from germinal center B cells [[94]37]. They are also used to track
treatment response in HL [[95]38]. However, it is not completely clear
how miRNAs participate in the development of HL and regulate the
interactions between HL-specific proteins. Hence, we obtained the
regulatory relationships between miRNAs and HL-specific proteins from
two miRNA target databases and analyzed the regulation of the protein
interaction network by miRNAs. In this study, we extracted a total of
14,614 and 14,693 experimentally validated miRNA-target interactions
from miRWalk and miRTarBase, respectively. The intersection of the two
datasets was retained for further analysis.
Based on the experimentally validated miRNA-target data, we constructed
an HL-specific miRNA-protein network (shown in Fig. [96]7), which
contains 497 HL-specific proteins, 1628 miRNAs and 14,299 miRNA-protein
interactions. Although 40 HL-specific proteins in this network are
regulated by only one miRNA and 152 miRNAs modulate only one protein,
most miRNAs and proteins have many-to-many regulatory relations. Among
the 1628 miRNAs, 20 miRNAs can directly regulate approximately 80% of
the proteins in this network. These 20 miRNAs can therefore be
considered key miRNAs for HL, and 3 of them are included in the
previously identified HL-related miRNAs.
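The steps above (intersecting two target datasets, restricting them to the network proteins and ranking miRNAs by the share of proteins they target) can be sketched as follows. The records, miRNA names and function names are hypothetical placeholders and do not reproduce the miRWalk or miRTarBase formats.

```python
def shared_interactions(dataset_a, dataset_b):
    """Keep only (miRNA, protein) pairs present in both datasets."""
    return set(dataset_a) & set(dataset_b)


def targets_by_mirna(interactions, network_proteins):
    """Map each miRNA to the network proteins it targets."""
    network_proteins = set(network_proteins)
    targets = {}
    for mirna, protein in interactions:
        if protein in network_proteins:
            targets.setdefault(mirna, set()).add(protein)
    return targets


def rank_by_coverage(targets, network_proteins):
    """Sort miRNAs by the fraction of network proteins they target."""
    total = len(set(network_proteins))
    ranked = [(mirna, len(prots) / total) for mirna, prots in targets.items()]
    return sorted(ranked, key=lambda item: (-item[1], item[0]))


# Hypothetical records; real databases use their own identifiers and columns.
db_a = [("miR-x", "P1"), ("miR-x", "P2"), ("miR-y", "P2"), ("miR-y", "P9")]
db_b = [("miR-x", "P1"), ("miR-x", "P2"), ("miR-y", "P2"), ("miR-z", "P3")]
proteins = ["P1", "P2", "P3", "P4"]

common = shared_interactions(db_a, db_b)
targets = targets_by_mirna(common, proteins)
ranking = rank_by_coverage(targets, proteins)
```

Ranking by individual coverage is only one way to pick candidate key miRNAs; the combined coverage of a set of miRNAs is the union of their target sets.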
Fig. 7.
The HL-specific miRNA-protein network consisting of HL-specific
proteins and miRNAs. The 497 HL-specific proteins are depicted as black
circles and the 1628 miRNAs as pink triangles. HL-specific proteins
that are regulated by only one miRNA and miRNAs that modulate only one
HL-specific protein are colored gray. The 20 key miRNAs are colored red
and labeled with their names
Analysis of miRNAs regulation from the network perspective
It has been demonstrated that the targets of miRNAs are generally more
connected in the protein-protein interaction network than expected by
chance [[99]18, [100]39]. Protein-protein interactions may enhance the
regulatory effect of miRNAs on their targets. Therefore, we integrated
protein-protein interactions with miRNA-protein regulation to explore
miRNA-mediated regulation of the protein network. In this study, we
consider the three simplest types of regulatory patterns. In the first
pattern, a miRNA simultaneously regulates two interacting proteins, as
shown in Fig. [101]8a. Through the interaction between the two
proteins, the miRNA may strengthen its regulatory effect on them. In
the HL-expanded network, 2336 of the 5057 interacting protein pairs are
regulated in this way. This result demonstrates that this kind of
regulation is a common pattern in the HL-expanded network, in agreement
with a previous study [[102]39]. When this type of regulation is taken
into account, the 20 key miRNAs not only target 80% of HL-specific
proteins, but also regulate approximately 60% of the interacting
protein pairs in the HL-expanded network. Hence the 20 key miRNAs play
an important role in regulating the HL-specific proteins and network.
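Counting the interacting protein pairs that are co-regulated by at least one miRNA can be sketched as below. The data structures and all names are hypothetical and serve only to illustrate the first regulatory pattern.

```python
def coregulated_pairs(ppi_edges, mirna_targets):
    """Map each interacting pair to the miRNAs that target both partners."""
    regulators = {}
    for mirna, proteins in mirna_targets.items():
        for protein in proteins:
            regulators.setdefault(protein, set()).add(mirna)
    result = {}
    for u, v in ppi_edges:
        shared = regulators.get(u, set()) & regulators.get(v, set())
        if shared:
            result[tuple(sorted((u, v)))] = shared
    return result


# Toy data with hypothetical protein and miRNA names.
ppi = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E")]
targets = {
    "miR-x": {"A", "B"},
    "miR-y": {"B", "C", "D"},
    "miR-z": {"A"},
}
pairs = coregulated_pairs(ppi, targets)
fraction = len(pairs) / len(ppi)
```

In this toy example three of the four interacting pairs are co-regulated; the pair D-E is not, because no miRNA targets E.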
Fig. 8.
Illustration of three types of miRNA regulation of the HL-specific
network. a A miRNA simultaneously regulates two interacting proteins. b
A miRNA mediates three sequentially interacting proteins. c A miRNA
directly regulates two out of three sequentially interacting proteins
and indirectly mediates one out of three sequentially interacting
proteins
Besides the regulation of interacting protein pairs, we further
analyzed the regulatory pattern in which three sequentially interacting
proteins are mediated by one miRNA (shown in Fig. [105]8b). Compared
with the first pattern, this pattern can strengthen the regulatory
effect of the miRNA more efficiently by combining two protein-protein
interactions. Among the HL-specific proteins, a total of 341 proteins
are mediated in this way by 550 miRNAs, and the 20 key miRNAs regulate
up to 54% of all HL-specific proteins.
The third type of regulation pattern is similar to the second one and
also involves three sequentially interacting proteins (shown in
Fig. [106]8c). Unlike in the second pattern, however, the protein that
interacts with the other two proteins is not a target of the miRNA, but
it may be indirectly regulated by this miRNA through the two proteins
it interacts with. This means that, through protein interactions, a
miRNA not only enhances its regulatory effect, but also expands its
regulatory scope.
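The second and third patterns can be told apart by checking whether the middle protein of a three-protein chain is itself a target of the miRNA. The sketch below illustrates this with hypothetical names. It treats any protein with two interacting partners as the middle of a chain and does not exclude chains whose end proteins also interact with each other, which is an assumption on our part.

```python
from itertools import combinations


def three_protein_chains(ppi_edges):
    """Yield (a, b, c) where b interacts with both a and c; each chain once."""
    adj = {}
    for u, v in ppi_edges:
        if u != v:
            adj.setdefault(u, set()).add(v)
            adj.setdefault(v, set()).add(u)
    for middle in sorted(adj):
        for a, c in combinations(sorted(adj[middle]), 2):
            yield a, middle, c


def classify_chains(ppi_edges, mirna_target_set):
    """Split chains into direct (all three targeted) and indirect
    (both end proteins targeted, middle protein not targeted)."""
    direct, indirect = [], []
    for a, b, c in three_protein_chains(ppi_edges):
        if a in mirna_target_set and c in mirna_target_set:
            if b in mirna_target_set:
                direct.append((a, b, c))
            else:
                indirect.append((a, b, c))
    return direct, indirect


# Toy linear network A-B-C-D with two hypothetical miRNA target sets.
ppi = [("A", "B"), ("B", "C"), ("C", "D")]
direct_1, indirect_1 = classify_chains(ppi, {"A", "B", "C"})
direct_2, indirect_2 = classify_chains(ppi, {"B", "D"})
```

With the second target set, protein C is not targeted but lies between two targeted proteins, so it counts as indirectly regulated.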
Thus, when we consider only the direct regulation by miRNAs, 45 miRNAs
are needed to regulate approximately 90% of the proteins in the
HL-expanded network. In contrast, when all three types of regulation
are taken into account, only 5 miRNAs are able to regulate the same
number of proteins. Therefore, these 5 miRNAs are regarded as the core
miRNAs, which can regulate almost all HL-related proteins in the
HL-expanded network. Moreover, the 5 core miRNAs rank in the top 5
among the 20 key miRNAs identified above.
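The text does not state how the smallest set of miRNAs reaching a given coverage was selected. One common heuristic for this kind of question is greedy set cover, sketched below purely as an assumption: it repeatedly picks the miRNA that adds the most not-yet-covered proteins. All names and data are hypothetical.

```python
def greedy_cover(reach, universe, target_fraction):
    """Greedily pick miRNAs until `target_fraction` of `universe` is covered.

    `reach` maps each miRNA to the proteins it regulates (directly, or
    directly and indirectly, depending on what the caller includes).
    Ties are broken alphabetically. This is a heuristic, not necessarily
    the procedure used in the study.
    """
    universe = set(universe)
    needed = target_fraction * len(universe)
    covered, chosen = set(), []
    remaining = dict(reach)
    while len(covered) < needed and remaining:
        best = max(
            sorted(remaining),
            key=lambda m: len((remaining[m] & universe) - covered),
        )
        gain = (remaining.pop(best) & universe) - covered
        if not gain:
            break  # no remaining miRNA adds new proteins
        covered |= gain
        chosen.append(best)
    return chosen, covered


proteins = {"P1", "P2", "P3", "P4"}
# Hypothetical reach when only direct targets are counted ...
direct_only = {"miR-x": {"P1", "P2"}, "miR-y": {"P3"}, "miR-z": {"P4"}}
# ... and when indirectly regulated proteins are added for miR-x.
with_indirect = {"miR-x": {"P1", "P2", "P3"}, "miR-y": {"P3"}, "miR-z": {"P4"}}

chosen_direct, _ = greedy_cover(direct_only, proteins, 0.75)
chosen_all, _ = greedy_cover(with_indirect, proteins, 0.75)
```

The toy example mirrors the qualitative observation above: once indirect regulation is counted, fewer miRNAs are needed to reach the same coverage.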
Construction of a simplified network consisting of core miRNAs and key proteins
Based on the miRNA regulation of the protein network, we identified 5
core miRNAs among the 1628 miRNAs. To better understand the relation
between miRNAs and HL-specific proteins, we constructed a simplified
network consisting only of the 5 core miRNAs and the 49 key proteins.
Fig. [107]9 displays this network, in which the edges represent two
types of information: miRNA-protein regulation and protein-protein
interaction.
Fig. 9.
Simplified miRNA-regulated HL-specific protein network consisting of 5
core miRNAs and 49 key HL-specific proteins. The core miRNAs are shown
as blue rectangles and their names are labeled in blue. The HL-specific
proteins are shown as circles in different colors. The proteins in red
can be directly regulated by the 5 core miRNAs. Direct miRNA regulation
of a protein is shown as a solid red line. The proteins in green are
indirectly regulated by means of two interacting proteins. Protein
interactions are shown as solid black lines. The proteins in gray
cannot be regulated by the core miRNAs either directly or indirectly
As the main backbone of the HL-expanded network, the 49 key proteins
are highly important for maintaining the structural integrity of the
network. Therefore, by targeting the 49 key proteins, the 5 core miRNAs
are able to regulate nearly the entire HL-expanded network. In terms of
their influence on the network, the 5 core miRNAs are thought to be
closely related to HL. Three of the five miRNAs, miR-92a, miR-26b and
let-7b, are specifically expressed in Hodgkin lymphoma cell lines
[[110]40] [[111]41], and the remaining two, miR-335 and miR-16, have
been identified as related to breast cancer [[112]42] and acute
myelogenous leukemia (AML) [[113]43]. However, it is not entirely clear
how these 5 core miRNAs are involved in HL pathology. Because the
function of a miRNA may be determined by the functions of the proteins
it targets [[114]44, [115]45], we explored the role of the 5 core
miRNAs in HL based on the functions of the key proteins. Additional
file [116]1: Table S4 lists the proteins regulated directly and
indirectly by the 5 core miRNAs, together with the possible functions
of the miRNAs in HL derived from the functions of their targets. Among
the 49 key proteins, 24 are directly regulated by all 5 core miRNAs.
Pathway enrichment was performed using these 24 key proteins, and
several pathways were found to be associated with them, including the
ErbB signaling pathway, Focal adhesion, Viral carcinogenesis, the
Sphingolipid signaling pathway, the VEGF signaling pathway and
Epstein-Barr virus infection. This implies that the 5 core miRNAs may
be associated with HL through the regulation of these pathways.
According to the enrichment results, virus infection, especially
Epstein-Barr virus infection, may contribute to the development of HL,
which has been discussed in detail elsewhere [[117]46]. In addition,
most of the key proteins are enriched in four signaling pathways
associated with cancer development and progression, suggesting that HL
may not be related to a single pathway and that abnormalities in
several pathways may drive the occurrence and development of HL.
Discussion
The application of high-throughput techniques to HL has generated a
large amount of data. Based on these data, many HL-related proteins and
miRNAs have been identified. However, it remains unclear how these
HL-related molecules participate in the pathology of HL and how the
HL-related miRNAs regulate the HL-related proteins and the PPI network
they constitute. This information may help in the search for key
proteins and miRNAs that can be considered as biomarkers and drug
targets for HL. The purpose of this study is to identify important
proteins and miRNAs and to reveal their regulatory relationships at the
network scale.
In this study, we constructed a series of network models. Initially, we
built three different but related PPI networks. By analyzing these
three networks, we investigated the connection characteristics of the
HL-related proteins and found that these proteins are more prone to
connect with each other than other proteins are. Subsequently, we
obtained a PPI network closely associated with HL and 49 key proteins.
These key proteins play an essential role in maintaining the integrity
of the HL-related PPI network. Hence these key proteins have a higher
probability of being involved in the initiation and development of HL.
They can be studied further by experimental methods as candidate
biomarkers and drug targets for HL.
In addition, we investigated the miRNA regulation of the HL-related PPI
network and analyzed three kinds of simple regulation patterns. Based
on these regulation patterns, we identified 5 core miRNAs that can
mediate approximately 90% of the proteins in the HL-related PPI
network. When the expression of these 5 miRNAs is altered, the proteins
in this network can be influenced to some extent by the regulation of
these miRNAs, which may contribute to the occurrence of HL. Therefore,
these 5 miRNAs can be considered as potential biomarkers for the
diagnosis of HL.
To better understand the relation between the 49 key proteins and the 5
core miRNAs, we finally constructed a PPI network combined with the
regulation of miRNAs. This network indicates that a comprehensive
understanding of how miRNAs regulate their targets requires taking the
related protein interactions fully into account. Based on the analysis
of this combined network, we identified several pathways closely
associated with HL, including the ErbB signaling pathway, Focal
adhesion, Viral carcinogenesis, the Sphingolipid signaling pathway, the
VEGF signaling pathway and Epstein-Barr virus infection. This
information will be helpful to elucidate HL mechanisms and identify
pharmaceutical targets.
Conclusion
In this study, we used a three-step strategy to construct an
HL-specific network that is as complete as possible. Firstly, we
constructed a background protein-protein interaction network based on
the current PPI information. From the background network, we then built
an HL-basic network consisting only of the HL-associated proteins.
Finally, we obtained a complete HL-specific protein-protein network.
The HL-specific network consists of 541 proteins and 5057 protein
interactions. Moreover, the HL-specific network can be further divided
into five sub-networks, and 49 proteins were identified as the
important nodes that make up and connect these 5 sub-networks.
Therefore, we consider these 49 proteins to be the key proteins of HL.
In addition, based on experimentally validated miRNA-target
information, we obtained the regulatory relations between miRNAs and
the HL-specific network. Furthermore, we investigated three simple
regulatory patterns of miRNAs in the HL-specific network. The
co-regulation of protein pairs is the main regulatory pattern of miRNAs
in the HL-specific network.
Finally, we identified 5 core miRNAs and 49 key proteins from a network
perspective. These molecules can be regarded as potential biomarkers
for the diagnosis of HL. Their mutual regulatory interactions provide a
foundation for further studying the mechanism of HL and identifying
potential drug targets for the treatment of HL.
Methods
Collection of HL-related proteins and miRNAs
The proteins associated with HL were obtained by collecting
experimental data from published studies and searching public
databases. The experimental data mainly came from two high-throughput
proteomics-based studies that aimed to identify proteins specifically
expressed in HL-derived cells [24, 47]. A total of 120 proteins were
identified as highly associated with HL. In order to obtain more
HL-associated proteins, we further conducted a general search of two
databases, i.e., NCBI and UniProt, using “Hodgkin lymphoma and Homo
sapiens” as query keywords. Altogether, 92 proteins were retained after
filtering out duplicate entries. After gathering all proteins and
removing duplicates, we finally obtained 178 HL-associated proteins for
subsequent network analysis.
HL-associated miRNAs were also obtained from specific experimental
data and a related database. Based on a miRNA microarray analysis, 77
miRNAs exclusively expressed in Hodgkin and Reed-Sternberg cells were
extracted and considered to be relevant to HL. In addition, a group of
HL-associated miRNAs was obtained from dbDEMC [48], a database of
differentially expressed miRNAs in human cancers. Finally, a total of
121 miRNAs were included for subsequent analysis.
Construction of PPI networks
In this study, we used a three-step strategy to construct a
comprehensive and reliable protein interaction network related to HL
from the collected HL-associated proteins. Firstly, we built a
background protein interaction network that includes as many proteins
as possible. The protein-protein interaction (PPI) data for the network
were mainly extracted from five primary PPI databases: DIP [49],
MINT [50], IntAct [51], BioGrid [52] and HPRD [53].
Only experimentally validated PPIs, such as physical interactions
(MI:0218), direct interactions (MI:0407) and physical associations
(MI:0915), were selected from these databases. Additional file
1: Table S5 lists the number of PPI records taken from each of the five
databases. All extracted PPI data were merged and duplicate records
were deleted. A total of 146,295 PPIs involving 17,076 proteins were
retained to construct the PPI background network.
Next, we chose the PPI data involving HL-associated proteins from the
background network and built a small network consisting only of the
HL-associated proteins. This small network can be considered as an
HL-basic network. Finally, based on the “guilt by association”
principle that two interacting proteins in a PPI network might also
share a function or be involved in the same disease [54, 55], we
took the HL-associated proteins in the HL-basic network as seed
proteins and selected all of their connected nodes in the background
network to construct an expanded PPI network. The resulting network can
be considered as a comprehensive and reliable network specific to HL
for further analysis.
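The three-step construction can be sketched in a few lines of Python. This is a minimal illustration rather than the pipeline used in the study: the function name, the edge-list representation and the toy protein identifiers are hypothetical, and the decision to keep every background interaction among the selected proteins (an induced subgraph) is an assumption, since the text does not state which interactions between neighbours were retained.

```python
from itertools import chain


def build_hl_networks(background_edges, hl_proteins):
    """Return (basic_edges, specific_edges) as sets of frozenset pairs."""
    hl = set(hl_proteins)
    # Treat interactions as undirected; drop self-loops and duplicate records.
    background = {frozenset((u, v)) for u, v in background_edges if u != v}
    # Step 2: HL-basic network, where both partners are HL-associated.
    basic = {e for e in background if e <= hl}
    seeds = set(chain.from_iterable(basic))
    # Step 3: add every background neighbour of a seed protein.
    members = set(seeds)
    for e in background:
        if e & seeds:
            members |= e
    # Assumption: keep all background interactions among the selected proteins.
    specific = {e for e in background if e <= members}
    return basic, specific


# Toy data with hypothetical protein identifiers.
toy_background = [("A", "B"), ("B", "A"), ("B", "C"),
                  ("C", "D"), ("D", "E"), ("A", "A")]
toy_basic, toy_specific = build_hl_networks(toy_background, {"A", "B", "X"})
```

In the toy data, protein C enters the expanded network because it interacts with seed B, whereas D and E are excluded because they are more than one step away from any seed.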
Identification of hub proteins
In this study, we applied the method proposed by Raval et al. [29]
to identify the hub proteins in a PPI network. This method is based
solely on the topology of the network. Firstly, all nodes in the PPI
network were ranked in decreasing order of degree. Subsequently, a
succession of subgraphs was generated by successively adding nodes in
descending order of degree. The relative connectivity of each subgraph
was calculated as the number of nodes in the largest component of the
subgraph divided by the total number of nodes in the subgraph. Because
the interactions between hubs are suppressed in the network [30],
the connectivity of subgraphs consisting of hub proteins is relatively
small. As non-hub proteins are added to the subgraph, its relative
connectivity gradually increases. Therefore, when the connectivity of
the subgraphs begins to rise and eventually reaches the connectivity of
the entire network, the nodes included in that subgraph can be
considered as the hub proteins.
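The relative-connectivity curve described above can be computed incrementally with a union-find structure. The sketch below is only an illustration under stated assumptions: the function name and toy graph are hypothetical, ties in degree are broken by node name, and the choice of the cutoff on the curve is left to inspection because the text does not give a numerical criterion.

```python
from collections import defaultdict


def relative_connectivity_curve(edges):
    """Add nodes in descending degree order and record, after each addition,
    (node, size of largest component / number of nodes added so far)."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)
    # Assumption: ties in degree are broken by node name.
    order = sorted(adj, key=lambda n: (-len(adj[n]), str(n)))
    parent, size = {}, {}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    largest, curve = 0, []
    for k, node in enumerate(order, start=1):
        parent[node], size[node] = node, 1
        largest = max(largest, 1)
        for nb in adj[node]:
            if nb in parent:  # neighbour is already part of the subgraph
                ra, rb = find(node), find(nb)
                if ra != rb:
                    if size[ra] < size[rb]:
                        ra, rb = rb, ra
                    parent[rb] = ra
                    size[ra] += size[rb]
                    largest = max(largest, size[ra])
        curve.append((node, largest / k))
    return curve


# Toy graph: two hubs that are linked only through a low-degree node "x".
toy_edges = [("H1", "a"), ("H1", "b"), ("H1", "c"), ("H1", "x"),
             ("H2", "d"), ("H2", "e"), ("H2", "f"), ("H2", "x")]
toy_curve = relative_connectivity_curve(toy_edges)
```

In the toy graph the ratio drops to 0.5 once both hubs are present, because they do not interact directly, and returns to 1.0 as soon as the non-hub node x connects them.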
Generation of random networks
From the protein background network, we randomly selected a set of
nodes with the same degree distribution as the network of interest.
We then extracted the interactions between the selected nodes.
Based on these nodes and their interactions, the random network was
generated. Unlike the network of interest, the random network does not
carry any specific biological meaning.
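One possible reading of this procedure is sketched below. It rests on several assumptions that the text does not confirm: degrees are measured in the background network, each node of interest is matched by a random background node of exactly the same degree, nodes of interest may themselves be drawn, and nodes absent from the background network are ignored. The function name and toy graph are hypothetical.

```python
import random
from collections import Counter, defaultdict


def degree_matched_random_network(background_edges, nodes_of_interest, seed=None):
    """Sample background nodes whose background degrees match those of the
    nodes of interest, then keep the interactions among the sampled nodes."""
    rng = random.Random(seed)
    edges = {frozenset((u, v)) for u, v in background_edges if u != v}
    degree = Counter(n for e in edges for n in e)
    by_degree = defaultdict(list)
    for node, d in degree.items():
        by_degree[d].append(node)
    # How many nodes of each degree are needed (assumption: exact matching).
    needed = Counter(degree[n] for n in set(nodes_of_interest) if n in degree)
    chosen = set()
    for d, count in needed.items():
        pool = sorted(by_degree[d], key=str)  # stable order for reproducibility
        chosen.update(rng.sample(pool, count))
    random_edges = {e for e in edges if e <= chosen}
    return chosen, random_edges


# Toy background: a ring of six nodes plus two pendant nodes (7 and 8).
toy_ring = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 1), (1, 7), (4, 8)]
toy_degree = Counter(n for u, v in toy_ring for n in (u, v))
toy_nodes, toy_random_edges = degree_matched_random_network(
    toy_ring, {1, 2, 7}, seed=0)
```

Repeating the call with different seeds yields an ensemble of random networks against which a property of the network of interest can be compared.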
Calculation of Z-score
In order to quantitatively compare how strongly a node is connected
to the nodes in each of two networks, we calculated the Z-score of the
node using the binomial proportion test as follows:
\[
z = \frac{\dfrac{a}{c} - \dfrac{b}{d}}{\sqrt{\dfrac{\dfrac{b}{d}\left(1 - \dfrac{b}{d}\right)}{d}}}
\]
where a is the number of links of a node in one network and c is the
total number of links in that network. Similarly, b is the number of
links of this node in the other network and d is the total number of
links in that network. If the Z-score of a node is larger than 0, the
node is more highly connected with the nodes in the first network than
with those in the other network, and vice versa.
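As a small worked illustration, the Z-score can be computed as below. The sketch assumes the usual binomial proportion form, z = (a/c - b/d) / sqrt((b/d)(1 - b/d)/d), with a, b, c and d as defined above; the function name and the example counts are hypothetical.

```python
import math


def link_z_score(a, c, b, d):
    """Binomial proportion Z-score for a node's share of links in one network
    (a out of c) against its share in another network (b out of d)."""
    p_first = a / c
    p_other = b / d
    variance = p_other * (1 - p_other) / d
    if variance <= 0:
        # b == 0 or b == d leaves the statistic undefined.
        raise ValueError("Z-score is undefined when b/d is 0 or 1")
    return (p_first - p_other) / math.sqrt(variance)


# Hypothetical counts: 30 of 100 links in one network, 20 of 100 in the other.
toy_z = link_z_score(30, 100, 20, 100)
```

With these counts the node holds a larger share of links in the first network, so the score is positive.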
Analysis of PPI network
The igraph package was used to calculate the clustering coefficient of
each network, evaluate its small-worldness and perform modular
analysis. GO and pathway enrichment analyses were conducted using the
clusterProfiler package. All packages used in this study were run in R
version 3.3.2. Network visualization was performed using Cytoscape
version 3.2.1.
Identification of miRNAs regulating HL-specific proteins
Two major databases, miRTarBase [56] and miRWalk [57], were
used to obtain miRNA-target interactions. Firstly, we extracted all
experimentally validated miRNA-target interactions of Homo sapiens from
each of the two databases. Subsequently, we selected only the
interactions involving the HL-specific proteins, matched by gene name.
Finally, the intersection of the two data sets was retained for further
analysis.
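The filtering and intersection steps amount to simple set operations, as in the following sketch. The function name, the (miRNA, gene) tuple format and the example identifiers are hypothetical, and the case-insensitive matching of gene names is an assumption that is not stated in the text.

```python
def hl_mirna_targets(mirtarbase_pairs, mirwalk_pairs, hl_genes):
    """Keep (miRNA, gene) pairs whose gene is in the HL-specific network and
    which are reported by both databases."""
    genes = {g.upper() for g in hl_genes}  # assumption: case-insensitive names

    def keep(pairs):
        return {(mirna, gene.upper()) for mirna, gene in pairs
                if gene.upper() in genes}

    return keep(mirtarbase_pairs) & keep(mirwalk_pairs)


# Toy records with illustrative identifiers only.
toy_mirtarbase = [("miR-x", "GENE1"), ("miR-y", "GENE2"), ("miR-z", "GENE9")]
toy_mirwalk = [("miR-x", "gene1"), ("miR-y", "GENE3"), ("miR-z", "GENE9")]
toy_shared = hl_mirna_targets(toy_mirtarbase, toy_mirwalk,
                              ["GENE1", "GENE2", "GENE3"])
```

Only the pair supported by both sources and involving a gene in the network is kept; the pair targeting GENE9 is dropped because that gene is not in the network.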
Additional file
Additional file 1: (336KB, doc)
Table S1. Uniprot ID and protein name for 132 hub proteins in
Background network. Table S2. The enrichment results of five
sub-networks in the HL-expanded network. Table S3. Uniprot ID of 49 key
proteins and their related information in the HL-extended network.
Table S4. Proteins mediated directly and indirectly by five core miRNAs
and the possible functions of miRNA in HL. Table S5. Number of PPI data
extracted from five databases and the database version. Figure S1.
Degree distribution of PPI background network. (DOC 336 kb)
Acknowledgements