Abstract Although more than 1 in 4 men develop symptomatic inguinal hernia during their lifetime, the molecular mechanism behind inguinal hernia remains unknown. Here, we explored the protein-protein interaction network built on known inguinal hernia-causative genes to identify essential and common downstream proteins for inguinal hernia formation. We discovered that PIK3R1, PTPN11, TGFBR1, CDC42, SOS1, and KRAS were the most essential inguinal hernia-causative proteins and UBC, GRB2, CTNNB1, HSP90AA1, CBL, PLCG1, and CRK were listed as the most commonly-involved downstream proteins. In addition, the transmembrane receptor protein tyrosine kinase signaling pathway was the most frequently found inguinal hernia-related pathway. Our in silico approach was able to uncover a novel molecular mechanism underlying inguinal hernia formation by identifying inguinal hernia-related essential proteins and potential common downstream proteins of inguinal hernia-causative proteins. Introduction In general surgery, inguinal hernia repair is one of the most routine operations worldwide. More than one in four men can expect to undergo inguinal hernia repair during their lifetime [[34]1, [35]2]. Annual health care costs directly attributable to inguinal hernia exceed $2.5 billion in the US [[36]3]. Inguinal hernias can be classified as either indirect, where the bowel herniates through a defective inguinal ring, or direct, where the bowel protrudes through the weakened lower abdominal muscle wall [[37]3–[38]6]. At times, inguinal hernias can cause severe complications, such as incarceration and strangulation and currently, surgery is the only treatment option for the management of inguinal hernia. Unfortunately, complications such as postoperative pain, nerve injury, infection, and recurrence continue to challenge surgeons and patients [[39]5, [40]7–[41]9]. Despite its prevalence in patients, the molecular mechanisms that predispose individuals to develop inguinal hernias are still unknown. There are several key risk factors for inguinal hernias. For instance, men are more predisposed to developing inguinal hernias and have a lifetime risk of 27% compared with a 3% lifetime risk in women [[42]1]. Old age is also a significant risk factor for hernia with incidence peaking between 60 and 75 years of age, with approximately 50% of men developing an inguinal hernia by the age of 75 [[43]10–[44]13]. The risk of inguinal hernia also increases among first-degree relatives of inguinal hernia patients, indicating a genetic risk factor for inguinal hernia development [[45]14, [46]15]. Additionally, individuals with connective tissue genetic diseases such as cutis laxa [[47]16], Marfan syndrome [[48]17], and Ehlers-Danlos syndrome [[49]18] have a greater risk of developing inguinal hernias. To date, only a small number of those candidate genes have been investigated [[50]19–[51]24]. Among those findings, a large genome-wide association study recently identified four novel inguinal hernia susceptibility loci in the regions of EFEMP1, WT1, EBF2, and ADAMTS6. Moreover, mouse connective tissue and network analyses showed that two of these genes (EFEMP1 and WT1) are critical for connective tissue maintenance/homoeostasis given their expression [[52]21]. However, inguinal hernia-causative genes and their corresponding proteins in the pathophysiology of inguinal hernia are still unknown. Recently, big data analysis has enabled the discovery of crucial disease-causative genes and pathogenic mechanisms by exploring publicly available Online Mendelian Inheritance in Man (OMIM) databases. Furthermore, the protein-protein interactions (PPIs) between corresponding proteins of disease-causative genes were studied by construction of the PPI network [[53]25]. The PPI network for studying human diseases has achieved noteworthy results [[54]26–[55]30]. Several previous studies showed the feasibility of computational approaches to predict gene essentiality and morbidity [[56]31–[57]34]. For example, the topological properties of PPIs have been employed to identify essential proteins in various organisms [[58]35, [59]36]. The main concept is the “centrality-lethality rule”, in which highly connected hub proteins with a central role are more essential to survival in the PPI network [[60]34]. Although there is still significant debate regarding this rule, several studies suggest a correlation between topological centrality and protein essentiality [[61]27, [62]37, [63]38]. Additionally, there may be common downstream proteins, which maximally connect with those inguinal hernia-causative proteins through either direct or indirect interaction. We applied these concepts to calculate the essentiality of each protein in the inguinal hernia-PPI network to define crucial inguinal hernia-causative proteins and their downstream proteins. In the present study, we constructed a PPI network based on inguinal hernia-causative genes imported from the OMIM database. We then identified key protein nodes of significant influence using topological network indices, namely, degree, betweenness, closeness, and eigenvector centrality. Our integration of network topological properties and protein cluster information revealed several highly ranked essential proteins related to inguinal hernia formation. We also performed the functional enrichment analysis of those essential proteins and identified several common downstream key proteins. Our results revealed the novel molecular mechanisms associated with human inguinal hernias which may serve as the potential drug targets to combat this prevalent disease. Materials and methods The analytical framework To investigate the essential and common proteins related to inguinal hernia, the analytical framework is schematically illustrated in [64]Fig 1. The whole process in this study consists of three main steps–construction, processing, and detecting: 1) Construction involves obtaining the inguinal hernia causal genes from the OMIM database and creating the inguinal hernia PPI networks by inputting the inguinal hernia-causative genes into Interologous Interaction Database (I2D); 2) Processing involves measuring topology-based features in the protein interaction, identifying clusters, and analyzing the functional enrichment and pathways; and 3) Detecting involves defining essential and common downstream proteins. Fig 1. The overall framework to detect essential and common downstream proteins. [65]Fig 1 [66]Open in a new tab Database Two main databases, OMIM and I2D, were used in this study. The OMIM database is a comprehensive research resource of curated descriptions of human genes and genetic disorders [[67]39]. I2D is a comprehensive database integrating experimental and predicted PPIs which possesses 38 well-known human protein interaction databases (e.g., Inact, BINT, HRPD, and MINT) containing over 230,000 experimental and approximately 70,000 predicted PPIs from human sources ([68]http://ophid.utoronto.ca/i2d). Identified proteins were unified using the protein IDs defined in the Uniprot database [[69]25]. The database versions of OMIM and I2D were updated in December 2017. Inguinal hernia-related genes and construction of PPI networks Our input of the term “inguinal hernia” into the OMIM database yielded a list of hereditary genes of inguinal hernia from the OMIM morbid map ([70]http://www.omim.org). Then, these gene names were submitted to the I2D database with human as the chosen target organism and the resultant PPIs were generated including predicted and experimental protein interactions. To increase the data reliability of protein interactions, all predicted homologous protein interactions were excluded. The remaining protein interactions were employed to construct the inguinal hernia PPI network in which proteins serve as nodes and protein-protein interactions serve as edges. Protein identifiers were unified using the protein IDs defined in the UniProt database [[71]24]. Because some proteins were given multiple names, the results in tables and figures were presented in the format of gene names and UniProt IDs to avoid the ambiguous referring. Identifying clusters To further understand the biological function of the PPI network generated by using inguinal hernia-causative genes, a clique percolation clustering method (CPM) [[72]30], which is a partition algorithm, was first used to identify dense subgraphs with various k-cliques (k is the size of a clique, a k-clique at k = 3 is equivalent to a triangle) and then explore overlapping clusters. A cluster is composed of a series of adjacent k-cliques, which can be reached from one another through the overlapping protein nodes. In the present study, the inguinal hernia PPIs were further analyzed using CFinder-2.0.6, an open source CPM software platform, for detecting the densely connected regions in the PPI network that have possible overlapping clustering between various k-cliques, leading to speculating their specific biological functions [[73]29]. Detecting essential proteins Essential proteins exert vital roles in cellular processes and are indispensable for survival or reproduction [[74]29, [75]40]. Essential proteins are also critical for the development of human diseases [[76]41]. Here, the topological features of the PPI network were used to detect the role of essential proteins in inguinal hernia disease. These methods are based on the centrality-lethality rule, which means essential proteins tend to form hubs (highly connected protein nodes) in the PPI network. The removal of essential proteins causes the PPI network to break down [[77]28]. Genome-wide studies show that deletion of a hub protein is more likely to be lethal than deletion of a non-hub protein [[78]27]. A number of categories for defining centralities, such as degree centrality (DC) [[79]27], closeness centrality (CC)[[80]26], betweenness centrality (BC) [[81]42, [82]43], and eigenvector centrality (EC) [[83]44], have been proposed to characterize the inguinal hernia PPI network and the participating proteins for predicting essential proteins. Suppose a PPI network is regarded as an undirected graph G (V, E) with proteins as nodes (V) and interactions as edges (E), where u represents a protein node in the PPI network and v is any protein nodes other than u in the network, four features of the inguinal hernia PPI network were characterized as follows: (1) Degree centrality (DC) measures the number of interactions that a protein has. It can be defined as the following equation [[84]27]. [MATH: DC(u)=uedg(u,v), :MATH] (Eq 1) where edg(u,v) is the interaction between u and v. if such interaction does exist, the edg(u,v) is one. If not, it is zero. (2) Closeness centrality (CC) is a measurement of how close a protein is to others. The CC of a protein node in the PPI network is considered as the reciprocal sum of its distance to all other nodes. It can be defined as the following equation [[85]26]. [MATH: CC(u)=N1 vVdis(u,v), :MATH] (Eq 2) where N is the number of the protein node in V and dis(u,v) is the distance between u and v. (3) Betweenness centrality (BC) measures the positional influence of a protein in the networks. The BC of a node k in the PPI network is defined as the relative stress centrality that can quantify the extent to which node k monitors the communication between other nodes. It can be defined as the following equation [[86]42, [87]43]. [MATH: δuv(k)=p(u,k,v)p(u,v),ukv,BC(k)=uVvVδuv(k), :MATH] (Eq 3) where δ[uv](k) is the fraction of the shortest paths that pass through the node k in the PPI network. (4) Eigenvector centrality (EC) measures the relative number of interaction connecting one protein to its surrounding proteins. The EC of a protein node in the PPI network assumes that the centrality value of a protein node depends on the values of each adjacent node, which is defined as the following equation [[88]44]. [MATH: EC(u)=emax(u), :MATH] (Eq 4) where e[max] denotes the principal eigenvector of the adjacency matrix (PPI network considered as a matrix) and e[max](u) denotes the u-th component of the principal eigenvector. Although the computation of centrality based on the network topology has become an important method for identifying essential proteins, it is difficult to identify many essential proteins that have low connectivity in the PPI network [[89]45]. Recently, the majority of studies have shown that the essentiality of proteins has a strong correlation with clusters [[90]46, [91]47], which indicates that essential proteins tend to gather in clusters. To further analyze the PPI network employing both topology features and the cluster characteristics, a novel edge clustering coefficient (ECC) algorithm was designed to better detect essential proteins [[92]46]. First, the cluster centrality of a protein i, which means the overlapping cluster number of a protein, is defined as follows: [MATH: fun_c(i)=k=1m(|e(i,j)|),i,j< mo>∈V(Ck) :MATH] (Eq 5) where fun_c(i) is the cluster centrality of a protein i, m is the number of clusters containing i, j is any proteins other than i in the PPI network, e(i, j) is an edge between i and j, C[k] is the k^th cluster (1≤k≤m), and V(C[k]) is the node set of C[k]. Next, together with cluster centrality, the PPI network topology features are added to measure the essential protein via the shape of the network [[93]46]. Suppose the most appropriate topology feature was defined as TopoCentrality, to integrate TopoCentrality(i) and cluster centrality fun_c(i), the harmonic centrality (HC) of a protein i was defined as follows: [MATH: HC(i)=δ*Top< mi>oCentrality(i)Topo Central< /mi>itymax+(1δ)*fun< mo>_c(i)fun_ cmax, :MATH] (Eq 6) where TopoCentrality(i) is the most appropriate topology centrality of i, TopoCentrality[max] is the maximum value of TopoCentrality(i), fun_c[max] is the maximum value of fun_c(i), δ is a tunable factor in the range [0,1] which is used to adjust the weights of TopoCentrality(i) and fun_c(i). Generally, δ is set to 0.5. Gene ontology and pathway enrichment analyses To further explore the biological roles of the genes in clusters, Gene Ontology (GO) term and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses were conducted using the tools from the Database for Annotation, Visualization and Integrated Discovery (DAVID, Version 6.8), which is a web-based bioinformatics resource, an integrated analysis tool, and a biological knowledge base [[94]48]. GO term enrichment and KEGG pathway analyses were performed using the GO knowledgebase ([95]http://www.geneontology.org) and KEGG ([96]http://www.genome.jp/kegg/) database, respectively. Finding common downstream proteins To determine the common downstream proteins related to inguinal hernia, a novel deformation breadth-first search (DBFS) algorithm was designed. All causative proteins related to inguinal hernia in the PPI network were assumed as the destination set D. Proteins closely linked to destination set D were considered as the common downstream proteins because these proteins in the same clusters closely interact with each other to play a critical role in inguinal hernia development. The brief procedure for finding common downstream proteins are summarized here. Firstly, the DBFS algorithm found all adjacent proteins (i.e., one hop proteins) for every destination protein in set D. These one hop proteins, if they were not being visited before and were not included in the destination set D, were able to be visited. Then, all adjacent proteins (i.e., two hops proteins) of every one hop protein were searched and operated using the same rule. This step was repeated until the path length reached the constrain threshold. The detailed procedure of DBFS is shown as below: Input: G = ,G∈ inguinal hernia PPI networks; L = the constrain threshold of path length; D = (d[1],d[2],…,d[n]),d[i]∈destination protein set. InitQueue(Q); InitQueue(T); //set Q and T as queue For (i = 0; i = 0;w = NextAdj(G,u,w)) //obtain all adjacent nodes of u If ((!visited[w]) && (w not in D)) //the adjacent point w is not visited and not belongs to D Visited[w] = TRUE; Visited(w,distance); EnQueue(T,w); Endfor DeQueue(T, u); EndWhile EnQueue (T, -1); distance = distance+1; DeQueue(T,u); EndWhile EndWhile Statistical analysis To determine whether the value significantly deviates from the mean value, the modified Thompson tau technique was employed. Its basic concepts are as follows: [MATH: τ=tα /2(n1)n n2+tα/2 2 :MATH] (Eq 7) where n is the number of data points, t[α/2] is the critical student’s t value, SD is sample standard deviation, α = 0.05, and df = n-2. * > mean ± (τ·SD)/2, ** > mean ± τ·SD. Results Inguinal hernia-causative genes and the inguinal hernia PPI network construction A total of 83 inguinal hernia-causative genes were obtained from the OMIM database as shown in [97]Table 1. Based on known PPIs, the interactions of those gene products (inguinal hernia-causative proteins) were investigated by exploring the I2D database to build the inguinal hernia PPI network. To achieve this, all genes had to contain the loci with known encoding protein profiles in Uniprot. After removal of four genes without corresponding known coding proteins (i.e., SRS, DIH2, ICR1, and H19) from the inguinal hernia-causative gene list, 79 genes were used to explore the I2D database. Our input of inguinal hernia-related proteins into I2D yielded 8,215 interactions in inguinal hernia PPI networks. After removal of homologous predicted interactions, 4,201 interaction edges accompanied by 2,666 protein nodes were eventually utilized to construct the inguinal hernia PPI networks ([98]Fig 2). Table 1. Inguinal hernia-causative genes from OMIM. Uniprot ID Gene Uniprot ID Gene Uniprot ID Gene Uniprot ID Gene [99]P42345 MTOR [100]O75369 FLNB [101]P53667 LIMK1 [102]P49918 CDKN1C [103]Q02809 PLOD1 [104]Q7Z494 NPHP3 [105]P08123 COL1A2 [106]P19544 WT1 [107]P98160 HSPG2 [108]P58012 FOXL2 [109]P13569 CFTR [110]O95967 EFEMP2 [111]P60953 CDC42 [112]O00469 PLOD2 [113]P43694 GATA4 [114]P50454 SERPINH1 [115]Q8TAD8 SNIP1 [116]Q8NEZ3 WDR19 [117]P13497 BMP1 [118]O60706 ABCC9 [119]Q04721 NOTCH2 [120]P16234 PDGFRA [121]P11362 FGFR1 [122]P01116 KRAS [123]P35354 PTGS2 [124]Q86XX4 FRAS1 [125]Q8WW38 ZFPM2 [126]P02458 COL2A1 [127]P00797 REN [128]Q99697 PITX2 [129]P07951 TPM2 [130]P13647 KRT5 [131]O95259 KCNH1 [132]Q9NQX1 PRDM5 [133]Q01974 ROR2 [134]Q16671 AMHR2 [135]P61812 TGFB2 [136]P27986 PIK3R1 [137]P37058 HSD17B3 [138]Q06124 PTPN11 [139]Q07889 SOS1 [140]Q16637 SMN1 [141]P36897 TGFBR1 [142]Q5SZK8 FREM2 [143]P22888 LHCGR [144]P50443 SLC26A2 [145]Q13285 NR5A1 [146]Q9Y625 GPC6 [147]Q12805 EFEMP1 [148]A1X283 SH3PXD2B [149]P20908 COL5A1 [150]O95455 TGDS [151]P02461 COL3A1 [152]O95450 ADAMTS2 [153]P52895 AKR1C2 [154]P12644 BMP4 [155]P05997 COL5A2 [156]P35916 FLT4 [157]P17516 AKR1C4 [158]P10600 TGFB3 [159]O14793 MSTN [160]Q99519 NEU1 [161]P07949 RET [162]Q9UBX5 FBLN5 [163]O00755 WNT7A [164]P22105 TNXB [165]P36894 BMPR1A -- SRS [166]P37173 TGFBR2 [167]Q9ULC3 RAB23 [168]P54886 ALDH18A1 -- DIH2 [169]P16278 GLB1 [170]Q9NWM8 FKBP14 [171]Q02962 PAX2 -- ICR1 [172]Q9BWF2 TRAIP [173]Q8N8U9 BMPER [174]P05093 CYP17A1 -- H19 [175]P41221 WNT5A [176]P15502 ELN [177]P01112 HRAS [178]Open in a new tab -- means no-Uniprot-ID Fig 2. PPI networks related to inguinal hernia with 4,201 interactions and 2,666 protein nodes. [179]Fig 2 [180]Open in a new tab Red circles represent protein nodes and blue lines indicate the protein-protein interactions. Cluster analysis To further determine inguinal hernia-causative proteins based on the overlapping clusters (cliques) in the inguinal hernia PPI networks, cluster analysis was conducted using the ClusterOne plug-in of the CFinder 2.0.6 software. The clusters with densely connected nodes in the inguinal hernia PPI network were detected. The numbers of clusters were 784, 245, and 62, which corresponded to the clique percolation parameter k = 3, 4, and 5, respectively. A network diagram of clusters at k = 4 was shown in [181]Fig 3. The overlapping cluster numbers for a protein that participated in clusters are shown in [182]Table 2. PIK3R1, PTPN11, SOS1, TGFBR1, TGFBR2, CDC42, KRAS, HRAS, RET, and PDGFRA were listed as the top ten proteins based on the overlapping cluster number of hernia-causative genes, in which PIK3R1 and PTPN11 were significantly involved in the inguinal hernia PPI network. Fig 3. The clusters of inguinal hernia-causative genes in the PPI network. [183]Fig 3 [184]Open in a new tab 245 clusters at k = 4. The yellow core clusters are defined by significant involvement ranking calculated in [185]Table 2 using the Thompson Tau test. Table 2. Top 20 inguinal hernia-causative proteins based on the number of overlapping clusters. Rank Uniprot Protein Cluster # Rank Uniprot Protein Cluster # 1 [186]P27986 PIK3R1 207[187]^** 11 [188]P11362 FGFR1 23 2 [189]Q06124 PTPN11 127[190]^* 12 [191]P0CG48 UBC 19 3 [192]Q07889 SOS1 101 13 [193]O75369 FLNB 16 4 [194]P36897 TGFBR1 68 14 [195]Q13285 NR5A1 14 5 [196]P37173 TGFBR2 56 15 [197]P43694 GATA4 11 6 [198]P60953 CDC42 50 16 [199]P61812 TGFB2 11 7 [200]P01116 KRAS 40 17 [201]P62993 GRB2 11 8 [202]P11112 HRAS 37 18 [203]P36894 BMPR1A 9 9 [204]P07949 RET 26 19 [205]P10600 TGFB3 8 10 [206]P16234 PDGFRA 23 20 [207]P35916 FLT4 8 [208]Open in a new tab * > mean ± (τ·SD)/2 ** > mean ± τ·SD Detecting essential proteins Four topological features (i.e., DC, CC, BC, and EC) were calculated for identifying the essential proteins in the inguinal hernia PPI network. DC, CC, BC, and EC were weighted equally when calculating the essential proteins. The top 20 of those essential proteins were ranked and shown in [209]Table 3. PIK3R1 ([210]P27986), CDC42 ([211]P60953), CTFR ([212]P13569), TGFBR1 ([213]P36897), and PTPN11 ([214]Q06124) were the top five proteins with visibly higher DC. These results suggest the importance and extensive involvement of these proteins in inguinal hernia pathogenesis. In addition, the top five proteins with higher CC were UBC ([215]P0CG48), PIK3R1 ([216]P27986), CDC42 ([217]P60953), TGFBR1 ([218]P36897), and PTPN11 ([219]Q06124); with higher BC were PIK3R1 ([220]P27986), UBC ([221]P0CG48), CDC42 ([222]P60953), CTFR ([223]P13569), and TGFBR1 ([224]P36897); and with higher EC were PIK3R1 ([225]P27986), PTPN11 ([226]Q06124), CDC42 ([227]P60953), TGFBR1 ([228]P36897), and SOS1 ([229]Q07889). Consequently, PIK3R1, CDC42, and TGFBR1 were always listed in the top five proteins for all topological categories, and PTPN11 was listed in the top five proteins under three topological categories (DC, CC, and EC). UBC was listed as the first and second protein in CC and BC, respectively. CTFR was listed as the third and the fourth protein under two topological categories (DC and BC), and SOS1 was listed as the fifth protein in the topological category EC. Thus, proteins PIK3R1, CDC42, TGFBR1, PTPN11, UBC, CTFR, and SOS1 may play an important role in the inguinal hernia PPI network. Table 3. Top 20 inguinal hernia-causative proteins ranked by DC, CC, BC, and EC. Rank Uniprot DC Uniprot CC Uniprot BC Uniprot EC 1 [230]P27986 345 [231]P0CG48 0.4685 [232]P27986 1419876.10 [233]P27986 0.4584 2 [234]P60953 294 [235]P27986 0.4210 [236]P0CG48 1182660.50 [237]Q06124 0.2933 3 [238]P13569 264 [239]P60953 0.3880 [240]P60953 1163367.80 [241]P60953 0.2296 4 [242]P36897 264 [243]P36897 0.3867 [244]P13569 1051267.10 [245]P36897 0.1914 5 [246]Q06124 241 [247]Q06124 0.3778 [248]P36897 1025756.25 [249]Q07889 0.1813 6 [250]Q16637 226 [251]P37173 0.3737 [252]Q16637 906896.44 [253]P01116 0.1387 7 [254]P42345 191 [255]Q07889 0.3732 [256]Q06124 789004.06 [257]P01112 0.1352 8 [258]P01116 173 [259]Q13501 0.3689 [260]P42345 636229.44 [261]P0CG48 0.1323 9 [262]P01112 138 [263]P01112 0.3632 [264]P01116 589670.75 [265]P37173 0.1093 10 [266]P13647 126 [267]P01116 0.3617 [268]P01112 426118.50 [269]P07949 0.0993 11 [270]Q07889 126 [271]P62993 0.3617 [272]P13647 395877.22 [273]P13569 0.0989 12 [274]P50454 96 [275]P16234 0.3560 [276]Q07889 322209.06 [277]P42345 0.0963 13 [278]O75369 88 [279]P07949 0.3552 [280]P50454 301589.00 [281]P62993 0.0943 14 [282]P37173 87 [283]P22681 0.3549 [284]O95967 278135.72 [285]P11362 0.0859 15 [286]Q13285 69 [287]P11362 0.3541 [288]O75369 269789.28 [289]O75369 0.0835 16 [290]O95967 68 [291]P13569 0.3538 [292]Q13285 231566.27 [293]P16234 0.0804 17 [294]P11362 64 [295]O75369 0.3524 [296]P37173 212951.39 [297]P22681 0.0756 18 [298]P36894 62 [299]Q16637 0.3490 [300]Q8TAD8 178868.39 [301]P35222 0.0693 19 [302]P54886 58 [303]Q13285 0.3485 [304]P11362 172819.90 [305]Q16637 0.0691 20 [306]Q8TAD8 56 [307]P49841 0.3482 [308]P36894 156695.28 [309]P12931 0.0666 [310]Open in a new tab As expected, we observed different proteins present under various topological features, because each topological feature depicts only certain features of the PPI network and cannot include the entire topological information of the PPI network. Thus, we further identified the essential proteins in the PPI network using a comprehensive edge clustering coefficient (ECC) algorithm. The ECC method considers both the clustering characteristics and the topological features of a protein. Because the shape of the inguinal hernia PPI network is similar to a star topology as shown in [311]Fig 1, DC is more suitable to find essential proteins. As a result, the topological feature of harmonic centrality (HC) calculating from the ECC algorithm was used to further define DC. The essential proteins ranked by HC are shown in [312]Fig 4, in which PIK3R1, PTPN11, TGFBR1, CDC42, SOS1, and KRAS are shown as the significantly enriched top-ranking hub proteins in the inguinal hernia PPI network using the Thompson Tau test. Together, our results show that PIK3R1, PTPN11, TGFBR1, CDC42, and SOS1 are probably the most essential proteins involved in human hernia formation. Fig 4. Inguinal hernia-related essential proteins ranked by HC. [313]Fig 4 [314]Open in a new tab * > mean ± (τ·SD)/2, ** > mean ± τ·SD. Gene ontology and pathway enrichment analysis We performed functional enrichment analysis including GO term enrichment and KEGG pathway analysis on 784 clusters, k = 3 of the inguinal hernia PPI network. In GO term analysis, three categories were used including biological processes, cellular components, and molecular function. The top ten GO terms in three categories are described in [315]Table 4 with p values < 0.01. The top seven significant terms in the biological processes category were peptidyl-tyrosine phosphorylation, transmembrane receptor protein tyrosine kinase (RTK) signaling pathway, vascular endothelial growth factor (VEGF) receptor signaling pathway, transforming growth factor beta (TGFβ) receptor signaling pathway, signal transduction, MAPK cascade, and regulation of phosphatidylinositol 3-kinase (PI3K) signaling. The Jak-STAT signaling pathway, insulin signaling pathway, fibroblast growth factor (FGF) receptor signaling pathway, and estrogen signaling pathway were also significantly enriched in this category of the GO term analysis ([316]S1 Table). The most significant term was cytosol (p = 4.96E-37) in the cellular component category. The top five significant terms in the molecular function category were protein binding, protein tyrosine kinase activity, transmembrane RTK activity, phosphatidylinositol-4,5-bisphosphate 3-kinase activity, and protein kinase binding. These results show that the signaling pathways of growth factors, insulin, estrogen, transmembrane RTK, MAPK, and PI3K may play a vital role in the pathogenesis of inguinal hernia. Table 4. The most significantly enriched GO terms. Category Terms P-value Biological Process peptidyl-tyrosine phosphorylation 2.89E-41 transmembrane receptor protein tyrosine kinase signaling pathway 1.64E-32 vascular endothelial growth factor receptor signaling pathway 8.75E-31 transforming growth factor beta receptor signaling pathway 3.33E-30 signal transduction 2.35E-29 MAPK cascade 2.77E-29 regulation of phosphatidylinositol 3-kinase signaling 4.29E-28 phosphatidylinositol-mediated signaling 9.76E-27 epidermal growth factor receptor signaling pathway 3.87E-26 positive regulation of GTPase activity 1.39E-25 Cellular Component cytosol 4.96E-37 plasma membrane 5.80E-32 focal adhesion 3.01E-29 cell-cell junction 3.82E-20 receptor complex 5.36E-19 membrane raft 5.48E-17 extrinsic component of cytoplasmic side of plasma membrane 5.49E-16 cytoplasm 1.07E-14 cell-cell adherens junction 1.49E-14 cell surface 1.81E-13 Molecular Function protein binding 6.13E-56 protein tyrosine kinase activity 6.68E-40 transmembrane receptor protein tyrosine kinase activity 2.11E-29 phosphatidylinositol-4,5-bisphosphate 3-kinase activity 4.31E-28 protein kinase binding 1.07E-24 Ras guanyl-nucleotide exchange factor activity 8.92E-22 protein phosphatase binding 1.03E-21 ATP binding 8.75E-21 phosphatidylinositol-3-kinase activity 3.34E-16 phosphatidylinositol 3-kinase binding 2.85E-15 [317]Open in a new tab The KEGG pathway enrichment analysis revealed the significantly enriched genes in the enriched pathways ([318]S1 Table). The top 15 enriched pathways are shown in [319]Table 5. The most significantly changed pathway was proteoglycans in cancer. The next two pathways were pathways in cancer and the ErbB signaling pathway. In addition, the PI3K-Akt signaling pathway, MAPK signaling pathway, insulin signaling pathway, VEGF signaling pathway, TGFβ signaling pathway, Jak-STAT signaling pathway, and estrogen signaling pathway were listed as the 15^th, 26^th, 34^th, 36^th, 40^th, 48^th, and 50^th significantly enriched pathways, respectively ([320]S1 Table). All of these pathways are related to cell growth and proliferation. Among the top 15 enriched pathways, essential proteins PIK3R1, PTPN11, TGFBR1, CDC42, and SOS1 were involved in 14, 5, 5, 9, and 11 pathways, respectively ([321]Table 5). Table 5. Most significantly enriched pathways determined by the KEGG pathway enrichment analysis and the involvement of top five essential proteins in the top 15 enriched pathways. Terms P-value Top five essential proteins PIK3R1 PTPN11 TGFBR1 CDC42 SOS1 Proteoglycans in cancer 1.12E-41 + + - - + Pathways in cancer 5.98E-41 + - + + + ErbB signaling pathway 9.34E-33 + - - - - Focal adhesion 9.72E-32 + - - + + Ras signaling pathway 1.00E-31 + + - + + Rap1 signaling pathway 2.68E-31 + - - + - Chronic myeloid leukemia 1.37E-30 + + + - + Pancreatic cancer 1.56E-26 + - + + - Neurotrophin signaling pathway 1.35E-25 + + - + + Adherens junction 7.97E-24 - - + + - Renal cell carcinoma 9.88E-24 + + - + + Fc epsilon RI signaling pathway 4.25E-23 + - - - + Regulation of actin cytoskeleton 4.57E-23 + - - + + FoxO signaling pathway 1.18E-22 + - + - + PI3K-Akt signaling pathway 2.54E-22 + - - - + [322]Open in a new tab Common downstream proteins To investigate how these functionally diverse pathogenic proteins led to inguinal hernia formation, we further analyzed inguinal hernia PPI profiles using the DBFS algorithm to identify common downstream proteins of 79 inguinal hernia-causative proteins ([323]Table 1). The top 21 common downstream proteins are shown in [324]Table 6, in which UBC, GRB2, CTNNB1, HSP90AA1, PLCG1, CBL, and CRK were listed as the top 7 common downstream proteins ([325]Fig 5). Most importantly, UBC, GRB2, and CTNNB1 were significantly enriched in the inguinal hernia PPI network. ([326]Table 6). Table 6. Top 21 common downstream proteins defined by the number of direct interactions with inguinal hernia-causative proteins. Rank Uniprot Protein Interacting # Rank Uniprot Protein Interacting # 1 [327]P0CG48 UBC 50[328]^** 11 [329]O00459 PIK3R2 8 2 [330]P62993 GRB2 19[331]^* 12 [332]P00533 EGFR 8 3 [333]P35222 CTNNB1 12[334]^* 13 [335]P05067 APP 8 4 [336]P07900 HSP90AA1 10 14 [337]P45983 MAPK8 8 5 [338]P19174 PLCG1 10 15 [339]P84022 SMAD3 8 6 [340]P22681 CBL 10 16 [341]Q13547 HDAC1 8 7 [342]P46108 CRK 10 17 [343]P01137 TGFB1 7 8 [344]P12931 SRC 9 18 [345]P12830 CDH1 7 9 [346]P29353 SHC1 9 19 [347]P40763 STAT3 7 10 [348]P49841 GSK3B 9 20 [349]Q03135 CAV1 7 21 [350]Q15796 SMAD2 7 [351]Open in a new tab * > mean ± (τ·SD)/2 ** > mean ± τ·SD Fig 5. The detailed interactive profiles of top 7 common downstream protein of inguinal hernia-causative proteins. [352]Fig 5 [353]Open in a new tab These common downstream proteins, UBC (A), GRB2 (B), CTNNB1 (C), HSP90AA1 (D), PLCG1 (E), CBL (F), and CRK (G), are highlighted in purple. The green proteins directly interacted with purple common downstream proteins. Blue proteins are the secondary contact proteins to purple common downstream proteins. The shortest distance between blue proteins and purple protein is 2. Discussion Inguinal hernia is a multifactorial disease caused by endogenous factors including age, gender, anatomic variations, and inheritance as well as exogenous factors such as smoking, comorbidity, and outcomes from surgery [[354]14]. Recently, we found that the conversion of testosterone to estradiol by the aromatase enzyme in lower abdominal muscle tissue in a humanized aromatase transgenic mouse model activates pathways for fibroblast proliferation and fibrosis, leading to intense lower abdominal muscle fibrosis, muscle atrophy, and inguinal hernia; fortunately, an aromatase inhibitor entirely prevents this phenotype [[355]49]. In the present study, we explored the inherited aspects of inguinal hernia formation via data mining of inguinal hernia-causative genes exported from the OMIM database. Five essential proteins (PIK3R1, PTPN11, TGFBR1, CDC42, and SOS1) and three downstream common proteins (UBC, GRB2, and CTNNB1) were found to be related to inguinal hernia development. We also found that the signaling pathways of growth factors, transmembrane RTK, MAPK, and PI3K are highly associated with inguinal hernia disease. Furthermore, this data mining technique can be utilized for the analysis of the PPI networks of various human diseases to identify critical essential proteins contributing to human diseases. The RTK pathways were shown in the biological process category of enriched GO terms including signaling pathways for VEGF receptor, TGFβ receptor, epidermal growth factor receptor, and insulin. The common downstream signaling of the RTK pathways such as the MAPK cascade and PI3K signaling was also found in the GO biological process analysis [[356]50]. Furthermore, these growth factor-mediated RTK pathways were listed in the top 40 significantly enriched pathway via the KEGG pathway enrichment analysis. The GO term analysis also indicated that protein tyrosine kinase activity, transmembrane RTK activity, phosphatidylinositol-4,5-bisphosphate 3-kinase activity, as well as binding and activity of PI3K were related to human inguinal hernia diseases. Additionally, others also show that inguinal hernia-related essential proteins, PTPN11, CDC42, and SOS1 regulate the RAS/MAPK signaling pathway, which is another downstream effector of RTKs [[357]51–[358]54]. Furthermore, essential proteins, PIK3R1 and TGFBR1 are involved in the regulation of both the PI3K/AKT and RAS/MAPK signaling pathways [[359]55–[360]57]. Together, all of these analyses suggest that the RTK pathways such as the PI3K and MAPK pathways may play a critical role in the development of inguinal hernias. To further reveal the genetic mechanism behind inguinal hernia, the DBFS algorithm was used to detect a series of common downstream proteins that directly interacted with inguinal hernia-causative proteins, in which UBC, GRB2, CTNNB1, HSP90AA1, PLCG1, CBL, and CRK are listed as the top seven downstream common proteins. Ubiquitin C, encoded by the UBC gene, maintains the cellular ubiquitin levels during stress. Protein ubiquitylation plays a key role in the regulation of multiple cellular events including the recognition of interacting proteins [[361]58]. It is not surprising that UBC was highly positioned on the list of the downstream proteins related to inguinal hernia-causative proteins. It was also the highest ranked protein measured by CC and BC and may not have many direct neighbors as measured by DC. According to the previous study [[362]28], if a node had low DC and high BC and EC, it would locate “centrality”, where the exchange and transition of multitudes of data and resources allow the centrality node to arrive prior to the other nodes in the PPI network because of its short-distance path. Thus, UBC is likely to be adjacent to numerous inguinal hernia causative proteins. Furthermore, Akt ubiquitination plays an important role in the activation of Akt signaling [[363]59]. GRB2 (Growth factor receptor-bound protein 2) connects RTKs to RAS, leading to the activation of the RAS/MAPK pathway [[364]60]. Another common downstream protein, CTNNB1 (Beta-catenin) is an essential part of the Wnt signaling pathway and is regulated by the PI3K/Akt pathways [[365]61]. HSP90AA1 (heat shock protein 90 alpha family class A member 1) promotes autophagy and inhibits apoptosis through the PI3K/Akt/mTOR pathway [[366]62]. PLCG1 (Phospholipase C, gamma 1) can be activated by RTKs [[367]63]. CBL is an E3 ubiquitin-protein ligase involved in the formation of a covalent bond between ubiquitin and RTKs, leading to RTK protein ubiquitination and downregulation [[368]64]. CRK is an adapter protein that binds to several tyrosine-phosphorylated proteins [[369]65]. As discussed above, all of the top seven common downstream proteins are linked to the RTK pathways. A major downstream mediator of RTKs, PI3K is a family of related intracellular signal transducer enzymes involved in cell growth, proliferation, differentiation, motility, survival, and intracellular trafficking [[370]66]. It can be stimulated by diverse oncogenes and growth factor receptors such as receptors for insulin, insulin-like growth factor 1, TGFβ, VEGF, and platelet-derived growth factor. Estrogen can also up-regulate PI3K signaling [[371]67]. The PI3K family is divided into three different classes (I, II, and III)[[372]68]. Class IA PI3K is a heterodimeric enzyme containing a p85 regulatory and a p110 catalytic subunit. In the basal state, the interaction between p85 and p110 stabilizes and inhibits p110 catalytic activity. Upon activation by growth factors or other signals, ligand binding to RTKs promotes receptor activation. The p85 subunit binds to activated RTKs, recruiting p110 to the plasma membrane, where a conformational change induced by binding relieves inhibition of p110 catalytic activity. Three genes, PIK3R1, PIK3R2, and PIK3R3, encode the p85α, p85β, and p55γ isoforms of the p85 regulatory subunit of PI3K, respectively. The PIK3R1 gene also gives rise to two shorter isoforms, p55α and p50α, through alternative transcription-initiation sites [[373]50, [374]66]. Interestingly, we found that PIK3R1 was listed as the first essential protein in relation to inguinal hernia disease. A previous study demonstrated that heterozygous mutations of PIK3R1(R649W) and the resultant impairment of the PI3K activation have been identified in patients with SHORT syndrome—a disorder characterized by Short stature, Hyperextensibility of joints and/or inguinal hernia, Ocular depression, Reiger anomaly and Teething delay [[375]69]. This finding further emphasizes that PIK3R1 in the PI3K pathway is closely associated with the development of inguinal hernia. Inguinal hernia formation is associated with increased lower abdominal muscle tissue fibrosis and muscle atrophy [[376]49]. The exact role of these essential and common downstream proteins and the related signaling pathway in fibroblasts and myocytes of lower abdominal muscle tissue is unknown. Future studies will reveal the underlying molecular mechanisms for these proteins and pathways in fibroblast proliferation and fibrosis, myocyte function, and hernia formation and further target these essential and common downstream proteins for developing novel pharmacological approaches for preventing and treating recurrent inguinal hernia in high risk individuals. Our data along with previous findings regarding the effect of inhibitory mutation of PIK3R1 on SHORT syndrome-associated inguinal hernia indicate that the PI3K pathway, especially the essential protein, PIK3R1 is necessary for inguinal hernia development. A previous study showed that direct targeting of PIK3R1 in hepatic stellate cells inhibits liver fibrosis, indicating that PIK3R1 is probably participating in hernia-associated fibrosis [[377]70]. In addition, overexpression of pik3r1 in rat myotubes reduced insulin-stimulated PI3K/AKT activation [[378]71], and pik3r1 overexpression in mice decreased skeletal muscle insulin signaling [[379]72]. Mice lacking both pik3r1 and pik3r2 in skeletal muscles exhibited severely impaired PI3K signaling in those muscles [[380]73]. These animals showed reduced myocyte size and insulin-resistance in their skeletal muscles, demonstrating that in vivo class IA PI3K is both a vital regulator of muscle growth and a critical mediator of insulin signaling in the muscle. With these findings, we are planning to selectively delete PIK3R1 in fibroblasts and/or myocytes to define the relative roles of PIK3R1 in fibroblasts and myocytes for maintaining abdominal muscle function and in pathologic processes such as fibrosis, atrophy, and hernia formation. In summary, the present study deeply analyzed the protein-protein interaction on known inguinal hernia-causative genes from the OMIM database. Several essential proteins and common downstream proteins related to inguinal hernia diseases have been identified. The downstream signaling pathways of activated RTKs have been found to be highly associated with inguinal hernias. In the future, we will further determine how these essential proteins and the RTK signaling pathways such as the PI3K/Atk pathway contribute to the pathogenesis of the inguinal hernia formation. Supporting information S1 Table. The GO term and KEGG pathway enrichment analysis on the 784 clusters, k = 3 of the inguinal hernia PPI network. (XLSX) [381]Click here for additional data file.^ (199.2KB, xlsx) Data Availability All relevant data are within the manuscript and its Supporting Information files. Funding Statement The authors received no specific funding for this work. References