Abstract Background: The incidence of Oral Cancer (OC) is high in Asian countries, which goes undetected at its early stage. The study of genetics, especially genetic networks holds great promise in this endeavor. Hub genes in a genetic network are prominent in regulating the whole network structure of genes. Thus identification of such genes related to specific cancer types can help in reducing the gap in OC prognosis. Methods: Traditional study of network biology is unable to decipher the inter-dependencies within and across diverse biological networks. Multiplex network provides a powerful representation of such systems and encodes much richer information than isolated networks. In this work, we focused on the entire multiplex structure of the genetic network integrating the gene expression profile and DNA methylation profile for OC. Further, hub genes were identified by considering their connectivity in the multiplex structure and the respective protein-protein interaction (PPI) network as well. Results: 46 hub genes were inferred in our approach with a high prediction accuracy (96%), outstanding Matthews coefficient correlation value (93%) and significant biological implications. Among them, genes PIK3CG, PIK3R5, MYH7, CDC20 and CCL4 were differentially expressed and predominantly enriched in molecular cascades specific to OC. Conclusions: The identified hub genes in this work carry ontological signatures specific to cancer, which may further facilitate improved understanding of the tumorigenesis process and the underlying molecular events. Result indicates the effectiveness of our integrated multiplex network approach for hub gene identification. This work puts an innovative research route for multi-omics biological data analysis. Keywords: Hub genes, Multiplex network, Multi-omics data, Oral cancer __________________________________________________________________ Hub genes; Multiplex network; Multi-omics data; Oral cancer 1. Introduction Cancer has emerged as a deadly disease killing people worldwide. Head and neck squamous cell carcinoma (HNSCC) is the most occurring cancer in south Asia with an annual incidence of more than 58% in India [29]Kulkarni (2013); [30]Joshi et al. (2014). Oral cancer (OC) is the most common subtype of HNSCC. In India, more than two-third of the oral cancer patients present in the advanced stage of cancer [31]Kulkarni (2013). Despite the significant progress in molecular diagnosis, early detection of OC is still not available. Early detection of any cancer type is challenging [32]Schiffman et al. (2015). Hence, precisely predicting the disease will add significant value in treating patients. Historically it is known that cancer occurs due to mutation in genes [33]Hanahan and Weinberg (2011). This boosts the extensive analysis of genetic networks using data mining algorithms and machine learning tools for the identification of significant genes involved in the disease progression. Furthermore, the study of gene expression levels to detect or discriminate cancer or its subtypes justifies the existence of a strong correlation between gene expression and disease types [34]Marisa et al. (2013); [35]Randhawa and Acharya (2015). However, gene expression data is limited with high noise, fewer number of samples in comparison to a large number of genes. DNA methylation is a heritable epigenetic mark that is essential for normal cell development [36]Jin et al. (2011). Aberrant methylation pattern significantly modulates gene expression levels and results in the development of cancer [37]Wajed et al. (2001). Hence, a combined study of gene expression and DNA methylation pattern can help in identifying potential genetic indicators of cancer and allow for prevention and early detection, which is crucial to cancer treatment. In network theory, a group of highly connected nodes is known as hub nodes and these sets of nodes are found to be significant in many networks. Hub nodes in a genetic network are expected to play an influential role in terms of regulating the whole network structure. Langfelder et al. addressed the identification of intramodular hubs in multiple genomic networks [38]Langfelder et al. (2013). The selected hub genes were biologically meaningful and can be considered for prognostic applications. The growing availability of diverse omics data enhances the study of biological systems at different levels. In this context, different integrative pipelines of multi-omics data analysis have been suggested. Study reveals that transcriptomic data along with the know-how of the existing gene network provides a new way of exploring the genetic associations for candidate gene selection and functional module identification. This combination gives in-depth biological interpretation and enhanced statistical analysis. A dense subgraph based integrated approach was deployed in [39]Swarnkar et al. (2015), that combined expression analysis of genes with protein-protein interaction (PPI) networks to identify functionally enriched genes. The extracted dense subgraphs spoke for densely connected and highly co-expressed clusters from biological networks. [40]Gadaleta et al. (2017) integrated gene expression data and DNA methylation data for glioblastoma multiforme using Regression2Net and identified potential candidate genes showing significant over representation in different cancer related pathways. Eight most connected hub genes and candidate differentially expressed genes were identified in [41]Liu et al. (2018) by integrating multiple cohort profile data sets of B-cell lymphoma. The identified biomarkers were associated with significant biological communications and could be the therapeutic targets to mark the prognostic difference between the subtypes of B-cell lymphoma. A gene co-expression network based approach was suggested in [42]Randhawa and Acharya (2015) to identify few vital genes with prognostic significance in various stages of Oral Squamous Cell Carcinoma (OSCC). Predictive accuracy of 81% was observed for the selected genes when tested against the developed predictive model to classify the preliminary and advanced stage of OSCC tumor. Lei Chen et al. proposed an integrated computational method based on random walk with restart and shortest path algorithms to identify novel genes associated with oral squamous cell carcinoma which opens an opportunity to the in-depth study of OC [43]Chen et al. (2017). Data from different experiments, web sources, and PPIs were integrated in [44]Kumar et al. (2017) to identify candidate genes of OC. The 39 selected candidate genes from this approach were significantly enriched with different biological processes involved in OC. [45]Li et al. (2018) compared the mean expression levels of genes between diseased and control samples to identify differentially expressed and differentially variable genes for OSCC. In our previous work [46]Mahapatra et al. (2018), we had proposed an integrated pipeline of biological networks for the identification of candidate genes specific to a disease. In the first step of the pipeline, the constructed network of co-expressed genes in combination with a known PPI network resulted in an induced network that preserved the interplay and relationship of genes/proteins. Further, densely connected modules were extracted and hub genes in the modules were identified. The selected dense modules were observed to be biologically significant and statistically significant with high predictive ability. Nevertheless, the in-depth analysis of biological networks holds a great promise in the identification of such candidate biomarkers and plays a decisive role in deciphering the biological mechanisms behind complex diseases. However, these biological networks consist of different types of links among the interacting units, each link encoding for a type of interaction. As a result, these networks show a higher complexity describing inter-dependencies within and across various networks which may fluctuate. Subsequently, it is not able to record the information characterizing the complex biological system [47]Doncheva et al. (2018). Traditional study of genetic networks assumes that gene-gene associations are exclusive, unvarying and encapsulating all information between them. Simultaneously it ignores the presence of multiple types of gene-gene interactions (multiplexity), which may lead to misleading of results and generation of false positives [48]De Domenico et al. (2013); [49]Kanawati (2015). In this context, multiplex network provides a way to model and analyze such type of complex real world networks by integrating multiple types of interactions among different biological units. A multiplex framework was proposed to study the effect of social relationships on employee performance, which represented the entire employee social network as a multiplex structure [50]Cai et al. (2018). In comparison to the singleton network model, better reasoning of employee performance was found through the multiplex network model that integrated the various types of social relations. Finn et al. outlined the use of multilayer network in animal behavior analysis [51]Finn et al. (2019). The effectiveness of the multilayer network was analyzed to uncover eco-evolutionary dynamics of animal social behavior. Chierici et al. [52]Chierici et al. (2020) proposed an integrative network fusion pipeline, using Similarity Network Fusion (SNF) in a machine learning framework for biomarker identification in cancer. The exploitation of multilayer network analysis in biological networks is sparse; however, clearly relevant with the context. In a very recent study, a constraint-based PageRank algorithm: iRank was proposed to prioritize cancer genes for hepatocellular carcinoma (HCC) using a multiplex network of omics data [53]Shang and Liu (2020). Gene Regulatory Network (GRN) was utilized as a core-level network in the multiplex system and other omics level networks were used to define interactions across different levels (DNA level, RNA level). The identified cancer genes were observed with improved rank and better classifier performance by empirically integrating more levels of omics data. Wang et al. proposed a multiplex network based data integrative approach and identified 3 subtypes for glioblastoma multiforme (GBM) and breast invasive carcinoma (BIC) [54]Wang et al. (2016). The result obtained was highly consistent with the state-of-the-art techniques (Normalized Mutual Information>0.8). A multiple network based algorithm: EMDN, was proposed in [55]Ma et al. (2017) that integrates multiple networks of gene differential co-expression and gene differential co-methylation to identify the epigenetic modules. EMDN algorithm proceeds in three steps: (i) prioritizing the seed (ii) identification of module by seed expansion (iii) refinement of candidate modules. The spotted epigenetic modules were highly enriched by known biological pathways and can serve as biomarkers to predict breast cancer subtypes. The multi-network analysis of different types of omics data was highly effective in correlating and integrating multiple data levels and discovering complex disease patterns with multiple facets. To the best of our knowledge, the outlined studies of biomarker identification for OC potentially investigated the driving force of a gene on a single omics level for cancer progression. Gene-gene co-expression was adopted as the most common type of correlation to understand the network of genes. The comprehensive contribution of gene correlation at multiple biological levels was not considered. In the present study, we proposed a multiplex network based integration strategy for the identification of the hub genes for Oral Cancer (OC). The novelty in our approach lies in (1) Construction of three integrated networks for analyzing the comprehensive contribution of gene correlations across transcriptomic and epigenetic levels in OC. (2) Identification of few hub genes fulfilling the defined ensemble norm of multi-degree centrality and intramodular connectivity. Concretely, individual networks of gene co-expression (GCoExNW) and gene co-methylation (GCoMythNW) were constructed out of 120 samples of gene expression and DNA methylation data. Further, along with the multiplex network, an union and intersection networks were constructed by integrating the co-expression and co-methylation relation between genes. Hub genes were identified from the constructed integrated networks employing two different selection criteria defined in the following sections. Extracted hub gene sets from the constructed multiplex network were compared with the hub genes identified from the union network and intersection network. The workflow of our proposed approach is illustrated in [56]Fig. 1. The identified hub genes were evaluated in terms of: i) Statistical Competence ii) Differential Expression Analysis iii) Biological significance analysis. Figure 1. [57]Figure 1 [58]Open in a new tab Systematic illustration of pipeline for Hub Gene selection using integrated multiplex network based approach. Preprocessing step removes noise present in the data; Correlation NW construction step constructs two separate networks of gene co-expression and co-methylation; Integrated network Construction constructs three different integrated networks using the individual networks of gene co-expression and co-methylation; Hubgene identification step filters significant hub genes from three integrated networks; Finally the last step is statistical and biological validation of handpicked hub genes. 2. Materials and methods 2.1. Dataset Gene expression and DNA methylation data for head and neck squamous cell carcinoma (HNSCC) were considered for our study, comprising of 13 primary sites of head and neck region The data were downloaded from GDC data portal of TCGA website.[59]1 Specifically, 120 matched samples for level-3 data set of gene expression and DNA methylation were obtained, consisting of 100 tumor samples and 20 normal samples. The gene expression data were generated using RNA sequencing (RNA-seq) technology in TCGA project. For DNA methylation data, β intensity of the signal of probes was used for analysis. The β values are continuous valued, with a minimum value of 0 (unmethylated) and a maximum value of 1 (completely methylated). Annotation table of the Illumina Human-Methylation27 platform was used to map the probe IDs to corresponding gene symbols. For the gene symbols corresponding to multiple probe IDS, the β values of probes were averaged representing the β value of the gene. In the present study, the NCBI gene dataset[60]2 was utilized for biological relevance analysis of the distinguished genes. It includes 8933 cancer related genes and 316 oral cancer genes. STRING interactome database (Version 11.0)[61]3 was used to further validate the identified genes by mapping them to the respective PPI network of genes/proteins. 2.2. Weighted gene correlation network analysis Correlation among expression level and methylation intensity of genes was used to decipher their functionality. In this context, Weighted Gene Correlation Network Analysis (WGCNA) [62]Langfelder and Horvath (2008) in R, was used to construct a weighted network of correlation and to discover modules of highly correlated genes. WGCNA is a well established method in the field of biological science for the analysis of gene co-expression networks [63]Padi and Quackenbush (2015); [64]Nangraj et al. (2020). It computes Pearson correlation coefficient to measure the correlation among genes. Construction of scale free network strongly depends on soft threshold power β. Co-expression similarity is raised to the power of β to compute adjacency [65]Langfelder et al. (2007). The functions pickSoftThreshold and scaleFreePlot of the WGCNA package, assists respectively in selecting a proper β value to compute adjacency and for achieving a network with scale free topology. The best value of β is chosen when the curve showing the relation between scale free topology fit index [MATH: R2 :MATH] and β reaches a saturation point. 2.3. Multiplex network construction A multiplex network can be defined as assemblages of distinct networks. Individual network encodes for a different layer of the multiplex system. Each layer in the multiplex system contains the same set of entities (nodes) but interconnected by different types of links (edges). The layers are connected (and hence coupled) to each other via interlayer edges [66]Kanawati (2015); [67]Boccaletti et al. (2014). A multiplex network can be utilized for modeling different types of networks such as: * • Multi relational network: Multi relational network is a platform for modeling a complex system of networks that encodes different types of association among participating units. The link between nodes in a layer depicts a type of interaction. In this work we have considered co-expression and co-methylation similarity among genes as multiple types of relationships. * • Dynamic Network: where edges within a layer represent the network state at a particular timestamp. * • Attributed Network: where based on the similarity measure applied to a group of nodes, a similarity graph can be constructed which can be redefined as an additional layer. MuxViz is an open-source R software package with a graphical user interface, used specially for the visualization and analysis of multiplex networks [68]De Domenico et al. (2015). MuxViz consists of a collection of algorithms for measuring the “node centrality”, “interlayer correlation and reducibility”, “identifying community structure in the network” and “motif analysis” [69]De Domenico et al. (2013). We focused on “node centrality”, specifically degree centrality and eigenvector centrality of nodes to identify the hub genes in the integrated network. 2.4. Proposed model The steps of our proposed integrated multiplex network based approach of hub gene identification is explained below. Workflow of the proposed work is demonstrated in [70]Fig. 1. 2.4.1. Preprocessing The sample matched data of Gene expression and DNA methylation were inputted separately to the preprocessing step of our proposed approach. Firstly, genes with more than 30% of missing values across the samples were filtered and further, missing values in the data were replaced with the mean of the sample [71]Mahapatra et al. (2018). It brought about 18283 genes and 120 samples in gene expression data and 20523 genes and 120 samples in DNA methylation data. Hierarchical clustering was performed on both the data sets to remove outliers in the samples, producing 116 gene expression samples and 115 DNA methylation samples. Out of these 114 samples were common in both data set consisting of 100 tumor and 14 normal samples, which was further considered in the next step of network construction. Additionally, genes were filtered showing fold change (FC) less than 0.5 in their expression levels in normal and tumor samples, producing 10376 genes [72]Dembele and Kastner (2014); [73]Zhao et al. (2018). 2.4.2. Correlation network construction Preprocessed data of gene expression and DNA methylation with 10376 genes and 20523 genes respectively were separately inputted to WGCNA functions for the construction of gene co-expression network (GCoExNW) and gene co-methylation network (GCoMythNW). WGCNA defines a correlation similarity which is raised to power β to compute adjacency similarity, so that degree distribution fits a small world network. The power value of 6 was taken where the relation between [MATH: R2 :MATH] and β reaches saturation. Additionally, to obtain the similarity between genes at network topology level, a Topological Overlap Matrix (TOM) was computed from the adjacency similarity matrix. The resultant network meets the scale free independence criterion. Keeping a threshold of 0.1, the edges in the GCoEx network and GCoMyth network were scrutinized to facilitate the formation of integrated networks. BlockWiseModules function performs automatic network construction followed by identification of clusters of interconnected genes termed as modules of co-expressed genes in a block-wise manner. If the number of genes in the data set exceeds maxBlockSize, genes will be pre-clustered into blocks whose size should not exceed maxBlockSize and the function constructs network in a block wise manner. We have set maxBlockSize to be 30. WGCNA utilizes unsupervised hierarchical clustering for module detection. Later modules in blocks are merged based on mergeCutHeight threshold. mergeCutHeight is the dendrogram cut height for module merging. Observing the cluster dendrogram and experimenting with different values for mergeCutHeight, the cut height value .30 resulted in a good set of modules of correlated genes. The intramodular connectivity of the genes within the module was computed for all genes in both the networks of GCoExNW and GCoMythNW. Intramodular connectivity is a measure of connection strength or co-expression level of a given gene with respect to all the genes of a particular module. 2.4.3. Integrated network construction Three different integrated networks were constructed from the correlation similarity networks i.e. GCoExNW and GCoMythNW. The integrated networks formed were: * 1. Union Network: Considering the union of all the edges in the generated GCoExNW and GCoMythNW, an integrated Union Network was formed consisting of 6189 nodes and 304656 edges. The edges in the created Union Network convey either co-methylation, co-expression or co-methylation along with the co-expression relationships among genes. Alteration of genes can be at single or multiple biological levels. * 2. Intersection Network: An integrated Intersection Network was created by taking the common edges between the two singleton networks i.e. GCoExNW and GCoMythNW. This intersection network consisted of 259 nodes and 357 edges. Here, genes that alter at multiple biological levels were focused. i.e. genes in this Intersection Network not only co-expressed but co-methylated also. Genes in this Intersection Network co-alter in the epigenetic as well as transcriptomic levels. * 3. Multiplex Network: Multiplex network provides a platform for modeling of the complex system of networks where nodes in the network are connected through the variety of links. It is represented as a network of networks, each encoding different types of association among participating units/nodes [74]Kivela et al. (2014). Each network encodes for different layers in a multiplex system with coupling links between layers. More specifically, in the present study, we exploited the multiplex networks to model the association between functionally similar genes at multiple biological levels. Different types of functional dependencies exist among the essential molecular components of the cell across diverse biological levels [75]Elling and Deng (2009). Because of this interdependency, the disruption caused due to aberrant methylation of a single gene can be quickly propagated to the PPI network. It leads to abnormal functioning of tissue or cell which culminate in disease. In this work, the co-expression and the co-methylation relationship between genes was utilized to construct the multiplex system. MuxViz software [76]De Domenico et al. (2015) was used for the construction and visualization of the multiplex network. The multiplex network was constructed as an edge colored multigraph consisting of two layers by taking into consideration the co-expression and co-methylation similarity among genes, where each layer contains the same sets of nodes. The two layers of the multiplex network were namely GCoEx (representing gene co-expression network) and GCoMyth (representing gene co-methylation network). 1068 number of genes were found to be common in the constructed GCoEXNW and GCoMythNW. This set of 1068 genes and their involved edges in the respective GCoExNW and GCoMythNW were considered for the construction of a multiplex network. An aggregate network was also generated, where data from different interactions or layers is added or somehow packed up into a monoplex structure. However, the aggregated network may lead to loss of information because different types of links were treated indifferently [77]Kanawati (2015). We cross-examined the 1068 genes, for the presence of disease-related genes, which resulted in 704 cancer related genes and 29 oral cancer genes. The multiplex network formed is shown in [78]Fig. 2. For clarity in visualization, only the edges involving oral cancer genes are shown in the figure. Multiplex network consolidates the analysis of different biological networks and simultaneously examines the importance of individual interacting unit with and across networks. To quantify the importance of a node in the multiplex network, the diagnostic measures adopted were degree centrality (multi-degree centrality) and eigenvector centrality. Multi-degree centrality of a node in the multiplex network is the number of edges of any type that are incident on it, i.e. the participation of the node within and across different layers. Such individual units have a larger impact on whole network dynamics. Eigenvector centrality is the measure of the influence of a node in a network. It assigns each node a centrality score that depends on both the number and strength of connections [79]Negre et al. (2018). Node with high eigenvector centrality (score>0.5) indicates that the node is connected to many nodes who themselves have high scores. Figure 2. [80]Figure 2 [81]Open in a new tab Multiplex Network formed from GCoExNW and GCoMyth. 2.4.4. Hub gene identification A highly connected gene in a network is termed as a hub gene. In our earlier report [82]Mahapatra et al. (2018), a gene was marked as a hub if it is connected to 10%, 20% or 30% of other genes in the whole network. However, going through the cited approach, an adequate number of hub genes were not identified in this work. Two different criteria were set for screening the hub genes in the three integrated networks. In network theory, a node is said to be a hub if its connectivity is larger than the average connectivity of the whole network [83]Das et al. (2017); [84]Barabasi and Oltvai (2004). Fixing this as the first criteria, 1710 genes in Union Network, 78 genes in Intersection network and 527 genes in the Multiplex Network were marked as hub genes, showing high connection degree in their respective integrated network. Furthermore, in criteria 2, we introduced an ensemble of gene ranking and intramodular connectivity to select the hub genes in the integrated networks. Three sets of hub genes were identified by taking the top 20% of genes, ranked according to their connectivity (degree centrality) in the respective integrated networks. Moreover, the marked hub gene has to meet the normalized value of its intramodular connectivity >0.20 in their respective GCoExNW and GCoMythNW as well. Going through the above steps, 721 genes in the Union Network, 40 genes in the Intersection Network and 233 genes in the Multiplex network were figured out as hub genes. Criteria 2 of hub gene identification resulted in genes that were not only highly connected in the integrated structure but are also well connected in their neighborhood. 2.4.5. Validation of hub genes * (A) Statistical Validation Matthews coefficient correlation (MCC) is a widely accepted parameter in various machine learning platforms to quantify the quality of a binary classifier. It represents the correlation coefficient between observed and predicted binary classifications, with a value ranging between −1 to +1. MCC value of +1 signifies perfect prediction, whereas −1 indicates a mismatch between prediction and observation. A coefficient value of 0 designates for random prediction. It is known to be a balanced measure as it works well even if the classes are of varying sizes. In this work predictive ability of the hand-picked hub genes from Union, Intersection and Multiplex networks were calculated and compared in terms of MCC scores, as the OC data set we had considered for analysis contains an imbalanced ratio between the number of disease and normal samples. Along with MCC, other measures such as Sensitivity, Specificity, Predictive Accuracy, Precision and F-measure were adopted for statistical analysis in comparison to the known true classes [85]Mahapatra et al. (2018). * (B) Differential Expression (DE) Analysis The objective of DE analysis aims at identifying genes showing quantitative changes in expression levels between tumor and normal samples. Comparison of sample variance between two groups was performed using t-test [86]Pan (2002). To control for multiple testing, we used Benjamini and Hochberg's method to adjust for p-values of DE tests so that the false discovery rate padj(FDR)<0.05. Finally, logarithmic Fold Change values for the genes were computed to identify the differentially expressed genes. The result was plotted using R package ggplot. * (C) Biological Validation The following approaches were employed for biologically validating the identified hub genes. + (i) Correspondence between the hub genes and the disease of interest: gene dataset from NCBI was utilized for biological relevance analysis of the distinguished genes with respect to Oral Cancer. The data set includes 8933 cancer related genes and 316 oral cancer genes. The identified hub gene sets were compared in terms of the presence of oral cancer genes and cancer related genes. + (ii) Functional Association Analysis: o (a) Association of the distinguished hub genes with the seed genes: tumorigenesis is a complex multi-step process involving mutation in genes due to dysregulated biological processes and pathways [87]Levine and Steffen (2001). The functionality of a gene can not be studied completely as an isolated component. Because genes always interact with each other to be expressed. A group of genes can be marked as functionally analogous by mapping them to a known network. Study suggests, direct protein-protein interactions are the strongest indicator of functional association among genes and this information can be exploited for the identification of candidate genes [88]Vinayagam et al. (2014). In this study, a set of 28 universally accepted OC genes that play a crucial role in Oral Cancer progression were considered as seed genes (Supplementary Table S2). The marked hub genes along with the seed genes were mapped to the PPI network to unveil the functionality of hub genes and strengthen the study of disease pathogenesis. STRING database [89]Szklarczyk et al. (2015) and Cytoscape [90]Shannon et al. (2003) were used to construct and visualize the connectivity of the seed genes and hub genes in the PPI network. Cytoscape is an open-source software used for the analysis and visualization of biological networks. stringApp of Cytoscape plug-in was employed for the analysis. o (b) Gene Ontology (GO) functional enrichment analysis of putative hub genes: DAVID [91]Sherman et al. (2009) was adopted to uncover and further analyze the biological meaning behind the identified optimal hub genes. DAVID is a tool for functionally classifying genes using a set of functional annotation tools and analyzing the biological role of genes. It is widely used to perform GO and KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway enrichment analyses of genes. 3. Results The Hub Gene Identification step of our proposed work gave rise to six different sets of hub genes selected from the three integrated networks: Union network, Integration network and Multiplex network. The different sets of hub genes, marked by deploying criteria 1 and criteria 2 of our approach are given in [92]Table 1. Table 1. Hub genes marked in the three integrated networks by deploying selection parameters C1 and C2. Criteria 1 (C1) __________________________________________________________________ Criteria 2 (C2) __________________________________________________________________ Hub NoG CR OC Hub NoG CR OC C1-HubUn 1710 706(41%) 25(2%) C2-HubUn 721 279(38%) 10(2%) C1-HubIn 78 33(42%) 0(0%) C2-HubIn 40 14(35%) 0(0%) C1-HubMN 527 218(42%) 8(2%) C2-HubMN 233 97(42%) 8(4%) [93]Open in a new tab C1-HubUn, C1-HubIn, C1-HubMN are three hub gene sets identified from Union Network, Intersection Network and Multiplex Network using criteria 1; C2-HubUn, C2-HubIn, C2-HubMN are three hub gene sets identified from Union Network, Intersection Network and Multiplex Network using criteria 2; NoG Number of genes, CR Number of Cancer related genes, OC Number of Oral cancer genes. Before moving further we did a priori validation of six hub gene sets by investigating their correspondence with the disease. The spotted hub gene sets were examined for the presence of cancer related (CR) genes and oral cancer (OC) genes as described in the section [94]2.4.5 (C) “Biological validation - Correspondence between the hub genes and the disease of interest”. The percentage of CR and OC genes found in the hub gene sets is summarized in [95]Table 1. All the designated hub gene sets were observed with a good proportion of co-expressed cancer candidate genes. This witness the fact that disease related genes have high interacting partners compared to other genes [96]Wu et al. (2008). The Criteria 2 of our approach was narrower than that of Criteria 1. In Criteria 2, genes were mined based on an ensemble of gene ranking and intramodular connectivity, which results in a reduced fraction of CR genes in the respective hub gene sets. However, the hub gene sets namely C1-HubMN and C2-HubMN identified from the constructed multiplex network were showing a highest fraction of CR genes which is consistent both for Criteria 1 and Criteria 2. Although small, a fraction of OC genes were also spotted in the hub gene sets extracted from the integrated Union network and Multiplex network, whereas no OC genes were spotted in the intersection network. A set of 74 genes, named as C1-CmnHub found to be common in three hub gene sets extracted using criteria 1 and 38 genes were found common in the six hub gene sets extracted from the integrated networks. Out of them genes ACDY2, AKAP6, ASPM, F8, FCGR3A, FOXO4, HSDL2, LILRB1, ME3, MUC21, NDRG2, PIK3CG, TNNC1, SIGLEC9 were examined as related to cancer. Additionally, eight genes: CCL18, CCL4, TERT, RECK, LEPR, CDC20, LTA, HLADQA which are known to be oral cancer were spotted both in the hubs C1-HubMN and C2-HubMN. An intersection chart showing the common genes spotted in different hub gene sets is illustrated in [97]Figs. 3a and 3b. These genes were highly connected as well as showing a significant correlation between expression and methylation levels in the multiplex network which may have a regulatory effect on related genes. The identified hub gene sets were further validated in terms of their prediction accuracy in the succeeding section. Figure 3. [98]Figure 3 [99]Open in a new tab Venn diagram showing different hub gene sets and their intersection. (a) Intersection of hub gene sets C1-HubUn, C1-HubIn, C1-HubMN extracted using criteria 1 shows 74 common genes: C1-CmnHub. (b) Intersection of hub gene sets C2-HubUn, C2-HubIn, C2-HubMN extracted using criteria 1 and C1-CmnHub shows 38 number of genes. 3.1. Classifier performance Hub genes in a gene network are high degree nodes having more interacting partners than non-hub nodes. These genes direct the whole network structure and play a very important role in multiple regulatory systems [100]He and Zhang (2006). Three different classifiers KNN (k-nearest neighbors, with k=3, based on the different repeated experimental trials), Random Forest (RF) and Support Vector Machine (SVM) had been applied for evaluating the identified sets of hub genes to determine how good or bad they were in predicting the disease state of a test data. To avoid one time occasionality, the class outcome of the hub genes as predictor entity were examined using 10 fold cross validation. All the sets of hub genes showed a low variance in MCC for all three classifiers, which is summarized in [101]Table 2. Each set of hub genes had achieved a standard measure of MCC value of >= 0.5. However, the hub genes selected from the integrated multiplex networks were showing the highest value of MCC. All the hub gene sets achieved a comparable prediction accuracy between 92% and 96% and showed good performing scores of sensitivity, specificity, precision and F-measure with respect to various classes for different classifiers. However, among all, hubs C1-HubMN and C2-HubMN score highest in terms of sensitivity and specificity. In the medical context, sensitivity and specificity are the key measures for evaluating a diagnostic model. It is noteworthy to mention here that the genes marked as the hub in the multiplex network were co-related across transcriptomic as well as epigenetic levels and highly connected in the constructed multiplex system. Results revealed that the integrated multiplex network approach effectively identifies hub genes with improved prediction accuracy. Table 2. Classifier performance of Hub genes marked in the integrated networks. Hub NoG Cl 3NN __________________________________________________________________ RF __________________________________________________________________ SVM __________________________________________________________________ Sn Sp Pr FM MCC Sn Sp Pr FM MCC Sn Sp Pr FM MCC C1-HubUn 1710 N 0.90 0.95 0.95 0.92 0.89 0.89 0.94 0.94 0.91 0.90 0.92 0.98 0.95 0.94 0.91 T 0.95 0.90 0.93 0.94 0.89 0.94 0.89 0.92 0.93 0.90 0.98 0.92 0.97 0.98 0.91 C1-HubIn 78 N 0.82 0.99 0.97 0.89 0.86 0.77 0.89 0.96 0.87 0.84 0.82 0.99 0.97 0.89 0.86 T 0.99 0.82 0.93 0.96 0.86 0.89 0.77 0.92 0.96 0.84 0.99 0.82 0.93 0.96 0.86 C1-HubMN 527 N 0.85 0.99 0.97 0.90 0.96 0.87 1.00 1.00 0.93 0.93 0.92 1.00 1.00 0.96 0.95 T 0.99 0.85 0.94 0.97 0.96 1.00 0.88 0.95 0.98 0.93 1.00 0.92 0.97 0.99 0.95 C2-HubUn 721 N 0.92 0.97 0.92 0.92 0.89 0.87 1.00 1.00 0.93 0.91 0.82 1.00 1.00 0.90 0.88 T 0.97 0.92 0.97 0.97 0.89 1.00 0.87 0.95 0.98 0.91 1.00 0.82 0.94 0.97 0.88 C2-HubIn 40 N 0.77 0.84 0.95 0.87 0.84 0.77 0.99 0.97 0.86 0.82 0.69 1.00 1.00 0.82 0.79 T 0.85 0.77 0.92 0.96 0.84 0.99 0.77 0.92 0.95 0.82 1.00 0.69 0.89 0.94 0.79 C2-HubMN 233 N 0.82 0.99 0.97 0.89 0.98 0.80 1.00 1.00 0.89 0.93 0.95 0.99 0.97 0.96 0.95 T 0.99 0.82 0.93 0.96 0.98 1.00 0.80 0.93 0.96 0.93 0.99 0.95 0.98 0.99 0.95 

 Cmn-Hub 46 N 0.96 0.98 0.98 0.91 0.98 0.90 0.95 0.97 0.97 0.95 0.92 0.99 1.00 0.96 0.96 T 0.99 0.95 0.91 0.98 0.98 0.85 0.90 0.99 o.99 0.95 0.99 0.92 0.97 0.99 0.96 [102]Open in a new tab The hub gene sets marked in bold showing significant predictive accuracy measures. C1-HubUn, C1-HubIn, C1-HubMN are three hub gene sets identified from Union Network, Intersection Network and Multiplex Network using criteria 1; C2-HubUn, C2-HubIn, C2-HubMN are three hub gene sets identified from Union Network, Intersection Network and Multiplex Network using criteria 2; Cmn-Hub is the hub gene set consisting of 46 common hub genes from all six integrated networks; NoG Number of genes, Cl class label, 3NN 3 nearest neighbors, RF random forest, SVM support vector machine, Sn sensitivity, Sp specificity, Pr precision, FM F-measure, MCC Matthews correlation coefficient; N normal and T tumor Oral cancer sample. The objective of this work was to identify few driver genes which are crucial for understanding the genetic etiology behind a complex disease and potential candidate for identifying drug targets for OC. As all the hub gene sets showed competitive performance as a predictor variable, the proposal of identifying hub genes boiled down to forming another hub gene set: Cmn-Hub, consisting of 46 genes. These 46 hub genes comprised of 38 common genes (found in all six hub gene sets) and 8 OC genes (spotted in both the hubs C1-HubMN and C2-HubMN). To our surprise, these 46 hub genes were outperforming with average sensitivity value of 93% and specificity value of 94% when applied to the classifiers. The observed prediction accuracy of hub gene set Cmn-Hub was 96% with an average MCC value of 0.96. ROC curve showing the comparative performance of Cmn-Hub, C1-HunMN, C2-HubMN is illustrated in [103]Fig. 4. The common hub gene set with a smaller size (46 genes) was observed with significantly better performance as a classifier in comparison to other identified hubs. Additionally, the eigenvector centrality of hub genes in the multiplex network was observed to be greater than 0.5, indicating the strong connectivity of these hub genes with other hub genes in the network. This suggests that these genes are highly influential in the biological network and contribute most as a predictor variable. The selected hub genes were further validated in our next step in terms of their connectivity and functional association with the known set of genes for oral cancer. The list of hub genes is given as Supplementary Table S1. Further, the differential expression analysis and biological importance of these hand picked 46 number of genes is discussed in succeeding sections. Figure 4. [104]Figure 4 [105]Open in a new tab Comparative performance of the identified Hub gene sets (Cmn-Hub, C1-HubMN, C2-HubMN) for three different classifiers: k-nearest neighbor for k = 3 (3NN), random forest (RF) and support vector machine (SVM). 3.2. Functional association analysis 3.2.1. Association of the selected hub genes with the seed genes To further validate the filtered gene set, we looked for the interconnection and functional association of our studied gene sets with the 28 seed genes (Supplementary Table S2) taken based on its association in the progression of OC or Head and Neck Squamous Carcinoma (HNSCC). Identified 46 number of hub genes along with the 28 seed oral cancer genes were mapped to PPI network using stringApp of Cytoscape which maps the genes to STRING database of interacting proteins [106]Szklarczyk et al. (2015). STRING includes both physical interactions from experimental data and functional associations from curated pathways, automatic text mining, and prediction methods. A minimum interaction score greater than 0.5 (optimum confidence) was used and only query proteins were displayed. The co-expressed hub genes identified in the integrated multiplex network were also relatively closer in the PPI network, which is shown in [107]Fig. 5. Genes viz. FOXO4, NDGR2 which are known to get involved in cancer were marked as strongly connected with the widely known cancer gene TP53. PIK3CG, CCL4 were strongly connected with HRAS, PIK3CA, NOTCH1, JUN which are known specific to OC. CDC20, RECK, NDRG2 were connected with seed gene PTEN. Hub gene TERT was highly co-related with PTEN, TP53, HRAS, PIK3CA which are benchmark genes for OC. This shows, genes that are co-altered at multiple levels of expression, also are more closer in the known biological network. These genes can be considered significant for further study of prognostic influence in oral cancer. To more accurately select the hub genes, a biological significance analysis of these hub genes had been done in our next step. Figure 5. [108]Figure 5 [109]Open in a new tab Association of 46 significant hub genes and 28 seed genes in PPI network. 3.2.2. Identification of deferentially expressed hub genes (DEHs) DE analysis was carried out to identify the hub genes showing significant expression changes between tumor and normal samples. A filtering threshold of padj(FDR)<0.05 and Log2Fold-change>2 was set to identify the DEHs. Out of 46 identified hub genes, 28 genes were found to be significantly deferentially expressed, which is shown in [110]Fig. 6a. The list of DEHs is given in Supplementary Table S3. [111]Fig. 6b demonstrates the 16 up-regulated DEHs and 12 down-regulated DEHs. The most significant up-regulated genes were CDC20, CCL4, LCP2, ASPM, FCGR3A, whereas genes NDGR2, MUC21, PERM1, TNNC1, HSPB7 were observed to be significantly down-regulated. Among them, genes CCL4, CDC20 are Oral cancer genes and ASPM, FCGR3A, TNNC1 are related to cancer. Figure 6. [112]Figure 6 [113]Open in a new tab Differential expression analysis of hub genes. (a) A Volcano plot: Genes represented by blue color dots are showing significantly altered expression (FDR<0.05 and LogFC>2) (b) 16 up-regulated and 12 down-regulated DEHs with their p-values. 3.2.3. Functional association analysis of the selected hub genes To further unveil the biological implications behind the selected genes, Gene Ontology (GO) functional and pathway enrichment analysis of hand picked hub genes along with the considered seed genes were done using DAVID [114]Sherman et al. (2009). A count>=2 and EASE>0.1 were set as cut off parameters. EASE is a GO enrichment analysis score based on the Fisher Exact Test [115]Hosack et al. (2003). A threshold of p<0.05 was used to indicate significant functionality and pathway categories. GO functional enrichment analysis was carried out based on three functional groups: (1) biological processes, (2) cellular components and (3) Molecular functions. Enriched list of the biological processes involving the hub genes is summarized in Supplementary Table S4. Result revealed that the identified hub genes were significantly enriched in different biological processes including immune response, cell proliferation, apoptosis, cell cycle arrest, signal transduction, Ras protein signal transduction along with the benchmark OC genes NOTCH1, JUN and HRAS. JUN is responsible for OC metastasis whereas deviated expression of NOTCH1 is responsible for cell proliferation. Proper functioning of the immune system requires a balanced interaction between immune and non-immune cells. Chemokines regulate leukocyte migration and trafficking to their destined tissues [116]Chakraborty et al. (2018). Encoded proteins from hub genes CCL4, CCL18 has chemokinetic and inflammatory functions. Aberrant expression of these genes may disrupt the immune system and lead to resistant tumor formation. The presence of cancer related gene PIK3CG in many biological processes regulating the immune system gives an indication to the researchers for further analysis to uncover the genetic etiology behind the disease. The rate of growth in normal and cancer tissues is determined by a balance between cell proliferation and apoptosis. As cancer cell continues to grow uncontrollably, biological activities are also disturbed in cell cycle regulation. Somatic mutation in TP53 plays a significant role in regulating cell cycle, cellular differentiation, DNA repair, apoptosis [117]Nanda et al. (2011); [118]Polyak et al. (1994). Gene CDC20 which was observed to be significantly up-regulated is significantly enriched in the biological process related to cell proliferation. Hub gene CDC20 was observed to be significantly up-regulated in OC. This implies that deregulated expression of CDC20 can mediate a series of downstream genes to drive cellular proliferation and activate cell death apoptosis. Genes CASQ2, MYH6, MYH7 which are not known to be cancer related, were found to be highly enriched in GO terms related to muscle contraction. Muscle contraction has been reported as a key biological process in OC, which witnesses the presence of myofibroblasts in tumor stroma of patients with lymph node involvement [119]Shaikh et al. (2019). A series of genes including LILRB1, FOXO4 (cancer related) were found to be predominantly enriched in cell cycle arrest, immune response, regulation of the apoptotic process. Apoptosis is one of the key processes in oral cancer progression. Additionally, gene PIK3CG, TERT, CCL4, CCL18 were found to be highly enriched with biological processes such as neutrophil chemotaxis and angiogenesis. Angiogenesis is one of the crucial factors in cancer progression [120]Nishida et al. (2006). The presence of neutrophils at the primary tumor site activates a broad spectrum of functions leading to cancer. It generates the secretion of proangiogenic factors as well as the proteolytic activation of proangiogenic factors which causes angiogenesis [121]Granot and Jablonska (2015). [122]Fig. 7 shows the most significant biological processes observed in the identified hub genes. Figure 7. [123]Figure 7 [124]Open in a new tab Most significant biological processes (P-value<0.05) observed in the identified hub genes. The enumerated list of cellular components (CC) and molecular functions (MF) involved in the putative list of hub genes are summarized in Supplementary Tables S5 and S6 respectively. Our list of hub genes was found to be mostly enriched in the cytosol (CC), phosphatidylinositol 3-kinase complex (CC), plasma membrane (CC), protein complex (CC), muscle myosin complex (CC) along with the verified list of seed genes such as TP53, NOTCH1, JUN, PIK3CA, HRAS. Moreover, we also noted protein binding (MF), enzyme binding (MF), transcription factor binding (MF), kinase activity (MF), phosphatidylinositol-4,5-bisphosphate 3-kinase activity (MF) in the enriched list of molecular functions entailing genes PIK3CG, CDC20, CCL4, ADCY2 and TERT. Results suggest that there is a direct or indirect association between the functional activity and localization of the proteins encoded by the scrutinized hub genes and OC. DAVID also produce the KEGG pathway enrichment analysis for the 46 identified hub genes with respect to the widely accepted seed genes. The highlighted pathways were p53 signaling pathway, Chronic myeloid leukemia, small cell lung cancer, Pathways in cancer, ErbB signaling pathway, PI3K-Akt signaling pathway and focal adhesion. In general, these pathways are responsible for cell cycle regulation, signal transduction and many cancer mediated process. Pathways in cancer, p53 signaling pathway, small cell lung cancer are directly attributed to OC. p53 protein regulates cell cycle and functions and acts as a tumor suppressor. Disrupted regulation of p53 signaling elements deregulates numerous tumor suppressing process including DNA repair, cell cycle arrest and apoptosis [125]Stegh (2012). p53 gene is one of the most frequently mutated genes in oral cancer [126]Rowley et al. (1998). Isoforms of oncoprotein BCR-ABL activates tyrosine kinase and cause chronic myeloid leukemia. Deregulation of these isoforms causes altered cellular adhesion, activation of mitogenic signaling pathways, inhibition of apoptosis process [127]Melo and Deininger (2004). Hub genes PIK3CG and PIK3R5 were found as markers in chronic myeloid leukemia and play an important role in cell growth, proliferation, differentiation, motility, survival and intracellular trafficking. Mutation of proto-oncogene PIK3CA is a key factor in cell proliferation and survival through activation of the PI3K/Akt signaling pathway. PIK3CA along with PTEN plays a pivotal role in the PI3K-Akt signaling pathway and hence accepted as a biomarker for oral carcinogenesis [128]Chang et al. (2013). Hub genes PIK3CG and PIK3R5 along with the seed gene PIK3CA were observed in many regulating pathways in OC. Overexpression of PIK3CG in tumor cells suppresses cell proliferation. Loss of function of this protein contributes to tumorigenesis process [129]Semba et al. (2002). Focal adhesion kinase is a cytoplasmic tyrosine kinase identified as a key mediator of signaling by integrins, a major family of cell surface receptors and other receptors in both normal and tumor cells [130]Zhao and Guan (2009). It is known to be an important mediator of cell growth, cell proliferation, cell survival and cell migration, all of which are often dysfunctional in cancer cells. The pathway enrichment analysis unveils that the hub genes marked by our approach are related to the development of cancer. The highly significant pathways observed in this analysis are demonstrated in [131]Fig. 8. A detailed list of enriched pathways is given in Supplementary Table S7. Figure 8. [132]Figure 8 [133]Open in a new tab Highly enriched list of pathways (P-value<0.05 observed in the identified hub genes. 4. Discussion Oral cancer ranked among the top three cancers and counts about thirty percent of prevalent cases in India. Early diagnosis and prevention of OC are prioritized at the global health forums [134]Kulkarni (2013). Genomic studies show that hub genes in a genetic network plays an influential role in regulating the whole network structure. Thus, identification of such genes related to specific cancer types can help in reducing the gap in OC prognosis. In this study, a biological network based integrative approach was proposed for the identification of hub genes related to OC. Data from both genomic and epigenomic levels were employed for the study. Gene co-expression network and gene co-methylation network were constructed using WGCNA. In the next step three integrated networks: Union Network, Integration Network and Multiplex network were generated from the constructed correlation networks. Multiplex network was constructed where isolated networks of gene co-expression and co-methylation encode for different layers in the multiplex system with coupling edges in between. Hence interplay among the interacting units across layers is preserved. Degree centrality (multi-degree centrality) and eigenvector centrality of genes were computed to find the connectivity of genes in the multiplex network. Based on the connectivity of genes in the integrated networks and the correlation networks, criteria 1 and criteria 2 were fixed for the identification of hub genes from the integrated network which results in six sets of hub genes. Among all, 38 hub genes were found to be common in all the hub gene sets. To examine the performance of these genes as a predictor variable, the classifier accuracy of these gene sets was determined on the classifiers KNN, RF and SVM. The classifier performance indicates that all the hub gene sets attained good values for MCC. However, hub gene sets constructed from Multiplex network (C1-HubMN and C2-HubMN) achieved significantly high scores for MCC, Sensitivity and Specificity. This corroborates the ability of the Multiplex network in exploiting different types of interactions among biological entities. We further identified 46 genes as our putative hub genes, comprising of 38 common genes (found common in all six hub gene sets) and 8 OC genes (spotted in both the hubs C1-HubMN and C2-HubMN). To our surprise, the classifier performance of Cmn-Hub with 46 genes was significantly high with a prediction accuracy of 96% in comparison to other hub gene sets whose size varies between 40 - 1710. We also investigated the interconnection and functional association of our studied hub genes (46 number) with the 28 seed genes which are known to get associated with OC, by mapping them to the PPI network using stringApp plugin of Cytoscape. Multiplex centrality analysis considers the participation of a gene at multiple levels of the biological system. These genes are significantly influential both in transcriptomic and epigenetic level with eigenvector centrality >0.5. The differential expression analysis identified 28 hub genes showing a significant difference in expression levels between two experimental groups. Among them, genes CDC20, CCL4, LCP2, ASPM, FCGR3A were observed to be most significantly up-regulated and genes NDGR2, MUC21, PERM1, TNNC1, HSPB7 were significantly down-regulated. The DAVID biological enrichment analysis conveyed that the marked set of genes including PIK3CG, PIk3R5, CCL4, CDC20, TERT, LTA, MYH7 significantly takes part in various biological processes including immune response, cell proliferation, apoptosis, cell cycle arrest, signal transduction, muscle contraction, neutrophil chemotaxis, angiogenesis and pathways such as p53 signaling pathway, Chronic myeloid leukemia, Pathways in cancer, ErbB signaling pathway, PI3K-Akt signaling pathway along with the seed genes. More importantly, this approach focuses on identifying the hub genes that co-express at multiple levels including both genetic and epigenetic. The proposed multi-layered data integration approach provides the researchers with new insight for detailed exploration of tumorigenesis and progression in OC by systematically enhancing the study of gene action. 5. Conclusions Diagnosis of OC at its early stage results in better therapeutic implications. The study reveals that genes encoded for hub proteins are enriched for disease genes and preferentially targeted by pathogens. Thus identification of such hub genes may give a further clue to the researchers for a thorough understanding of the disease pathogenesis. This work aimed at identifying a set of hub genes crucial for OC using an integrated framework of multi-omics data. Three integrated networks: Union Network, Integration Network and Multiplex Network were constructed using expression level as well as methylation intensity of genes. A set of 46 genes were identified as hub genes which are validated in terms of their statistical competence and biological importance. We spotted five genes PIK3CG, PIK3R5, MYH7, CDC20 and CCL4, which were observed to be predominantly enriched in various biological processes and pathways crucial for OC. This work suggests an alternative research direction for biological network analysis. The identified genes in our approach can be a potential prognostic biomarker in the process of tumorigenesis and underlying molecular events in OC. Our findings enlighten the researchers for a more extensive investigation of OC. Declarations Author contribution statement S. Mahapatra: Conceived and designed the experiments; Performed the experiments; Analyzed and interpreted the data; Wrote the paper. R. Bhuyan, J. Das: Analyzed and interpreted the data. T. Swarnkar: Conceived and designed the experiments; Wrote the paper. Funding statement This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors. Data availability statement Data associated with this study can be found at The Cancer Genome Atlas (TCGA). Declaration of interests statement The authors declare no conflict of interest. Additional information Supplementary content related to this article has been published online at [135]https://doi.org/10.1016/j.heliyon.2021.e07418. No additional information is available for this paper. Footnotes ^1 [136]https://portal.gdc.cancer.gov/. ^2 [137]http://www.ncbi.nlm.nih.gov/geo/. ^3 [138]https://string-db.org/. Supplementary material The following Supplementary material is associated with this article: Supplementary-S1.xlsx List of identified 46 Hub genes (Cmn-Hub). [139]mmc1.xlsx^ (8.8KB, xlsx) Supplementary-S2.xlsx List of 28 seed genes. [140]mmc2.xlsx^ (7.5KB, xlsx) Supplementary-S3.xlsx List of DEHs identified in the Hub gene set: Cmn-Hub. [141]mmc3.xlsx^ (9.4KB, xlsx) Supplementary-S4.xlsx Significantly enriched list of biological processes of selected 46 Hub Genes. [142]mmc4.xlsx^ (12.5KB, xlsx) Supplementary-S5.xlsx Significantly enriched list of cellular components involved in selected 46 Hub Genes. [143]mmc5.xlsx^ (8.1KB, xlsx) Supplementary-S6.xlsx Significantly enriched list of molecular functions involved in selected 46 Hub Genes. [144]mmc6.xlsx^ (8.2KB, xlsx) Supplementary-S7.xlsx Significantly enriched pathway terms involved in selected 46 Hub Genes. [145]mmc7.xlsx^ (11.5KB, xlsx) References