Abstract

Background:

   High-throughput data generation is developing in the cancer area and
   offers a better opportunity of understanding molecular pathways
   involved in the progression of tumors. Meta-analysis of gene expression
   based on data integration makes it possible to determine changes in
   gene expression with more accuracy. This approach and downstream
   analysis were utilized for colorectal cancer to identify promising
   biomarkers and drug targets.

Materials and Methods:

   First, a systematic search was performed in the Gene Expression Omnibus
   (GEO) database. Meta-analysis was used to obtain differentially
   expressed (DE) genes from the NetworkAnalyst database. Moreover,
   microRNAs (miRNAs), long non-coding RNAs (lncRNAs), and transcription
   factors (TFs) enriched with DE genes were determined by the Enrichr
   database. The integrated DE genes-miRNAs-lncRNAs-TFs network was
   constructed and analyzed using Cytoscape software. Then, the downstream
   analyses of hub genes were performed.

Results:

   The primary candidate genes, HNF4A, FLI1, MITF, E2F4, SPI1, FOXA2, VDR,
   and SMAD3, which are overlapped nodes with the highest degree,
   betweenness, and closeness centrality parameters in the top 30 nodes
   were selected. Also, the hsa-miR-26b-5p, hsa-miR-16-5p, hsa-miR-124-3p,
   and hsa-miR-92a-3p were introduced as key non-coding RNAs with the
   highest degree and betweenness centralities. For the determination of
   the clinical significance of hub genes, the mutation, differential
   expression, and survival analysis were assayed. Also, the drugs related
   to candidate hub genes were determined. The functional analysis
   revealed pathways significantly related to cancer progression.

Conclusion:

   Employing systems biology approaches with holistic insight can identify
   essential genes and their regulation as possibilities for further
   experimental testing.

   Keywords: Gene regulatory network, meta-analysis, microarray, protein
   interaction network

INTRODUCTION

   Colorectal cancer (CRC) is the third most commonly diagnosed cancer
   worldwide and the second and fourth leading cause of cancer death for
   men and women, respectively.[[30]1] Consequently, it signifies a
   serious health issue, and early detection is imperative to facilitate
   prompt medical intervention. Approximately 25% of CRCs are diagnosed at
   the metastatic stage when treatment is less likely to be effective, and
   survival is low.[[31]2]

   Pathologically, molecular biomarkers, along with the Tumor, Nodes,
   Metastases (TNM) staging system and histological features, are critical
   in diagnosing and treating CRC.[[32]3] In particular, KRAS, BRAF, and
   microsatellite instability (MSI) molecular markers are currently
   considered CRC prognostic factors by the American Joint Committee on
   Cancer (AJCC).[[33]4] A significant increase in overall survival in
   patients with cancer has been documented over the past 20 years. This
   is attributed to identifying novel molecular pathways and using
   biological agent targeting for diagnosis and treatment.[[34]5]

   New biomarkers can be uncovered using traditional molecular biology
   techniques. Combining distinct data sources from different approaches
   provides less biased and more systematic discovery of biomarkers in
   clinical contexts. Bioinformatics and computational research advances
   have paved the way to data processing and understanding biomarker
   discovery. Applied bioinformatics research leads attempts to
   disseminate biomarker study results and develops advanced
   infrastructures empowering researchers to go beyond “single-marker”
   analysis.[[35]6]

   High-throughput data analysis approaches provide a robust platform for
   analyzing biological pathogenesis.[[36]7] The survey of protein
   interaction maps provides the best picture of how each protein
   contributed to the molecular function. As a result of the rapid
   development of microarray technology, there is an abundance of
   microarray data in the cancer genetics field. There is no doubt that
   meta-analysis and other tools based on systems biology can be relied on
   for a more precise assessment of tumorigenic processes and an accurate
   choice of treatment targets. Our study aims to identify the key genes
   obtained from the meta-analysis; then, integrated network survey with
   microRNAs (miRNAs), long non-coding RNAs (lncRNAs), and transcription
   factors (TFs) were enriched. Moreover, gene set enrichment analysis
   identifies the main signaling pathways involved in CRC. Then, the rate
   of mutation, the expression range for hub genes across tissues in
   normal and tumor, and also survival analysis of hub genes were
   identified. Drugs related to hub genes are also determined for the
   identification of proper treatment options. The results of this study
   may provide clues to suggest proper candidate biomarkers for CRC.

MATERIALS AND METHODS

Microarray analysis and meta-analysis

   Datasets of expression profiling by array were searched in the Gene
   Expression Omnibus (GEO) database.[[37]8] The keywords “colorectal
   cancer” and “colon cancer” were used in a systematic search in homo
   sapiens samples. Datasets that criteria included; less than three
   samples in each group, cell line studies, datasets with no healthy
   control samples, treated, and unrelated to the study hypothesis were
   ignored. We focused on datasets with Affymetrix and Illumina platforms.
   Due to the importance of evaluating the quality control of the datasets
   in the first step,[[38]9] the quality control was identified by
   principal component analysis (PCA) with the ggplot2 package[[39]10] of
   the prcomp function of R software.[[40]11] The eight microarray
   datasets were selected for further analysis, including [41]GSE110224,
   [42]GSE23878, [43]GSE37364, [44]GSE110223, [45]GSE77953, [46]GSE54986,
   [47]GSE74602, and [48]GSE25070. The NetworkAnalyst V3.0 database was
   used for meta-analysis.[[49]12] After normalization by log2
   transformation and batch effect, the meta-analysis was performed by the
   random effect model (REM) approach. A significance level of adjusted P
   value less than 0.05 was considered for determining significant DE
   genes. Ethics approval and consent to participate:
   IR.MUI.MED.REC.1399.558.

The miRNAs, lncRNAs, and transcription factors (TFs) prediction

   Enrichr database[[50]13] was utilized for investigation at other
   levels. The mRNA-miRNA interaction was determined using the miRTarBase
   library.[[51]14] At the transcriptional level, using the ChEA library
   identified regulatory TFs.[[52]15] LncHUB library provided predictions
   of the lncRNAs correlated with DE genes. The adj. P value < 0.05 was
   considered for significant results.

Network construction and analysis

   After miRNAs, lncRNAs, and TFs interactions with DE genes were
   determined, the gene regulatory networks (GRNs) were created using
   Cytoscape software V7.2.0.[[53]16] The protein-protein interaction
   (PPI) network was made via the STRING database[[54]17] in the CluePedia
   application.[[55]18] The interactions were retrieved based on
   activation, inhibition, binding, and post-translational modification
   with a confidence cut-off 0.8. The Merge tool of Cytoscape software was
   used to create an integrated network from four networks. Using the
   Network Analyzer tool, we measured the centrality parameters based on
   treating the network as directed to ranked hub nodes in the merged
   network. Clustering and module mining were performed by the MCODE
   plug-in (version 1.6.1),[[56]19] a Cytoscape plug-in, to identify
   sub-networks that have a significant function with degree cut-off = 2,
   node score cut-off = 0.2, k-core = 2, and max. depth = 100.

Survival analysis of candidate hub genes

   For the clinical significance assay, the survival evaluation of
   selected candidate hub genes in the Kaplan-Meier (KM) plotter online
   database (kmplot.com) was performed to estimate the effect of genes on
   survival. This correlation analysis utilizes the data from 30,000
   samples of 21 tumor types obtained from gene arrays, RNA sequence, or
   next-generation sequencing.[[57]20] In this study, relapse-free
   survival (RFS) data were achieved from colon cancer.

Mutation and expression analysis of candidate hub genes

   The information on somatic mutation and nucleotide changes of selected
   candidate hub genes were determined. Data are generated by the
   Catalogue of Somatic Mutations in Cancer (COSMIC),[[58]21] The
   International Cancer Genome Consortium (ICGC),[[59]22] and The Cancer
   Genome Atlas (TCGA) ([60]www.cancer.gov/tcga). Then, the comparison of
   gene expression in normal, tumor, and metastatic tissues based on the
   integrated database using available transcriptome-level datasets, such
   as NCBI’s Gene Expression Omnibus (GEO), TCGA, Therapeutically
   Applicable Research to Generate Effective Treatments (TARGET), and The
   Genotype-Tissue Expression (GTEx), was obtained.[[61]23] Significant
   differences by a Mann-Whitney U-test are marked with red color.

Pathway and functional analysis of candidate hub genes, genes clustered in
modules, miRNAs, and lncRNAs

   Functional analysis is conducted to perceive the main molecular
   mechanisms in the pathogenesis of cancer. The ClueGo plug-in (version
   2.5.4)[[62]24] was employed based on the Kyoto Encyclopedia of Genes
   and Genomes (KEGG)[[63]25] and Reactome pathway database.[[64]26] P
   value < 0.05 corrected with Bonferroni’s step-down is considered for
   significant results. Using DIANA-miRPath v3.0[[65]27] an online
   software, the miRNA function was analyzed. The enrichment analysis of
   lncRNAs was performed by LncSEA, a platform for lncRNA.[[66]28] For
   significant results, adj. P value < 0.05 was considered a cut-off.

   The framework of the steps mentioned is depicted in [67]Figure 1.

Figure 1.

   Figure 1
   [68]Open in a new tab

   Framework designed and flowed in this study is represented

RESULTS

Analysis of the significant differentially expressed genes

   The [69]GSE110224, [70]GSE23878, [71]GSE37364, [72]GSE110223,
   [73]GSE77953, [74]GSE54986, [75]GSE74602, and [76]GSE25070 datasets
   selected for meta-analysis are a total number of 308 samples, which
   included 151 control and 157 cancer samples [[77]Table 1]. The quality
   check result shows the acceptable samples in each group
   [[78]Supplementary Figure 1^ (1.2MB, tif) ]. The 5245 significant
   differentially expressed (DE) genes were recognized using meta-analysis
   based on the REM method. A total of 2727 DE genes with combined effect
   size (ES) >|1|, which included 1444 up-regulated and 1283
   down-regulated genes, were selected for further analysis
   [[79]Supplementary Table 1^ (320KB, pdf) shows DE genes with P
   value-corrected and combined ES].

Table 1.

   Studies retrieved from the GEO database for meta-analysis are described
   Serial Sample (tumor/ctrl) DEGs Platform
   [80]GSE110224 n=26 (13/13) 8395 Affymetrix Human Genome U133 Plus 2.0
   Array
   [81]GSE23878 n=55 (31/24) 6510 Affymetrix Human Genome U133 Plus 2.0
   Array
   [82]GSE37364 n=63 (25/38) 11116 Affymetrix Human Genome U133 Plus 2.0
   Array
   [83]GSE110223 n=25 (13/12) 5591 Affymetrix Human Genome U133A Array
   [84]GSE77953 n=24 (17/7) 7867 Affymetrix Human Genome U133A Array
   [85]GSE54986 n=11 (5/6) 2187 Illumina HumanHT-12 V4.0 expression
   beadchip
   [86]GSE74602 n=55 (29/26) 10593 Illumina HumanRef-8 v2.0 expression
   beadchip
   [87]GSE25070 n=49 (24/25) 5330 Illumina HumanRef-8 v3.0 expression
   beadchip
   [88]Open in a new tab

Prediction of miRNAs, lncRNAs, and transcription factors that regulate the
target genes

   We identified miRNAs, lncRNAs, and transcription factors -targeted DEGs
   to reveal the regulatory of DEGs at transcriptional and
   post-transcriptional levels and construction of GRN. The analysis
   revealed 175 miRNAs, 187 lncRNAs, and 41 TFs significantly enriched
   with DE genes [Supplementary Tables [89]2^ (246.7KB, pdf) -[90]4^
   (178.8KB, pdf) show the non-coding RNAs and TFs with related genes and
   adjusted P value]. The hsa-miR-26b-5p (adj. P value < 0.0001),
   hsa-miR-16-5p (adj. P value < 0.0001), hsa-miR-124-3p (adj. P value <
   0.0001), hsa-miR-92a-3p (adj. P value < 0.0001), hsa-let-7b-5p (adj. P
   value < 0.0001), H2AZ1-DT (adj. P value < 0.0001), PRC1-AS1 (adj. P
   value < 0.0001), TMPO-AS1 (adj. P value < 0.0001), DEPDC1-AS1 (adj. P
   value < 0.0001), and RRM1-AS1 (adj. P value < 0.0001) are non-coding
   RNAs having highest connection with genes. HNF4A (adj. P value <
   0.0001), FLI1 (adj. P value < 0.01), MITF (adj. P value = 0.0336),
   FOXP1 (adj. P value = 0.0325), SPI1 (adj. P value = 0.0006), E2F4 (adj.
   P value < 0.0001), and FOXA2 (adj. P value < 0.0001) are the TFs
   enriched with high connectivity.

Integrated network construction and analysis

   The characteristics of nodes in the network were investigated to
   identify their significance. In the first step, GRNs were created from
   the miRNA-mRNA, lncRNA-mRNA, and TF-mRNA interaction network with 2311,
   1512, and 2435 nodes, respectively. The protein interaction network
   with 1177 interconnected nodes was created. Then, a merged network with
   3063 nodes and 41328 edges was constructed. The topological network
   analysis revealed hub nodes that 30 (1%) of them and their interaction
   are represented in [91]Figure 2. HNF4A, FLI1, MITF, E2F4, SPI1, FOXA2,
   VDR, and SMAD3 are the central nodes with overlapped centrality
   parameters. [92]Table 2 also shows the 10 nodes in each class with the
   highest value. Sub-network and module analysis showed 35 clusters, of
   which six key clusters were selected based on functional analysis.
   Modules c, d, e, and f have interaction with non-coding RNAs. Clusters
   are the dense interaction sub-networks that could have crucial
   functions in the network. SMNDC1, PSMD5, ANAPC1, INO80B, and GNAO1 are
   seed genes in clusters a-e, respectively. Cluster f is the lack of seed
   genes [[93]Figure 3].

Figure 2.

   Figure 2
   [94]Open in a new tab

   Network with hub nodes. Nodes with the highest degree, betweenness, or
   closeness centrality are considered the hub genes

Table 2.

   Hub nodes. The 10 top nodes with the highest index in each group in the
   gene regulatory network are shown. Nodes with the highest degree that
   is the number of the total number of edges connected to a node,
   betweenness centrality is used to determine the nodes that are in the
   shortest paths, nodes that bridge from one part of a graph to another,
   and closeness centrality is used to identify the nodes that rapidly
   interact with other nodes in the network. These nodes are considered
   the hub genes
   Term Degree Betweenness centrality Closeness centrality
   Targeted genes FOXM1 227 FOXM1 0.01 PLK1 0.47
   UBA52 169 NR1H3 0.01 AURKB 0.47
   NR1H3 152 GNG12 0.01 HSP90AA1 0.47
   PLK1 129 − − TNRC6B 0.47
   NCBP2 125 − − NDE1 0.46
   POLR2D 123 − − CCND1 0.46
   RNPS1 121 − − PTBP1 0.46
   ESR1 119 − − GSK3B 0.46
   HSPA8 117 − − RANGAP1 0.46
   POLR2F 113 − − WASL 0.46
   TFs HNF4A 979 HNF4A 0.12 FLI1 0.53
   FLI1 871 FLI1 0.11 HNF4A 0.51
   MITF 820 MITF 0.08 SMAD3 0.49
   E2F4 587 E2F4 0.04 MITF 0.48
   FOXP1 560 FOXP1 0.04 VDR 0.48
   SPI1 509 SPI1 0.04 E2F4 0.47
   FOXA2 492 FOXA2 0.03 SPI1 0.46
   GABP 427 SOX2 0.03 FOXA2 0.46
   SOX2 425 GABP 0.02 FOXP1 0.45
   VDR 403 VDR 0.02 TTF2 0.45
   miRNAs hsa-miR-26b-5p 404 hsa-miR-26b-5p 0.03 hsa-miR-124-3p 0.44
   hsa-miR-16-5p 337 hsa-miR-124-3p 0.02 hsa-miR-155-5p 0.43
   hsa-miR-124-3p 296 hsa-miR-16-5p 0.01 hsa-miR-26b-5p 0.42
   hsa-miR-92a-3p 281 hsa-miR-92a-3p 0.01 hsa-miR-16-5p 0.42
   hsa-let-7b-5p 229 hsa-let-7b-5p 0.01 hsa-miR-92a-3p 0.4
   hsa-miR-484 219 hsa-miR-484 0.01 hsa-miR-193b-3p 0.4
   hsa-miR-193b-3p 215 hsa-miR-193b-3p 0.01 hsa-miR-17-5p 0.4
   hsa-miR-93-5p 209 hsa-miR-192-5p 0.01 hsa-let-7b-5p 0.39
   hsa-miR-17-5p 205 hsa-miR-155-5p 0.01 hsa-miR-484 0.39
   hsa-miR-192-5p 195 hsa-miR-1-3p 0.01 hsa-miR-192-5p 0.39
   LncRNAs H2AZ1-DT 65 − − PRC1-AS1 0.36
   PRC1-AS1 58 − − TMPO-AS1 0.36
   TMPO-AS1 54 − − DDX11-AS1 0.36
   DEPDC1-AS1 53 − − PRMT5-AS1 0.36
   RRM1-AS1 53 − − LINC01775 0.36
   CSRP3-AS1 52 − − DIAPH 3-AS1 0.36
   HMMR-AS1 52 − − LINC01224 0.36
   DDX11-AS1 51 − − GTF3C2-AS1 0.36
   PRMT5-AS1 50 − − LINC01842 0.36
   LINC01357 48 − − ODF2-AS1 0.36
   [95]Open in a new tab

Figure 3.

   Figure 3
   [96]Open in a new tab

   Prominent modules with cluster mining. The (a, b, c, d, e, and f) are
   the selected modules, (clusters 1, 3, 8, 10, 11, and 15 from MCODE
   analysis results). To classify the sub-networks, densely connected
   region in the network with significant functions, the default setting
   was considered. The degree cut-off = 2, node score cut-off = 0.2,
   k-core = 2, and max. depth = 100

Survival analysis of candidate hub genes

   The correlation of selected hub genes with survival times was
   identified. The correlation analysis between the target gene expression
   level and colon cancer survival rate was conducted. For HNF4A, upper
   quartile survival is 19 and 60 months for low and high expression in
   cohort samples, respectively. Also, low and high expression of FLI1 and
   MITF correlated with approximately 45 and 18 months and 56 and 19
   months, respectively. The results of survival analysis of E2F4, SPI1,
   FOXA2, VDR, and SMAD3 are shown in [97]Figure 4. Also, from hub miRNAs,
   the low and high expression hsa-miR-484 is correlated with changes in
   survival times in 21 tumor types.

Figure 4.

   Figure 4
   [98]Open in a new tab

   Results of survival analysis are shown in the Kaplan-Meier survival
   plots. The parameters hazard ratio (HR) and 95% confidence are
   represented as the reliable range. The P value less than 0.05 indicates
   a significant results

Mutation and expression analysis of candidate hub genes

   The analysis of mutation distribution of HNF4A, FLI1, MITF, E2F4, SPI1,
   FOXA2, VDR, and SMAD3 based on COSMIC, ICGC, and TCGA databases was
   conducted. Cancer mutation distribution of HNF4A is 5342 mutations, 179
   high-impact mutations, 269 mutations, and 3,760 copy number variation
   (CNV) based on COSMIC, ICGC, and TCGA databases, respectively. The
   results of other hub genes are shown in [99]Table 3. Based on the ICGC
   database, only the VDR gene has targeting compounds. The dn-101
   (ZINC000100015048) and Dovonex (ZINC000003921872) are the FDA-approved
   components. The level of gene expression of each gene in several types
   of cancer was obtained. The results are shown in [100]Figure 5.

Table 3.

   Results of mutation distribution of selected hub genes are based on
   data collected from the COSMIC, ICGC, and TCGA databases
   Genes COSMIC ICGC TCGA
   HNF4A 5342 mutations 179 high-impact mutations 252 cases affected by
   269 mutations across 35 projects
   3,886 cases affected by 3,760 CNV events across 33 projects
   FLI1 4563 mutations 17 high-impact mutations 196 cases affected by 194
   mutations across 31 projects
   3,401 cases affected by 3,300 CNV events across 33 projects
   MITF 10611 mutations 37 high-impact mutations 172 cases affected by 176
   mutations across 28 projects
   3,585 cases affected by 3,493 CNV events across 33 projects
   E2F4 197 mutations 72 high-impact mutations 68 cases affected by 69
   mutations across 25 projects
   3,671 cases affected by 3,630 CNV events across 33 projects
   SPI1 1047 mutations 8 high-impact mutations 80 cases affected by 74
   mutations across 26 projects
   2,609 cases affected by 2,561 CNV events across 33 projects
   FOXA2 637 mutations 93 high-impact mutations 150 cases affected by 145
   mutations across 31 projects
   3,752 cases affected by 3,684 CNV events across 33 projects
   VDR 2067 mutations 112 high-impact mutations 107 cases affected by 102
   mutations across 25 projects
   2,683 cases affected by 2,623 CNV events across 33 projects
   SMAD3 2990 mutations 135 high-impact mutations 160 cases affected by
   155 mutations across 25 projects
   2,769 cases affected by 2,681 CNV events across 33projects
   [101]Open in a new tab

Figure 5.

   Figure 5
   [102]Open in a new tab

   DE of selected hub genes in the most common tumor types is presented

Downstream functional analysis

   The downstream functional analysis determines the relation of selected
   hub genes with significant biological states. Pathway enrichment
   analysis results of the top hub genes shown are related to cell cycle,
   oncogene-induced senescence, signaling by transforming growth
   factor-beta (TGF-β) receptor complex, signaling by FGFR2, signaling by
   PTK6, and transcriptional regulation by RUNX1-3. The enrichment
   analysis of genes in modules was conducted. Terms were summarized based
   on the repeated terms; the results are shown in [103]Figure 6. The
   miRNA functional analysis revealed the hub miRNAs strongly related to
   cancer and especially enriched with proteoglycans in cancer, pathways
   in cancer, focal adhesion, signaling pathways regulating pluripotency
   of stem cells, gap junction, fatty acid metabolism, and regulation of
   actin cytoskeleton. The hub lncRNA enrichment analysis discovered was
   correlated with cancer phenotypes, such as breast carcinoma, lung
   adenocarcinoma, esophageal carcinoma, stomach carcinoma, and colon
   carcinoma.

Figure 6.

   Figure 6
   [104]Open in a new tab

   Pathway enrichment analysis from each of the selected modules based on
   KEGG and Reactome pathway database, respectively (1: a; 2: b; 3: c; 4:
   d; 5: e; 6: f). The results are depicted based on significant results.
   Signaling pathways with an adjusted P value < 0.05 were selected. For P
   value correction, Bonferroni’s step-down was considered. The number of
   involved genes and the percentage of association genes related in each
   term are shown

DISCUSSION

   In this study, we identified 2727 DE genes obtained with a
   meta-analysis of CRC from the microarray datasets with 308 samples.
   Genes were enriched with miRNAs, lncRNAs, and TFs at transcriptional
   and post-transcription levels. The integrated network was created by
   merging the protein interaction network and GRNs, including miRNA-mRNA,
   lncRNA-mRNA, and TF-mRNA interaction. According to the merged network
   analysis with 3063 nodes, the hub nodes, including all of the
   parameters measured, are HNF4A, FLI1, MITF, E2F4, SPI1, FOXA2, VDR, and
   SMAD3. A high degree indicates they have many connections and are vital
   for the surveillance of the network. Betweenness centrality measures
   the number of shortest paths going through a node. Closeness centrality
   is the physically nearest node to all nodes.[[105]29] As mentioned, the
   hub genes are frequently mutated in several types of cancers and
   correlated with DE genes in different types of cancers, especially in
   CRC. Also, these genes are related to RFS time in patients with colon
   cancer. There was a significant relationship between these hub genes
   with cell cycle, oncogene-induced senescence, and TGF-β receptor
   complex signaling based on functional analysis.

   TFs particularly had higher centrality scores. Also, the top main
   miRNAs named miR-26b-5p, miR-16-5p, miR-124-3p, miR-92a-3p, let-7b-5p,
   miR-484, miR-193b-3p, and miR-192-5p. The miR-26b-5p as a
   tumor-suppressive miRNA was known in various cancer tissues[[106]30]
   and was connected with 404 genes in this study. The miR-16-5p is known
   as a tumor suppressor by inhibiting proliferation and invasion[[107]31]
   and controlling 337 genes in the network. The miR-124-3p acts as an
   inhibitor of tumor growth and is down-regulated in multiple human
   neoplasms[[108]32] and here linked to 296 genes.

   miR-92a-3p correlated with 281 nodes is involved in several tumorigenic
   processes, including cell proliferation, survival, and
   invasion.[[109]33] Other non-coding RNAs, lncRNA, were screened,
   including PRC1-AS1, TMPO-AS1, DDX11-AS1, and PRMT5-AS1. In the
   GeneCards database reference, PRC1-AS1 is related to hepatocellular
   carcinoma. TMPO-AS1 contributes significantly to CRC
   progression.[[110]34] DDX11-AS1 has potential roles in proliferation
   and invasion in CRC cells linked to 51 genes.[[111]35] FOXM1, NR1H3,
   and PLK1 could be pointed out as key genes. FOXM1 was shown to be
   required for malignancy of CRC[[112]36] correlated with 227 nodes.
   NR1H3 is a regulatory lipid metabolism and has an important role in
   cancer progression[[113]37] and is connected with 152 nodes. PLK1,
   linked with 129 nodes, acts as a progression factor of invasion in CRC
   and many human cancers.[[114]38] Results of previous experimental
   studies demonstrate the accuracy of explored approaches and emphasize
   the significance of their roles in CRC progression. Modules are
   high-density regions in the network that detect functional genes. The
   critical module assessment was shown to be closely linked to cancer
   development. The first cluster-related signaling is Rho GTPases,
   metabolism of RNA, signaling by FGFR2, interferon signaling, and cell
   cycle. Rho GTPases play a vital role in forming dynamic actin-rich
   membrane protrusions and cell-cell turnover and cell-extracellular
   matrix (ECM) adhesions required for efficient cancer cell
   invasion.[[115]39] It has been reported that Rho GTPases could regulate
   the expression of Wnt target genes in colon cancer cells.

   Moreover, the knockdown of Rac1, as a member of Rho GTPases, could
   suppress the expression of Wnt-induced genes NKD1, BAMBI, and MMP-7,
   which are involved in invasion and metastasis.[[116]40] The second
   module is correlated to the regulation of apoptosis, stabilization of
   p53, signaling by NOTCH4, signaling by Wnt, MAPK signaling cascades,
   signaling by Hedgehog, cellular response to hypoxia, and regulation of
   PTEN stability and activity. The Wnt/β-catenin signaling pathway is a
   developmentally preserved and one-of-a-kind signaling pathway that
   regulates gene expression and cell invasion, migration, proliferation,
   and differentiation within the start and progression of CRC. Wnt
   signaling cascades and at least one protein of the Wnt/β-catenin
   signaling pathway are mutated in more than 94% of CRC cases.

   The development of these mutations is thought to be an early occasion
   and the essential driving force in early-stage CRC.[[117]41]
   ECM-receptor interaction, cell cycle, and MET-promoted cell motility
   were enriched with hubs of the third cluster. The ECM is a noncellular
   component of tissue that biochemically and structurally supports cells.
   The ECM comprises different glycoproteins, such as collagens, laminins,
   and fibronectins. Interactions of the ECM and cellular receptors
   constitute one of the crucial pathways in CRC progression and
   metastasis.[[118]42] Smooth muscle contraction, PPAR signaling pathway,
   leukocyte transendothelial migration, NF-κB signaling pathway, and
   signaling by erythropoietin are the main terms linked with genes of the
   fourth cluster. NF-κB activation supports tumorigenesis by enhancing
   cell proliferation and angiogenesis, inhibiting apoptosis, and
   promoting cell attack and metastasis. It plays a central part in
   regulating cell death in the development and progression of CRC. The
   antiapoptotic activity of NF-κB is mediated through Bcl2, Bcl-xL,
   cFLIP, and cIAP2, among other genes. Antiapoptotic protein BAG-1
   (Bcl-2-associated athanogene-1) is included in crucial processes, such
   as proliferation, cell signaling, transcription, and apoptosis.
   Overexpression of BAG-1 has been reported in colorectal adenomas and
   carcinomas.[[119]43] Interaction between L1 and ankyrins, Eph-ephrin
   signaling, circadian clock, and Ca2+ pathway are the critical pathways
   related to the fifth module.

   The physiologic function of the colon is to treat and assimilate
   undigested nourishment, electrolytes, and water—the colon exchanges
   Na+, K+, and Ca2+ to preserve electrolyte adjustment. Appropriately,
   ion channels are commonly communicated on the colonic epithelial cell
   film. Within the case of colorectal colitis inflammation and
   electrolyte unsettling influence, the event of colon cancer may be
   encouraged. Colonic inflammation could be a risk factor for the
   development of CRC. Chronic intestinal inflammatory diseases regularly
   lead to CRC through a handle called colitis-associated carcinogenesis.
   As a result, maintaining adjust of electrolytes within the environment
   of colon epithelial cells is vital to avoid CRC. Among those
   electrolytes, Ca2+ is the foremost high-profile ion in CRC progression,
   and its impact on cancer development is controversial.[[120]44]

   Finally, the sixth sub-network is associated with mTOR signaling,
   collagen formation, TP53-regulated metabolic genes, and macroautophagy.
   mTOR signaling pathway plays a significant part in cancer development,
   counting proliferation, metastasis, survival, and angiogenesis. The CRC
   pathogenesis is exceptionally complex and expects the presence of
   numerous genetic mutations that will be included in cancer progression.
   Besides, CRC progression is affected by inflammatory cells and their
   inflammatory mediators, such as cytokines, that can enact signaling
   pathways, leading to tumoral development.[[121]45] Each node in each
   module, particularly seed genes, has the potential to select as a
   candidate for experimental validation.

CONCLUSION

   Here, using computational tools we generate a systematic view of the
   CRC disorder to highlight the central genes, miRNAs, lncRNAs, and TFs
   as effective target therapies. With the determination of key pathways,
   we explored the comprehensive molecular mechanisms involved in the
   progression of this disorder.

Conflicts of interest

   There are no conflicts of interest.
   Supplementary table-1

   Differentially expressed genes obtained with meta- analysis.
   [122]ABR-14-99_Suppl1.pdf^ (320KB, pdf)
   Supplementary table-2

   micro-RNAs Enriched with DEGs.
   [123]ABR-14-99_Suppl2.pdf^ (246.7KB, pdf)
   Supplementary table-3

   Long- non coding RNAs Enriched with DEGs.
   [124]ABR-14-99_Suppl3.pdf^ (155.1KB, pdf)
   Supplementary table-4

   Transcription factors enriched with DEGs.
   [125]ABR-14-99_Suppl4.pdf^ (178.8KB, pdf)
   Supplementary Figure 1

   The quality control assessment were performed. Some of samples were
   omitted due to inappropriate quality, including: ([126]GSM2982932,
   GSM29829324, [127]GSM2982935, [128]GSM2982936, [129]GSM2982937,
   [130]GSM2982939, [131]GSM2982941, and [132]GSM2982956) in
   [133]GSE110224, ([134]GSM588834, [135]GSM588857, [136]GSM588858, and
   [137]GSM588860) in [138]GSE23878, ([139]GSM916734 and [140]GSM916721)
   in [141]GSE37364, ([142]GSM292928) in [143]GSE110223, ([144]GSM2062358,
   [145]GSM2062359, [146]GSM2062360, [147]GSM2062361, [148]GSM2062362, and
   [149]GSM2062363) in [150]GSE77953, ([151]GSM1327497) in [152]GSE54986,
   (GSM192368, GSM192369, [153]GSM192370, GSM192371, and GSM192377) in
   [154]GSE74602, and ([155]GSM615880, [156]GSM615882, and [157]GSM615895)
   in [158]GSE25070. Last PCA plot were obtained from NetworkAnalyst
   database based on merged data
   [159]ABR-14-99_Suppl1.tif^ (1.2MB, tif)

Acknowledgment