Abstract

Introduction
A growing body of evidence shows that right-sided colon cancers (RCC) and left-sided colon cancers (LCC) are distinct clinical and biological entities and suggests that they should be treated as different diseases. However, the reasons why RCC and LCC harbor different clinical and biological features remain unclear.

Materials and methods
To identify the gene expression differences between RCC and LCC and uncover the mechanisms underlying them, we analyzed the gene expression profiles of [37]GSE14333 from the Gene Expression Omnibus (GEO) database. A systematic and integrative bioinformatics analysis of the differentially expressed (DE) genes was then performed, including gene ontology (GO) analysis, pathway enrichment analysis, protein–protein interaction (PPI) network construction, and module analysis. In total, we extracted 3,793 DE genes from the colon cancer samples in the selected dataset, including 1,961 genes upregulated in RCC and 1,832 genes upregulated in LCC.

Results
The GO and pathway enrichment analyses indicated that RCC and LCC may involve different pathways regulated by different genes. Based on the PPI network, PCNA, TP53, HSP90AA1, CSNK2A1, UBB, LRRK2, ABL1, PRKACA, CAV1, and JUN were identified as the key hub genes. Significant modules were also screened from the PPI network.

Conclusion
The identified genes and pathways may provide new insights into the molecular mechanisms underlying the differences between RCC and LCC and might serve as specific therapeutic targets and prognostic markers for the personalized treatment of RCC and LCC.
Keywords: colon cancer, location, differentially expressed genes, bioinformatics analysis

Introduction
Colon cancer is one of the most commonly diagnosed malignancies and a major cause of cancer-related death worldwide.[38]^1 Despite great advances in diagnosis and treatment in recent years, the mortality of colon cancer remains high.[39]^2 Accumulating evidence has demonstrated that right-sided colon cancers (RCC) and left-sided colon cancers (LCC) differ in embryological origin, physiology, and pathology.[40]^3 It has previously been proposed that RCC and LCC should be treated as different diseases because they differ in drug sensitivity and therapeutic efficacy, and therefore in prognosis.[41]^4 Taking the location of a colon cancer into account would therefore help to improve diagnosis, prognosis, treatment adjustment, and therapeutic assessment.

While extensive research has been conducted on the clinical features of RCC and LCC, there has been little progress in uncovering the precise molecular mechanisms underlying RCC and LCC progression, which may limit the ability to adjust therapeutic strategies. Recent evidence suggests that the different clinical outcomes of RCC and LCC may be associated with genomic determinants. Previous studies have identified several genes that play a vital role in the occurrence and development of RCC and LCC, respectively, and are involved in some important pathways.[42]^5 A study evaluating relevant altered genes in colon cancer identified different pathways accounting for relapse in RCC and LCC.[43]^6 Despite this progress, the potential molecular mechanism is still poorly understood. Consequently, there is a great need to explore the distinct molecular subtypes of colon cancer, including the key mRNA biomarkers and corresponding pathways.
In our study, we aimed to explore and predict the potential mechanisms in RCC and LCC using several bioinformatic approaches. By analyzing gene expression microarray data and constructing a protein–protein interaction (PPI) network, we sought to identify candidate genes that are differentially expressed between RCC and LCC and to provide potential biomarkers for the diagnosis, prognosis, and drug targeting of RCC and LCC. Together with an integrated evaluation of the biological functions and related signaling pathways of these genes, the findings may provide further insight into the establishment and progression of RCC and LCC.

Materials and methods

Data collection
The microarray data ([44]GSE14333) were retrieved from the public National Center for Biotechnology Information Gene Expression Omnibus database.[45]^7 [46]GSE14333 was based on the [47]GPL570 platform (Affymetrix Human Genome U133 Plus 2.0 Array). We evaluated only the expression data for which the submitter provided clinical data of the patients. The final dataset consisted of 101 RCC patients and 93 LCC patients, a subset of [48]GSE14333.

Differential expression analysis
Normalized gene expression data were downloaded directly for further analysis; the dataset contained expression information for 23,495 mRNAs. The differential expression analysis was carried out using Student’s t-test in the Limma R package. A P-value of <0.05 was regarded as the cutoff for statistical significance.
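The per-gene comparison described above can be illustrated with a minimal sketch. It assumes an ordinary pooled-variance Student's t-test applied gene by gene with the P<0.05 cutoff; it is not the authors' Limma pipeline (limma fits linear models with moderated t-statistics). The gene names, expression values, and function names are hypothetical, and the P-value is obtained by numerical integration so that the example needs only the Python standard library.

```python
import math
from statistics import mean, variance


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p(t, df, steps=2000):
    """Two-sided P-value: 1 - 2 * integral of the density over [0, |t|].

    The integral uses composite Simpson's rule, so steps must be even.
    """
    upper = abs(t)
    if upper == 0.0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return max(0.0, 1.0 - 2.0 * total * h / 3)


def student_t_test(group_a, group_b):
    """Pooled-variance two-sample t-test; returns (t statistic, P-value)."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / df
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (mean(group_a) - mean(group_b)) / std_err
    return t, two_sided_p(t, df)


def differential_genes(rcc, lcc, alpha=0.05):
    """Return {gene: (side with the higher mean, P-value)} for P < alpha."""
    result = {}
    for gene in rcc:
        t, p = student_t_test(rcc[gene], lcc[gene])
        if p < alpha:
            result[gene] = ("RCC" if t > 0 else "LCC", p)
    return result


# Made-up expression values for three hypothetical genes (five samples each).
rcc = {
    "GENE_A": [8.1, 8.4, 7.9, 8.3, 8.2],
    "GENE_B": [5.0, 5.2, 4.9, 5.1, 5.0],
    "GENE_C": [3.0, 3.2, 3.1, 2.9, 3.0],
}
lcc = {
    "GENE_A": [6.9, 7.1, 7.0, 6.8, 7.2],
    "GENE_B": [5.1, 5.0, 5.3, 4.9, 5.0],
    "GENE_C": [4.5, 4.4, 4.6, 4.7, 4.5],
}
de = differential_genes(rcc, lcc)
for gene, (side, p) in de.items():
    print(f"{gene}: upregulated in {side}, P = {p:.2e}")
```

In practice a library routine such as `scipy.stats.ttest_ind` would replace the hand-written P-value calculation. With thousands of genes tested at an unadjusted P<0.05, a multiple-testing correction is usually considered as well.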
Gene ontology (GO) and pathway enrichment analysis
The GO analysis was conducted to assess the differentially expressed (DE) genes between RCC and LCC at the functional level.[49]^8 The Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis was carried out to further explore the cellular function and molecular function (MF) of the DE genes.[50]^9 In this study, we mapped the DE genes between RCC and LCC to the Database for Annotation, Visualization and Integrated Discovery (DAVID) and performed the enrichment analysis, including the GO and KEGG analyses.[51]^10 We selected the top significantly enriched items and confirmed their correlations with the genomic differences through a thorough PubMed search.

PPI network construction and module analysis
The Search Tool for the Retrieval of Interacting Genes (STRING) database is one of the most powerful tools for PPI investigation.[52]^11 This online tool has been updated to Version 10.0, which covers 9,643,763 proteins from 2,031 organisms. STRING is a database of experimentally validated and computationally predicted PPI information. From STRING, only known interactions supported by biological experiments with a combined score of >0.4 were retrieved as significant items for further analysis. Cytoscape is a powerful software tool that provides a unified conceptual framework by integrating biomolecular interaction networks.[53]^12 We used the Cytoscape plug-in Molecular Complex Detection (MCODE) to identify the most significant module in the PPI network. Function and pathway enrichment analyses were then conducted on the DE genes in the selected module. In the present analysis, a P-value of <0.05 was considered statistically significant.

Results

Patient characteristics
As mentioned earlier, we concentrated on the RCC and LCC samples with clinical data available in [54]GSE14333. The data we used contained information on 101 RCC patients and 93 LCC patients.
The detailed information for the patients included in our study is listed in [55]Table 1.

Table 1. Clinical characteristics of the study population with RCC and LCC
Clinical characteristics | RCC | LCC
Patients | 101 | 93
Median age (years) | 69±12 | 62±13
Sex | |
  Male | 46 (45.5%) | 55 (59.1%)
  Female | 55 (54.5%) | 38 (40.9%)
Dukes staging system | |
  A | 16 (15.8%) | 16 (17.2%)
  B | 44 (43.6%) | 37 (39.8%)
  C | 41 (40.6%) | 40 (43.0%)
DFS | 41±27 | 44±28
Abbreviations: DFS, disease-free survival; LCC, left-sided colon cancers; RCC, right-sided colon cancers.

As shown in [57]Table 1, tumor location may be related to age (P<0.001). In contrast, disease-free survival (DFS) did not differ significantly between LCC and RCC patients (P=0.23).

Identification of DE genes
Using the Limma R package, a total of 3,793 genes were identified as differentially expressed between RCC and LCC patients (P-value <0.05), of which 1,961 were upregulated and 1,832 were downregulated in RCC compared with LCC. The top 50 upregulated DE genes in RCC and LCC are presented in [58]Table 2.

Table 2. Top 50 upregulated DE genes in RCC and LCC
RCC gene | P-value | RCC gene | P-value | LCC gene | P-value | LCC gene | P-value
HOXC6 | 7.35E–10 | TIGD2 | 5.88E–05 | PRAC1 | 2.32E–26 | FXYD1 | 1.36E–05
GLOD4 | 2.11E–07 | ME2 | 6.86E–05 | HOXB13 | 6.94E–08 | CPE | 1.40E–05
FLRT3 | 2.50E–07 | AHR | 7.53E–05 | [59]AX748273 | 1.15E–07 | DBNDD2 | 1.50E–05
PAX9 | 3.64E–07 | MSX2 | 8.17E–05 | MUC12 | 3.09E–07 | ZNF813 | 1.50E–05
CDC42EP2 | 6.88E–07 | PSMC4 | 8.80E–05 | LY6G6D | 5.53E–07 | ZNF347 | 2.19E–05
TC2N | 1.04E–06 | BSG | 8.84E–05 | ZNF610 | 7.34E–07 | STMN2 | 2.19E–05
ZIC2 | 1.28E–06 | [60]AX747191 | 9.21E–05 | ST6GAL2 | 1.51E–06 | GNG4 | 2.36E–05
FOXD1 | 2.17E–06 | ALAS1 | 9.58E–05 | KHDRBS3 | 2.07E–06 | TMSB15A | 2.42E–05
NDUFV2 | 1.98E–05 | CAMK2D | 9.60E–05 | AKAP2 | 2.49E–06 | MIR6716 | 2.59E–05
VPS53 | 1.99E–05 | DDIT3 | 9.61E–05 | FGD1 | 2.50E–06 | SLC22A17 | 2.68E–05
AARS2 | 2.01E–05 | PIK3R3 | 9.89E–05 | ELAVL2 | 2.70E–06 | DDX27 | 2.77E–05
ESCO1 | 2.18E–05 | RP11-61L19.3 | 1.00E–05 | INSL5 | 2.88E–06 | IFNLR1 | 2.97E–05
DEGS2 | 2.23E–05 | IRF9 | 1.19E–04 | XPNPEP2 | 3.32E–06 | SERINC3 | 3.04E–05
SNRPD1 | 2.28E–05 | PRMT5 | 1.23E–04 | MAGEH1 | 3.52E–06 | SLC35D3 | 3.22E–05
KIAA1468 | 2.73E–05 | GMDS | 1.25E–04 | DPM1 | 3.55E–06 | GPR153 | 3.28E–05
RBBP8 | 2.83E–05 | ZIC5 | 1.37E–04 | TLE2 | 3.78E–06 | ZNF134 | 3.32E–05
OR6A2 | 2.96E–05 | GOT1 | 1.37E–04 | OGN | 4.83E–06 | PLAGL2 | 3.34E–05
ABCB11 | 3.31E–05 | FAM46A | 1.40E–04 | ZNF415 | 5.73E–06 | CLDN8 | 3.48E–05
GAPDH | 3.42E–05 | SLC10A2 | 1.49E–04 | WASF3 | 6.02E–06 | KLHL34 | 3.57E–05
EIF4A1 | 4.01E–05 | DAOA | 1.60E–04 | BCAS4 | 6.85E–06 | NES | 3.65E–05
HOXB6 | 4.16E–05 | MOK | 1.66E–04 | CERK | 7.11E–06 | ROCK2 | 3.73E–05
EIF2B2 | 4.68E–05 | SEMG1 | 1.73E–04 | CHMP4B | 7.88E–06 | KLF7 | 4.12E–05
DGKQ | 4.84E–05 | HSPA4L | 1.82E–04 | ZFP28 | 9.09E–06 | ZCCHC24 | 4.73E–05
DNAH2 | 5.34E–05 | GCH1 | 1.83E–04 | ZNF542P | 1.25E–05 | RP11-524D16_A.3 | 4.95E–05
SDHA | 5.35E–05 | TIMM50 | 1.89E–04 | PDE3A | 1.25E–05 | SPIN3 | 5.22E–05
Note: The statistical significance (P-value) was calculated using the Student’s t-test.
Abbreviations: DE, differentially expressed; LCC, left-sided colon cancers; RCC, right-sided colon cancers.

GO enrichment analysis
GO analysis is a commonly used annotation method for systematically evaluating the characteristic biological attributes of genes and gene products from all organisms. We performed the GO analysis by separately mapping the DE genes upregulated in RCC and in LCC to the online tool DAVID at three GO levels: MF, cell component (CC), and biological processes (BP). The top 10 items significantly enriched by the DE genes at each GO level are outlined in [62]Figure 1.

Figure 1. GO annotation of DE genes.
Notes: (A) Top 10 GO items for DE genes upregulated in RCC. (B) Top 10 GO items for DE genes upregulated in LCC.
Abbreviations: DE, differentially expressed; GO, gene ontology; LCC, left-sided colon cancers; RCC, right-sided colon cancers; MF, molecular function; CC, cell component; BP, biological processes.

The enriched GO terms in BP for RCC mainly included biosynthesis and metabolism processes, while the enriched GO terms in BP for LCC were associated with cell adhesion, angiogenesis, collagen catabolic process, and regulation of the canonical Wnt signaling pathway. For the CC items, the upregulated DE genes in RCC were enriched in basic cellular compartments such as cytosol, nucleoplasm, cytoplasm, and nucleus, whereas the upregulated DE genes in LCC were enriched in extracellular components including extracellular matrix, proteinaceous extracellular matrix, extracellular space, and extracellular exosome.
Most GO MF items for RCC converged on enzymatic and energy metabolism functions, such as protein binding, ATP binding, oxidoreductase activity, electron carrier activity, GTP binding, and GTPase activity, while most GO MF items for LCC were related to binding functions, for instance, heparin binding, protein binding, Wnt–protein binding, and calcium ion binding. The GO annotation results revealed the differences between LCC and RCC to a certain extent.

KEGG pathway enrichment analysis
KEGG pathway analyses were conducted with DAVID on the full sets of upregulated DE genes in RCC and LCC, respectively. The top 15 significantly enriched signaling pathways of RCC and LCC are illustrated in [65]Figure 2. The upregulated DE genes in RCC were enriched in biosynthesis and metabolism pathways (glycolysis/gluconeogenesis, fructose and mannose metabolism, carbon metabolism, biosynthesis of amino acids, and so on), proteasomes, cell cycle, and RNA transport, while the upregulated DE genes in LCC were enriched in protein digestion and absorption, pathways in cancer, ECM–receptor interaction, vascular smooth muscle contraction, and several important signaling pathways including the PI3K–Akt, Ras, Wnt, cGMP–PKG, calcium, and cAMP signaling pathways.

Figure 2. Pathway enrichment results for DE genes.
Notes: (A) Top 15 pathways enriched by DE genes upregulated in RCC. (B) Top 15 pathways enriched by DE genes upregulated in LCC.
Abbreviations: DE, differentially expressed; LCC, left-sided colon cancers; RCC, right-sided colon cancers.

PPI network construction
The information provided by the STRING database was integrated and used to construct the PPI network. A statistically significant PPI network made up of 804 nodes was obtained from the set of 1,961 upregulated DE genes in RCC.
Meanwhile, the set of 1,832 upregulated DE genes in LCC yielded a statistically significant network consisting of 589 nodes ([68]Table 3). The degree distributions of the network nodes are illustrated in [69]Figure 3. In the PPI network built from the genes upregulated in RCC, the five hub nodes with the highest degrees were PCNA, TP53, HSP90AA1, CSNK2A1, and UBB, while in the network built from the genes upregulated in LCC they were LRRK2, ABL1, PRKACA, CAV1, and JUN. Detailed information on the selected hub genes from the PPI network is listed in [70]Table 4. The sub-network was reconstructed with the selected hub nodes and their first-neighbor genes, as plotted in [71]Figure 4.

Table 3. Topological information of networks detected by the upregulated genes in RCC and LCC, respectively
Network | Number of nodes | Number of edges | Average number of neighbors | Hub genes
RCC | 804 | 1,759 | 4.38 | UBB, HSP90AA1, TP53, PCNA, CSNK2A1
LCC | 589 | 872 | 2.96 | PRKACA, CAV1, LRRK2, ABL1, JUN
Abbreviations: LCC, left-sided colon cancers; RCC, right-sided colon cancers.

Figure 3. Degree distributions of network nodes.
Notes: (A) Node degree distribution of the network constructed with the DE genes upregulated in RCC. (B) Node degree distribution of the network constructed with the DE genes upregulated in LCC.
Abbreviations: DE, differentially expressed; LCC, left-sided colon cancers; RCC, right-sided colon cancers.

Table 4. Detailed information of the selected hub genes from the PPI network
Gene | Network | P-value | Degree
UBB | RCC | 0.012 | 81
HSP90AA1 | RCC | 0.012 | 75
TP53 | RCC | 0.003 | 71
PCNA | RCC | 0.005 | 35
CSNK2A1 | RCC | 0.002 | 33
PRKACA | LCC | 0.049 | 26
CAV1 | LCC | 0.007 | 25
LRRK2 | LCC | 0.021 | 18
ABL1 | LCC | 0.002 | 18
JUN | LCC | 0.010 | 18
Note: The statistical significance (P-value) was calculated using the Student’s t-test.
Abbreviations: LCC, left-sided colon cancers; PPI, protein–protein interaction; RCC, right-sided colon cancers.

Figure 4. The sub-network reconstructed with the selected hub nodes and their first-neighbor genes.
Notes: (A) The sub-network for RCC. (B) The sub-network for LCC.
Abbreviations: LCC, left-sided colon cancers; RCC, right-sided colon cancers.

Module analysis of the PPI network
Using the MCODE plug-in, the highest-scoring and most significant modules in the PPI networks for RCC and LCC were detected. Enrichment analysis of the DE genes in the modules was also carried out with DAVID, and the results showed that these genes were mainly enriched in proteasomes, cell cycle, DNA replication, and ribosomes ([78]Figure 5).

Figure 5. The most significant modules from the PPI network.
Notes: (A) The most significant module in the PPI network for RCC. (B) The most significant module in the PPI network for LCC. (C and D) KEGG pathways enriched by all the nodes involved in the identified modules.
Abbreviations: KEGG, Kyoto Encyclopedia of Genes and Genomes; LCC, left-sided colon cancers; PPI, protein–protein interaction; RCC, right-sided colon cancers.

Discussion
Because precision medicine emphasizes personalized and precise treatment, researchers have paid increasing attention to the impact of tumor location on diagnosis, prognosis, treatment selection, and therapeutic assessment.[81]^13 RCC and LCC have been shown to differ clinically, pathologically, and genetically, and may therefore differ in tumorigenesis as well as survival. However, the precise mechanisms are not fully understood. In this study, the potential genomic determinants that contribute to the difference between RCC and LCC were evaluated using a series of bioinformatic approaches.
Recent studies have demonstrated clinical differences between RCC and LCC, showing that patients with LCC tend to be younger than patients with RCC and that RCC is diagnosed in women more often than LCC is.[82]^14 In the present study, we also analyzed the clinicopathological differences between the RCC and LCC patients included in our evaluation, and the age and sex differences we observed were in accordance with previous evidence. In our data, patients with RCC had a worse outcome than patients with LCC, although the difference was not significant, perhaps because of the small sample size; a worse outcome for RCC has been confirmed in an integrated meta-analysis based on 1,437,846 patients.[83]^15 A gene expression profile containing 101 RCC and 93 LCC samples was used in our study, and 1,961 DE genes upregulated in RCC and 1,832 DE genes upregulated in LCC were identified. By constructing the PPI network, we identified key genes with high degrees in the network: PCNA, TP53, HSP90AA1, CSNK2A1, UBB, LRRK2, ABL1, PRKACA, CAV1, and JUN.
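Selecting hub genes by node degree can be sketched as follows. This is an illustrative outline rather than the STRING and Cytoscape workflow itself: it assumes the interactions are available as (protein, protein, combined score) triples, applies the >0.4 combined-score cutoff stated in the methods, and ranks nodes by their number of neighbors. The protein labels, scores, and function names are hypothetical. The last helper uses the identity that, in an undirected network, the average number of neighbors equals twice the number of edges divided by the number of nodes, which reproduces the values reported in Table 3.

```python
from collections import defaultdict


def build_network(interactions, min_score=0.4):
    """Keep interactions above the score cutoff; return {node: set of neighbors}."""
    neighbors = defaultdict(set)
    for protein_a, protein_b, score in interactions:
        if score > min_score and protein_a != protein_b:
            neighbors[protein_a].add(protein_b)
            neighbors[protein_b].add(protein_a)
    return dict(neighbors)


def top_hubs(neighbors, k=5):
    """Return the k nodes with the highest degree as (node, degree) pairs."""
    ranked = sorted(neighbors, key=lambda node: (-len(neighbors[node]), node))
    return [(node, len(neighbors[node])) for node in ranked[:k]]


def average_neighbors(n_nodes, n_edges):
    """Average degree of an undirected network: each edge has two endpoints."""
    return 2 * n_edges / n_nodes


# Made-up interactions between hypothetical proteins P1..P5.
toy_interactions = [
    ("P1", "P2", 0.90),
    ("P1", "P3", 0.70),
    ("P1", "P4", 0.45),
    ("P2", "P3", 0.35),  # below the cutoff, so it is dropped
    ("P4", "P5", 0.60),
]
network = build_network(toy_interactions)
print(top_hubs(network, k=2))

# Node and edge counts as reported in Table 3.
print(round(average_neighbors(804, 1759), 2))  # RCC network
print(round(average_neighbors(589, 872), 2))   # LCC network
```

Ties in degree are broken alphabetically here, which is an arbitrary choice; Table 4 lists three LCC hub genes with the same degree of 18.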
PCNA has been identified as one of the most important molecular markers of proliferation, with roles in replication and in cancer cell growth, death, and maintenance.[84]^16 As an indispensable modulator of DNA synthesis, PCNA is involved in the regulation of various essential functions, including the repair and avoidance of DNA damage, control of the cell cycle, cell survival, chromatin assembly, and gene transcription.[85]^17 TP53, a well-studied tumor suppressor gene that encodes p53, is frequently mutated in the majority of human tumors, including colon cancer.[86]^18 The HSP90AA1 gene, which encodes heat shock protein 90α (Hsp90α), has been shown to regulate the stability of several proteins that are important for tumor progression and has been identified as a promising target for cancer treatment.[87]^19 CSNK2A1 has been reported to participate in tumorigenesis by phosphorylating multiple important proteins, and inhibition of CSNK2A1 could decrease the proliferation and invasiveness of cancer cells.[88]^20 The polyubiquitin gene UBB encodes the regulatory protein ubiquitin, and knockdown of UBB could effectively lower the level of ubiquitin, which is essential for the growth of cancer cells; UBB knockdown may thus be a potential anticancer treatment.[89]^21 LRRK2 has been associated with strong antitumor activity, such as suppression of the proliferation, migration, and invasion of tumor cells and induction of apoptosis along with cell cycle arrest.[90]^22 ABL1, a nonreceptor tyrosine kinase, has been reported to be dysregulated in several cancers.
Recent studies have proposed that the activation of ABL1 may be responsible for tumorigenesis through interaction with its downstream targets and the corresponding signaling pathways.[91]^23 Mutation of PRKACA has been identified in the pathogenesis of several tumors, and PRKACA is involved in several signaling pathways.[92]^24 CAV1 has been demonstrated to be an essential structural and signaling protein with a role in the formation, maintenance, and function of membrane caveolae. Growing evidence indicates that elevated expression of CAV1, resulting from aberrant hypomethylation of promoter CpG sites, contributes to the malignant progression of various human cancers, including colon cancer.[93]^25 The information gathered so far indicates that the JUN gene could play vital roles in apoptotic responses to DNA damage, cell stress, and cytotoxic drugs.[94]^26 In short, all the above genes have been reported in the published literature to be involved in tumorigenesis and progression, which may provide new ideas for therapeutic studies in RCC and LCC.

We reasoned that if RCC and LCC were homogeneous, the DE genes would fall into similar functional pathways, modules, or networks and would converge at higher functional levels despite the differing gene lists. We therefore performed functional analysis, including GO and pathway enrichment analyses. Most GO terms enriched by the upregulated DE genes in RCC were significantly associated with biosynthesis and metabolism at the BP level, basic cell structure at the CC level, and enzymatic and energy metabolism functions at the MF level. The GO term analysis showed that DE genes upregulated in LCC were mainly involved in cell adhesion, angiogenesis, collagen catabolic process, and regulation of the canonical Wnt signaling pathway at the BP level, extracellular components at the CC level, and binding functions at the MF level.
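The enrichment step behind these GO and pathway results is an over-representation test. DAVID reports a modified Fisher exact P-value (the EASE score); the sketch below substitutes the plain hypergeometric upper tail to show the idea. All counts and the function name are made up for illustration and are not taken from the dataset.

```python
from math import comb


def enrichment_p(n_background, n_term, n_selected, n_overlap):
    """Hypergeometric upper tail P(X >= n_overlap).

    n_background: genes in the background list
    n_term:       background genes annotated to the GO term or pathway
    n_selected:   genes in the submitted list (for example, the DE genes)
    n_overlap:    submitted genes annotated to the term
    """
    upper = min(n_term, n_selected)
    total = comb(n_background, n_selected)
    tail = sum(
        comb(n_term, k) * comb(n_background - n_term, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Toy example: 20 background genes, 5 annotated to a term,
# 4 genes submitted, 3 of which carry the annotation.
p_value = enrichment_p(n_background=20, n_term=5, n_selected=4, n_overlap=3)
print(f"P = {p_value:.4f}")
```

A term is called enriched when this probability falls below the chosen cutoff (P<0.05 in the present analysis). The EASE score is more conservative than the plain tail probability shown here.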
The GO analysis revealed the differences between RCC and LCC at three different functional levels. Furthermore, the enriched KEGG pathways of the upregulated DE genes in RCC included biosynthesis and metabolism pathways, proteasome, cell cycle, and RNA transport. The biosynthesis and metabolism pathways comprise many important processes, including glycolysis, gluconeogenesis, carbon metabolism, and amino acid biosynthesis, and changes in cell metabolism may result in transformation and tumor progression.[95]^27 The proteasome is a well-studied protein complex involved in cell survival and proliferation; its inhibitors may be used in cancer therapy because they shift the fine intracellular homeostatic equilibrium toward cell death.[96]^28 Extensive studies have critically reviewed the role of the cell cycle in the occurrence and development of various tumors, and cell cycle regulators are considered attractive targets in cancer therapy.[97]^29 RNA transport has been reported to be essential for the initiation and progression of cancers.[98]^30 Our research showed that the DE genes upregulated in LCC were related to protein digestion and absorption, pathways in cancer, ECM–receptor interaction, vascular smooth muscle contraction, and several important signaling pathways including the PI3K–Akt, Ras, Wnt, cGMP–PKG, calcium, and cAMP signaling pathways.
Recent evidence indicates that ECM–receptor interaction plays a crucial role in cancer cell biology and tumorigenesis.[99]^31 The PI3K–Akt signaling pathway has an important impact on the initiation and progression of tumorigenesis and regulates critical cellular functions including cell proliferation, differentiation, and apoptosis.[100]^32 Ras is a membrane-anchored protein that takes part in the generation of multiple intracellular signaling cascades, including the PI3K–Akt signaling mentioned above.[101]^33 The other signaling pathways also play a pivotal part in cancer development, including survival, proliferation, and metabolism.[102]^34^–[103]^38 The above analysis indicated that the DE genes did not converge at the functional level, which contributes to the differences between RCC and LCC.

Module analysis was used to find the most significant module of the constructed PPI network. The results showed that the DE genes of RCC in the most significant module participate in the proteasome, cell cycle, and DNA replication pathways, while the DE genes of LCC in the selected module were associated with ribosomes. The proteasome and cell cycle pathways were enriched again, which points to their crucial roles in the initiation and progression of RCC.
DNA replication stress is regarded as a hallmark of cancer because it is highly prevalent and likely promotes cancer development.[104]^39 The role of ribosome biogenesis in tumorigenesis has long been demonstrated, and selective inhibition of ribosomes could lead to selective damage to neoplastic cells.[105]^40 One seed gene, PSMB10, is an immunoproteasome gene that is highly expressed in most cancer types and is associated with the predisposition to and recurrence of several tumors.[106]^41 The other seed gene, EIF6, has been identified to play an important role in cell proliferation, and EIF6 overexpression could enhance the motility and invasiveness of cancer cells.[107]^42 These findings show once again that RCC and LCC may involve different pathways.

Taken together, our study provides a comprehensive bioinformatic analysis of the DE genes and pathways involved in RCC and LCC, which may be used as specific therapeutic targets in the treatment of RCC and LCC. Our results may provide new insights into the molecular mechanisms underlying the differences between RCC and LCC. However, further molecular biological experiments will be needed to determine the function of the identified genes in RCC and LCC.

Acknowledgments