Graphical abstract [29]graphic file with name ga1.jpg [30]Open in a new tab Keywords: Aging, Colon cancer, Transcriptome, Transcription factor, Survival Abstract Colon cancer is the fourth leading cause of cancer-related death, and exhibited clinical differences among patients of different ages, including malignancy, metastasis, and mortality rate. Few studies, however, focus on the communications between aging and colon cancer. Here we identified age-dependent differentially expressed genes (DEGs) in colon cancer using TCGA transcriptome data. Through analyzing multi-omics high throughput data, including ATAC-Seq, DNaseI-Seq and ChIP-Seq, we obtained six age-dependent transcription factors in colon cancer, and their age-dependent targets, significantly affecting patients’ overall survivals. Transcription factor ETS1 potentially functioned in both aging process and colon cancer progression through regulating its targets, RGL2 and SLC2A3. In addition, comparing with its relative lower expression levels in elderly patients, higher levels of RGL2 were detected in young patients, and significantly associated with larger tumor size, higher metastasis, and invasions of colon cancer, consistent with the clinical traits that young patients’ colon cancer exhibited late stages with more aggressiveness. Thus, these elements may serve as keys linking aging and colon cancer, and providing new insights and basis for mechanism researches, as well as diagnosis and therapies of colon cancer, especially in young patients. 1. Introduction Colorectal cancer is exceedingly prevalent, as the third most commonly diagnosed cancer worldwide, whose lethal rate ranks fourth, after lung cancer, liver cancer, and stomach cancer [31][1], [32][2]. The risk of developing colorectal cancer has been highly associated with demographic, behavioral, and environmental factors, including inflammatory bowel disease (IBD), colorectal cancer history, body mass index (BMI), dietary habits, and physical activities [33][1], [34][3]. At genomic levels, colorectal cancer was quite commonly accompanied by genetic and epigenetic alternations, which promoted tumorigenesis [35][4]. One of the responsible factors was chromosomal instability, which contained the loss of tumor suppressor genes and amplifications of oncogenes [36][4], [37][5], [38][6], affecting major signaling pathways (WNT, MAPK/PI3K, TGF-β, P53) [39][2]. Telomere dysfunction and DNA damage response alterations were two main causes underlying chromosomal instability, affecting genes responsible for critical cellular functions, including APC, KRAS, PI3K and P53 [40][2]. In addition, telomere dysfunction and DNA damage response alterations were also two leading factors responsible for aging process. Telomere dysfunction was highly correlated with limitations of cellular proliferative capacity, and could activate canonical DNA damage response pathway, leading to cellular apoptosis or senescence [41][7]. Telomere dysfunction could induce expression of p21 and p16 to initiate cellular senescence through p53 and RB tumor suppressor pathways [42][7], [43][8], [44][9], [45][10]. Loss of p53 function could restore many cellular defects caused by telomere dysfunction, including growth arrest and cellular apoptosis [46][11]. These researches indicated a hint of the links between senescence process and cancer progression. Considering the interplay of senescence and cancer, previous research showed that telomere dysfunction suppressed tumorigenesis through p53-dependent cellular senescence induction, accompanied by global induction of p53, p21 and senescence-associated-β-galactosidase staining [47][12]. Since pathological progressions were always complicated, senescent cells were reported to promote epithelial cell growth and tumorigenesis through senescence-associated secretory phenotypes in a proinflammatory context [48][13], [49][14]. Cellular senescence was thought to be one of the hallmarks of aging [50][15]. Thus, some connections might exit between aging process and cancer progression. Actually, age acts as an important risk factor related to the occurrence of colorectal cancer. According to the newest statistics provided by the American Cancer Society, the incidence rates for colorectal cancer declined by 3.3% annually among patients aged 65 years and older during the 2000 s, while increased by approximately 2% annually among adults aged younger than 50 [51][16]. Similar to the incidence rate patterns, colorectal cancer death rates decreased by 3% annually among patients aged 65 and older, while increased by 1.3% annually among adults aged younger than 50 [52][16]. Nevertheless, few researches focused on the molecular changes of this age-related differences, or whether some protecting mechanisms existed in old individuals. The underlying mechanisms accounting for this phenomenon remain unclear. Since colon cancer accounts for a large proportion of colorectal cancer incidence, we focused on colon cancer in this study. In this study, we extended our previous study on the interplay between senescence and cancer [53][39]. Integrating TCGA datasets and other published multi-omics datasets, including ATAC-Seq, DNaseI-Seq, ChIP-Seq, and RNA-Seq, we identified several transcription factors and their target genes, linking aging process and colon cancer progression, through multi-parallel analyses. 2. Materials and methods 2.1. Data processing and identifying DEGs For the collected RNA-Seq data of ETS1 KD and HOXA9 KD, clean reads were mapped to the human reference genome hg38 using hisat2 [54][17], [55][18] with default parameters. Gene expression levels were quantified using stringtie [56][18], [57][19]. Differential analyses were performed with edgeR [58][20], [59][21] in R environment [60][22]. Genes with P-value <0.05 were considered significant. Heatmap visualizations of DEGs were done with Pheatmap package [61][23]. For TCGA RNA-Seq data, quantifications of patients with age of 50 or less and age of 80 or more were considered, and read counts were processed in edgeR [62][20], [63][21] for differential analysis. Genes with P-value<0.05 were considered significant. 2.2. Pathway enrichment analysis KEGG pathway enrichments were done with geneSCF [64][24]. For graphical demonstrations, we presented significantly enriched items related with senescence, immune, and cancer, for clear visualizations. Items enriched with P-value <0.05 were considered significant. Bar plots of enrichment results were done with ggplot2 package [65][25], and alluvial plots were done with ggalluvial package [66][26]. GSEA (Gene set enrichment analysis) was done with GSEA software [67][53], [68][54]. 2.3. ATAC-Seq and ChIP-Seq data processing For ATAC-Seq data and DNaseI-Seq, Bowtie (v1.2.3) [69][27] or Bowtie2 (v2.3.4.3) [70][28] was used to align reads to human reference genome hg38 with default parameters. Peak calling was performed using MACS2 (v2.1.2) [71][29] with default parameters. For ChIP-Seq data, the peak files and target files were directly downloaded from the Cistrome data browser [72][30], [73][31]. The detailed peaks were modified from WASHU epigenome browser [74][32]. Targets were identified based on corresponding gene score greater than 0.2 in the downloaded files. 2.4. Identification of transcription factor binding site motifs All ATAC-Seq and DNaseI-Seq peaks were extracted for motif analysis using the HOMER motif finding tool with default parameters [75][33]. Motifs with P-value<0.05 were considered significant. 2.5. Survival analysis The clinical data for survival analysis were downloaded from UCSC Xena TCGA GDC hub. Survival analyses were performed with the survival package [76][34]. Cutoff thresholds were conducted with Q1 versus Q4 quartile for high and low expression groups of various genes, thus patients with the top 25% of gene expression belonged to high expression group, and patients with the minimum 25% of gene expression belonged to low expression group. Survival plots were proceeded with survminer package [77][35] in R environment [78][22]. 2.6. Data collection We downloaded expression profiles and clinical traits data of human colon cancer of TCGA and normal colon data of GTEx from UCSC Xena browser ([79]https://xenabrowser.net/datapages/). The ATAC-Seq and DNaseI-Seq datasets from [80]GSE83968 [81][36], including two ATAC-Seq data and eleven DNaseI-Seq data, were downloaded from the EBI (European Bioinformatics Institute) database ([82]https://www.ebi.ac.uk/). For ChIP-Seq data of ETS1 and PKNOX1, the peak files and targets were directly downloaded from the Cistrome data browser ([83]http://cistrome.org/db/) [84][30], [85][31]. The ETS1 KD dataset [86]GSE153852 [87][37] and HOXA9 KD dataset [88]GSE100144 [89][38] were utilized to confirm their regulatory relationships to potential targets. All these datasets were downloaded from GEO ([90]https://www.ncbi.nlm.nih.gov/geo/) unless otherwise specified. 3. Results 3.1. Identifying age-dependent differentially expressed genes in colon cancer patients Previously we found that age-dependent differentially expressed genes (DEGs) in senescent mouse embryonic fibroblasts could also function in cancers, including pancreatic cancer, colorectal cancer and cholangiocarcinoma [91][39]. To investigate the role of senescence in the progression of colon cancer, we obtained colon cancer data from UCSC xena browser TCGA GDC hub. Focusing on age-dependent regulators in colon cancer, we defined age of 50 or less as ‘young’ and age of 80 or more as ‘old’ ([92]Fig. 1A). Based on 61 young patients’ and 92 old patients’ RNA-Seq data, we identified 1096 DEGs as set 1 (639 up-regulated and 457 down-regulated in old patients, [93]Fig. 1B). We then obtained the DEGs between colon tumor tissues and adjacent normal tissues as set 2 ([94]Fig. 1C). Focusing on genes functioning in both aging and colon cancer progression, we obtained 803 common genes between DEGs of set 1 and set 2 ([95]Fig. 1D), one of which was RGL2, which is upregulated in colon tumor tissues comparing with adjacent normal tissues ([96]Fig. 1E), and in young patients comparing with old patients ([97]Fig. 1F). Taking the average expression levels as indicators, these 803 common genes exhibited gradual downregulations or upregulations from ‘young’ to ‘old’ patients, where age between 51 and 79 were defined as ‘middle’ ([98]Fig. 1G). Fig. 1. [99]Fig. 1 [100]Open in a new tab Characterizations of age-dependent differentially expressed genes (DEGs) in colon cancer. (A) Age division of young and old patients of colon cancer. Age of 50 or less was defined as young, and age of 80 or more was defined as old. (B) Volcano plot of age-dependent DEGs in colon cancer. (C) Volcano plot of DEGs in colon tumor tissues, comparing to adjacent normal tissues. Significant DEGs (P-value < 0.05) were labeled red based on the thresholds, indicated by dotted lines. X-axis stands for log[2] (fold change) and y-axis stands for -log[10] (P-value). (D) Venn plot of the common genes between age-dependent DEGs (from B) and cancer-dependent DEGs (from C) in colon cancer. (E) RGL2 was upregulated in colon tumor tissues, comparing with adjacent normal tissues. (F) RGL2 was upregulated in young colon cancer patients, comparing with old patients. *** P-value < 0.001, * P-value < 0.05, by Students’ t-test. (G) The gradual expression changes of the 803 common age-dependent DEGs from young to old colon cancer patients. Age between 51 and 79 was defined as ‘middle’. Expression levels of DEGs were normalized to z-scores, indicated by different colors. (H) KEGG pathway enrichment results of the four groups of common DEGs, which were down-regulated in both old patients comparing to young patients and colon tumor tissues comparing to adjacent normal tissues, up-regulated in both old patients and colon tumor tissues, up-regulated in old patients while down-regulated in colon tumor tissues, and down-regulated in old patients while up-regulated in colon tumor tissues. Pathways related to senescence, immune and cancers, were selected. All pathways presented were significantly enriched, with P-value < 0.05. (For interpretation of the references to colour in this