Abstract Esophageal squamous cell carcinoma (ESCC) is malignant while the carcinogenesis is still unclear. Here, we perform a comprehensive multi-omics analysis of 786 trace-tumor-samples from 154 ESCC patients, covering 9 histopathological stages and 3 phases. Proteogenomics elucidates cancer-driving waves in ESCC progression, and reveals the molecular characterization of alcohol drinking habit associated signatures. We discover chromosome 3q gain functions in the transmit from nontumor to intraepithelial neoplasia phases, and find TP53 mutation enhances DNA replication in intraepithelial neoplasia phase. The mutations of AKAP9 and MCAF1 upregulate glycolysis and Wnt signaling, respectively, in advanced-stage ESCC phase. Six major tracks related to different clinical features during ESCC progression are identified, which is validated by an independent cohort with another 256 samples. Hyperphosphorylated phosphoglycerate kinase 1 (PGK1, S203) is considered as a drug target in ESCC progression. This study provides insight into the understanding of ESCC molecular mechanism and the development of therapeutic targets. Subject terms: Oesophagus, Proteomics, Oesophageal cancer, Cancer genomics __________________________________________________________________ The progression of oesophageal squamous cell carcinoma (ESCC) from early to advanced stages requires comprehensive molecular characterisation. Here, the authors perform a proteogenomics analysis of ESCC patient samples across nine histopathological stages and three phases, identifying key alterations and paths for progression. Introduction Esophageal cancer (EC) is a malignant gastrointestinal carcinoma, ranking the seventh most common cancer and the sixth leading cause of cancer-related death worldwide^[70]1. Esophageal adenocarcinoma (EAC) and esophageal squamous cell carcinoma (ESCC) are the two major histologic subtypes of EC, of which EAC is more prevalent in western countries^[71]2, whereas ESCC predominantly occurs in East Asia, particularly in China and Japan^[72]3, indicating the diverse lifestyle of the countries as a major etiological factor of EC. The other etiological factors of EC include gender, age^[73]4, and the habits of drinking/smoking^[74]5. However, the molecular signatures, which are associated with the risk factors in EC progression, are still unknown. Generally, the carcinogenesis of tumor development hall long-drawn-out process, and genome data act as a “fossil” record of how a tumor came to be. Several large-scale ESCC cohorts, including The Cancer Genome Atlas (TCGA)^[75]6, presented genomic aberrations and identified highly mutated genes (e.g., TP53) in the advanced stages of ESCC. However, the first occurrences of the mutations/key events and the related effects during the carcinogenesis of ESCC are poorly understood. Liu et al. showed that earlier and advanced stages of ESCC shared several significantly mutated genes^[76]7. Thus, the mutations were cumulative in ESCC progression, and the key mutation/events which were found in the advanced stages might exist in the earlier stages, whereas there is still a lack of proteogenomic landscape in ESCC progression. In addition, the molecular mechanism of the diversity and tumor heterogeneity of ESCC is poorly understood, imposing a challenge for developing ESCC therapeutic strategies. Histologically, the esophageal wall includes the epithelium, lamina propria, muscularis mucosa, submucosa, muscularis externa, and adventitia^[77]8. The depth of infiltration of the cancer cells, determining the stage of lesions, was measured at the deepest point of their penetration in corresponding layers^[78]9. Surgery is the predominant curative treatment strategy in advanced stages where the cells invade muscularis externa (T2–T4 stages), with poor quality of life (QOL) and a low 5-year survival rate (<30%)^[79]10. Though the recent advances in endoscopic submucosal dissection (ESD)^[80]11 have achieved the early detection of ESCC patients (T1 stage) with higher QOL and significantly improved overall survival rate (>90%)^[81]12, the complexity of the early ESCC progression and the extreme trace amount of tissue samples in different stages have limited in portraying the multi-omics molecular landscape of ESCC. In this study, we perform a comprehensive multi-omics analysis of 786 trace-tumor samples from 154 ESCC patients. The integrative multi-omics dataset reveals the stage-specific and risk factor-associated molecular characterization, and defines the cancer-driving waves along with the mutation accumulation in EC progression. Proteogenomics uncovers key events in the transit of the phases and the trajectory analysis shows 6 major tracks and their molecular characteristics during the carcinogenesis of ESCC. Results Overview of proteogenomic landscape in ESCC progression We performed multi-omics-based profiling of trace 786 samples collected from 154 ESCC patients who had not experienced prior chemotherapy or radiotherapy. The clinicopathological characteristics of patients and tumors are summarized in Supplementary Table [82]1. Subsequently, 22 substages during ESCC progression from healthy esophageal tissue to tumor development were established for these samples following WHO and Japanese pathology diagnostic criteria^[83]13. The number of substages identified from the ESCC patients varied from 4 to 16, and the tissue samples from the corresponding substages were separately dissected from the formalin-fixed, paraffin-embedded (FFPE) slides (Supplementary Fig. [84]1a). All ESCC samples in our cohort were dissected 3 mm thick and stood up one by one in the embedding, and were then marked in the hematoxylin and eosin (H&E)-stained sections (Supplementary Fig. [85]1c, d). In total, 786 samples were collected and classified into 9 histopathological stages covering 22 substages in our cohort (Supplementary Table [86]2), including stage 1 (normal tissue stage), stage 2 (hyperplasia stage), stage 3 (Tis stage), stage 4 (lamina propria cancer stage), stage 5 (muscularis mucosa stage), stage 6 (submucosal invasion cancer stage a), stage 7 (submucosal invasion cancer stage b), stage 8 (T2 stage), and stage 9 (T3 stage) (Fig. [87]1a and Supplementary Data [88]1a; Methods). The normal (stage 1)/tumor cell purity of all samples was over 95%, indicating the high quality of all samples of our cohort (Supplementary Fig. [89]1e). In addition, we collected another 256 samples as an independent validation cohort from 49 early-stage ESCC patients, covering 6 of 9 histopathological stages as stages 1, 2, 3, 4, 5, and 7 (Fig. [90]1a and Supplementary Table [91]3; Methods). That represented the advantages of our samples in a pathological region-resolved mode, providing the chance to portray molecular profiles of ESCC in a time-resolved mode. Subsequently, we performed mass spectrometry (MS) profiling of all 786 samples, phosphoproteomic profiling of 145 samples (Supplementary Table [92]4 and Supplementary Data [93]1b), and whole-exome sequencing (WES) of 102 samples in the main cohort (Fig. [94]1a, Supplementary Figs. [95]1b and [96]2a, and Supplementary Data [97]1c, d; Methods). To demonstrate the findings and results in the main cohort, we also performed proteomic profiling of 256 samples in the validation cohort. Fig. 1. The multi-omics landscape in ESCC progression. [98]Fig. 1 [99]Open in a new tab a Overview of the experimental design and the number of samples for the genomic, proteomic, and phosphoproteomic analyses. ESCC esophageal squamous cell carcinoma. b The genomic profile of ESCC progression. Top: the mutation number and types of all the samples from early to progressive ESCC. Bottom: the somatic copy number alterations of all the samples from early to progressive ESCC. The mutation frequencies are shown by a bar plot at the right panel. NT phase: non-tumor phase, IEN phase: intraepithelial neoplasia phase, A-ESCC phase: advanced-stage ESCC phase. c The gain of neo-mutations at all stages in ESCC progression. d Analysis of the mutations loads of diverse cohorts. EESCC cohort: early ESCC cohort. e The number of the identified proteins of 786 samples (Kruskal–Wallis test, p < 2.2E–16). f Boxplot showing the number of the phosphosites identifications of 145 samples (Kruskal–Wallis test). Boxplot shows median (central line), upper and lower quartiles (box limits), 1.5× interquartile range (whiskers). ****p < 1.0E–4, ***p < 1.0E–3, **p < 0.01, *p < 0.05, ns. > 0.05. Source data are provided as a Source data file. WES profiling identified 9547 mutations in the Fudan cohort (this study) (Supplementary Data [100]2a). The top mutations were TP53, MUC16, FAT3, SYNE1, AKAP9, FAT4, etc. (Fig. [101]1b). TP53, frequently mutated in EC^[102]14, was the top-ranked mutation in the Fudan cohort, and was co-occurrence with the mutation of KMT2D (Supplementary Fig. [103]2b). In our study, we observed that the number of mutations gradually cumulated during ESCC carcinogenesis, ranging from 422 mutations in hyperplasia stage to 5280 mutations in the T3 stage. Neo-mutations were indicated as those just appearing at a certain stage (Methods). In our cohort, we found that the number of the neo-mutations peaked at the Tis stage (n = 1330), and the T2 (n = 3003), indicating the significant events during carcinogenesis (Fig. [104]1c). Thus, nine pathological stages of our cohort were grouped into three phases (NT, IEN, and A-ESCC) (Fig. [105]1c), which allowed us to explore the key mutational events and the corresponding impacts in ESCC progression. Observation of mutation loads of the Fudan cohort and other ESCC cohorts (TCGA cohort^[106]15, Lin’s cohort^[107]16, and Song’s cohort^[108]17) showed the lowest mutation loads in the NT phase in the Fudan cohort (Fig. [109]1d). Liu et al. reported fewer mutations were detected in the esophageal nondysplastic epithelium (simple hyperplasia)^[110]7, which was also observed in our cohort that the mutation loads of the NT phase were lower (Wilcoxon signed-rank test, p < 0.05, A-ESCC vs. NT ratio = 9.36, IEN vs. NT ratio = 2.08) (Supplementary Fig. [111]2c), indicating the low mutation burden of our cohort was due to the low mutation loads of the early-stages of ESCC. Collectively, we built a comprehensive genomic landscape in ESCC progression, and presented the difference between the early and advanced stages of ESCC. At the protein and phosphoprotein levels, proteomic analysis was performed using a label-free quantification strategy^[112]18. Protein abundance was first calculated by intensity-based absolute quantification (iBAQ)^[113]19 and then normalized as the fraction of total (FOT). Spearman’s correlation coefficient of HEK293T cells was 0.91, indicating the consistent stability of our MS platform (Supplementary Fig. [114]2d and Supplementary Data [115]3a). The gradually decreased (Spearman’s) correlation coefficient of the nine histopathological stages reflected the increased tumor heterogeneity during the carcinogenesis of ESCC, highlighting the importance of exploring molecular characteristics in ESCC progression (Supplementary Fig. [116]2e). With the advancement of the stages of ESCC, the numbers of protein identifications were slightly increased (Kruskal–Wallis test, p < 2.2E–16) from ~5000 in stage 1 (normal tissue) to ~7000 in stages 8 and 9 (T2 and T3 stages) with a total of 15,071 in 786 samples at 1% false discovery rate (FDR) (Methods) (Fig. [117]1e and Supplementary Fig. [118]2f). In addition, the reference proteome was highly dynamic based on the protein abundance, which spanned over eight orders of magnitude (Supplementary Fig. [119]2g). In addition, the ESCC biomarkers identified in previous studies (several ESCC tissues or cell lines), including ACTA2, TAGLN^[120]20, HSPA9, PDIA4, PLEC, POSTN, PSAP, and THBS1^[121]21 were also covered in the Fudan cohort (Supplementary Data [122]3b). At the phosphoprotein level, a total of 52,856 phosphosites corresponding to 7612 phosphoproteins were identified in 145 samples (Supplementary Fig. [123]2h, i and Supplementary Data [124]3c). During the process of ESCC carcinogenesis, the number of phosphosite identifications was slightly increased (Kruskal–Wallis test, p = 5.2E–3) (Fig. [125]1f). In the validation cohort, label-free quantification measurement of 256 samples resulted in a total of 12,383 protein groups with a 1% FDR at the protein and peptide levels (Supplementary Fig. [126]2j). Consistent with the findings in the main cohort, with the advancement of the stages of ESCC, slightly elevated protein identifications were detected in the validation cohort (Kruskal–Wallis test, p < 1.0E–4) (Supplementary Fig. [127]2k). Compared to the NT phase, more protein identifications were identified in the IEN phase both in the main cohort and validation cohort (Wilcoxon rank-signed test, p < 1.0E–4) (Supplementary Fig. [128]2l). Overall, we established a comprehensive landscape of ESCC progression at the multi-omics level. The SBS16 signature, associated with alcohol drinking, promoted DNA replication in the IEN phase Except diverse lifestyle of the countries^[129]2,[130]3, alcohol drinking is one of the key risk factors of ESCC^[131]22. Moody et al. pointed that SBS16 is the significant signature associated with alcohol drinking^[132]22, whereas the impacts of SBS16 signature on the molecular level in the ESCC progression have not been revealed. Furthermore, SBS16 signature shows a negative association with the overall survival of ESCC patients (referred from TCGA cohort, log-rank test, p = 0.049) (Supplementary Fig. [133]3a). These findings allowed us to explore the impacts of SBS16 in ESCC patients with drinking habit at the molecular level. Of note, gene ontology (GO) analysis revealed that DNA replication was enriched in the ESCC patients with SBS16 signature (Supplementary Fig. [134]3b). Among the mutations, significantly related to SBS16 signature, we found that OLFM4 mutation upregulated its corresponding protein expression (Wilcoxon signed-rank test, OLFM4 Mut vs. WT ratio = 4.6, FDR = 3.4E–5) (Fig. [135]2a, b). Generally, OLFM4 promotes S-phase transition in cancer cell proliferation^[136]23. In our cohort, OLFM4-positive-associated proteins were involved in cell proliferation, including DNA replication, cell division, etc. (Fig. [137]2c, d). Moreover, OLFM4 displayed a positive association with CDK1 and CDK2 at the protein level (Pearson’s R = 0.44 for CDK1 and 0.54 for CDK2, p = 0.013 for CDK1 and 2.5E–3 for CDK2) (Supplementary Fig. [138]3c), which is a key regulator in the transition from S-phase to G2/M phase in cell cycle^[139]24. The mutation of OLFM4 showed positive impacts on the protein levels of cell proliferation-related markers, such as CDK2, MCM3/5/6, etc. (Fig. [140]2d). Besides, over-phosphorylation of DNA replication-related proteins was also detected during the carcinogenesis of ESCC (Kruskal–Wallis test, FDR < 0.05, A-ESCC and IEN vs. NT ratio ≥ 2) (Fig. [141]2e). To further validate the findings in our cohort, we analyzed the impacts of OLFM4 in other ESCC cohorts, such as the Li’s cohort^[142]25 and the Liu’s cohort^[143]26, and found that OLFM4 displayed positive association with DNA replication-related proteins at the protein level in these cohorts (Fig. [144]2d and Supplementary Fig. [145]3d). Taken together, SBS16 signature was a key event in the transition from the NT phase to the IEN phase for ESCC patients with a drinking habit, in which OLFM4 mutation showed positive impacts on DNA replication (Fig. [146]2j). Fig. 2. The risk factor-associated mutational signatures in ESCC progression. [147]Fig. 2 [148]Open in a new tab a Heatmap showing the significantly positive associated mutations in SBS16 signature (two-sided Fisher’s exact test). ***p = 1.4E–4 (BCKDK), **p = 2.7E–3 (CEP104), ***p = 1.4E–4 (LRRC31), ***p = 4.5E–4 (OLFM4), ***p = 1.5E–3 (DSC3), ***p = 1.5E–3 (PGC). The square directs to a subset of patient samples used for WES. b Volcano analysis of the impacts of significantly positive associated mutations in SBS16 signature on their counterpart proteins expression (two-sided Wilcoxon signed-rank test). c Represented pathway enrichment that was positively correlated with OLFM4. d Heatmap showing the protein levels of DNA replication in OLFM4 mutation group vs. WT group (two-sided Wilcoxon signed-rank test, BH-adjusted *p < 0.05). e The expression (log[2]-transformed Intensity, median) of represented DNA replication-related phosphoprotein in ESCC progression (Kruskal–Wallis test, BH-adjusted *p < 0.05). A total of 145 samples for phosphoproteomic profiling are used in this analysis. n (NT) = 57, n (IEN) = 62, n (A-ESCC) = 26 biologically independent samples examined. f The correlation between etiological factors (top) and the significantly associated mutations of APOBEC signature (bottom) (two-sided Fisher’s exact test). *p = 0.012 (Phase), **p = 4.2E–3 (Habit), **p = 4.6E–3 (DCTN2), **p = 4.6E–3 (EPS8), **p = 4.6E–3 (CENPE). g Venn diagram depicting the number of the overlapped proteins overrepresented in the APOBEC signature and non-smoking/drinking ESCC patients (left, two-sided Wilcoxon signed-rank test, BH-adjusted *p < 0.05), and the associated signaling pathways (right). h Venn plot (left) showing the overlapped proteins significantly correlated with DCTN2 at the gene and protein levels (two-sided Wilcoxon signed-rank test, BH-adjusted *p < 0.05). Volcano plot (right) depicting the correlation between DCTN2 and the overlapped proteins (n = 86, two-sided Pearson’s correlation test). i Heatmap (left) presenting the represented chromosomal/spindle components and cell proliferation markers overrepresented in the DCTN2 mutation group (two-sided Wilcoxon signed-rank test), and correlation (right) between DCTN2/RUVBL1 and represented chromosomal/spindle components and cell proliferation markers (two-sided Pearson’s correlation test). j A brief summary of the impacts of the SBS16 signature (top) and APOBEC signature (bottom) in ESCC progression. A total of 102 samples for WES are used in the analysis. ****p < 1.0E–4, ***p < 1.0E–3, **p < 0.01, *p < 0.05, ns. > 0.05. Source data are provided as a Source data file. The APOBEC signature was overrepresented in the non-smoking/drinking ESCC patients To explore the specific etiological factors that might contribute to the mutagenesis of ESCC, we adopted a non-negative matrix factorization (NMF) algorithm^[149]27 to extract mutational signatures from the WES data of the Fudan cohort and other ESCC cohorts, such as Moody’s cohort^[150]22, TCGA cohort^[151]15, Lin’s cohort^[152]16, and Song’s cohort^[153]17. In our cohort, DNA repair signature was detected as early as in the NT phases and lasted till the A-ESCC phase (Fisher’s exact test, p = 2.8E–3, 41.7%, 31.8%, 70.6% for NT/IEN/A-ESCC phases, respectively) (Supplementary Fig. [154]3e, f), indicating DNA damage is one of the causes leading to esophageal carcinogenesis. In addition, we found APOBEC signature was prominent in ESCC progression and was specially detected as early as in the IEN phase (Fisher’s exact test, p = 0.012, 66.7%) (Fig. [155]2f and Supplementary Fig. [156]3g). Similarly, we confirmed that APOBEC signature was detected in the earlier stage of ESCC (Fisher’s exact test, p = 0.029, 50%) in the TCGA cohort^[157]28 (Supplementary Fig. [158]3h). Integration of the findings in the Moody’s cohort^[159]22, we found APOBEC signature was prominent in the non-smoking/drinking ESCC patients (Fisher’s exact test, p = 8.4E–3, 42.5%), in which higher enrichment score was detected (Wilcoxon signed-rank test, FDR = 5.1E–4, non-smoking/drinking vs. smoking vs. drinking ratio = 1.78) (Supplementary Fig. [160]3i, j). To explore the impacts of APOBEC signature in the non-smoking/drinking ESCC patients, we integrated the highly expressed proteins both detected in the APOBEC signature and ESCC patients with non-smoking/drinking habit, and discovered these proteins were correlated with cell proliferation process (Fig. [161]2g). Somatic mutations analysis revealed that the mutations of CENPE, EPS8, and DCTN2, were positive associated with APOBEC signature (Fisher’s exact test, p < 0.05) (Fig. [162]2f). We further investigated the impacts of these three mutations, and found DCTN2 mutation upregulated its corresponding protein expression (Wilcoxon signed-rank test, FDR = 0.040, Mut vs. WT ratio = 2.0) at the protein level (Supplementary Fig. [163]3k). In addition, only the protein level of DCTN2 was significantly overrepresented in the APOBEC signature group (Wilcoxon signed-rank test, FDR = 5.1E–5) (Supplementary Fig. [164]3k). These findings indicated the potential effects of DCTN2 mutation in the patients with APOBEC signature. DCTN2, a subunit of dynactin, binds to both microtubules and cytoplasmic dynein, and is involved in a diverse array of cellular functions, including spindle formation, chromosome movement, nuclear positioning, etc.^[165]29. Among the dynein families, we found that DYNLRB1 exhibited a positive correlation with DCTN2 (Pearson’s R = 0.30, p = 4.3E–3) at the protein level (Supplementary Fig. [166]3l). Dynein families share a conserved motor domain coupled cycles of ATP hydrolysis with conformational changes to produce movement^[167]30. To determine the activation and movement of dynein in mitosis, we integrated DCTN2 positively correlated proteins, and found RUVBL1 showed a significant correlation with DCTN2 (Pearson’s R = 0.54, p = 4.3E–8) at the protein level (Fig. [168]2h). Generally, RUVBL1 exhibits DNA- and nucleosome- activated ATPase activity and catalyzes ATP-dependent nucleosome sliding in mitosis^[169]31,[170]32. In our cohort, we found DCTN2 mutation had positive impacts on the protein level of chromosomal/spindle components, evidenced by the markers including SMC2, SMC3, etc. (Fig. [171]2i). As well as RUVBL1, DCTN2 showed a significantly positive correlation with chromosomal/spindle components and markers of tumor cell proliferation (e.g., TOP2A, MKI67, etc.)^[172]33 (Fig. [173]2i). These findings indicated the activation of microtubule and spindle in G2/M in the mitosis. Taken together, the links of the findings in our cohort and the Moody’s cohort revealed that APOBEC signature was prominent in the ESCC patients with non-smoking/drinking habit, in which DCTN2 mutation upregulated the protein level of DCTN2, elevating mitosis and cell proliferation (Fig. [174]2j). The gain of chromosome 3q was a key event in the transmit from the NT phase to the IEN phase To explore the impacts of key arm events, we performed whole exome-based somatic copy number alterations (SCNAs) analyses based on WES data and examined the regulatory effects of 23,109 SCNAs on protein expressions of 102 samples in ESCC progression (Supplementary Data [175]3d). The integrated genomic data of all phases illustrated that the gain of chromosomes 3q (chr3q), which was also observed in the TCGA ESCC cohort, was the top-ranked arm event, which was prominent in the IEN phase (Fisher’s exact test, p = 2.9E–3, 45.5%) (Fig. [176]3a and Supplementary Fig. [177]4a–c). In addition, the individual regions at the chr3q gain also showed significant association in ESCC progression, including chr3q26.1 gain, chr3q29 gain, chr3q22.1 gain, and chr3q12.2 gain (Supplementary Fig. [178]4d). To explore the biological functions of chr3q gain, we performed cis-effects analysis of the genes with CNA regions at the protein level, and found 20 genes at chr3q gain perturbation profiles had significantly positive cis effects on their associated proteins (Fig. [179]3b). Interestingly, nearly 35% genes at chr3q gain were positively associated with Ca^2+ signal. Fig. 3. Integrative omics analyses of early ESCC samples. [180]Fig. 3 [181]Open in a new tab a The significant arm events of 102 samples in ESCC progression. The gain and loss events are highlighted in red and blue, respectively. b Volcano plot showing the cis- correlation of the SCNA (x-axis) and the associated –log[10] (p value) (y-axis) on the genes at chr3q gain (two-sided Spearman’s correlation test). c The cis SCNA-protein regulations of significantly correlated genes (top) on their corresponding proteins expression (bottom) (two-sided Spearman’s correlation test, p < 0.05). d Dependency map-supported ([182]https://depmap.org) panels showing relative survival averaged across all available ESCC cell lines after depletion of the indicated genes by RNAi or CRISPR. The right shows Pearson’s correlation and p value of these genes with PCNA and MKI67 at the protein level (two-sided Pearson’s correlation test). e A brief summary of the impacts of chr3q gain. f Volcano plot showing the impacts of the top ten mutations of ESCC progression on proteins expression (log[2]-transformed Intensity) (x-axis) and the associated –log[10] (FDR) (y-axis) (two-sided Student’s t-test). g Scatterplot showing the relationship between log[10] GSK3A and log[10] MACF1 expression at the protein level in the Fudan cohort (two-sided Pearson’s correlation test, mean ± SD). h Represented pathway enrichment of proteins that was positively correlated with MACF1 in the Fudan cohort (top) and the TCGA ESCC cohort (bottom). Biological pathways are analyzed from the GO/KEGG database. i Heatmap showing the impacts of the mutation of MACF1 on the expression of Wnt signaling-related proteins in ESCC progression (Kruskal–Wallis test, BH-adjusted *p < 0.05). The square directs to a subset of patient samples used for WES (n = 102). j A brief summary of the impacts of the mutation of MACF1. ****p < 1.0E–4, ***p < 1.0E–3, **p < 0.01, *p < 0.05, ns. > 0.05. Source data are provided as a Source data file. The Ca^2+ signal is identified as a key regulator in processes of excitable cells (e.g., neurons) and non-excitable cells (including those of the epithelia)^[183]34, where it controls a diverse array of processes such as secretion, proliferation^[184]35, and promotes cancer cell survival^[185]36. In our cohort, all the amplifications of genes in the Ca^2+ signal were detected in the IEN phase (Fig. [186]3c), suggesting the gain of chr3q event was detected as early as in the ESCC early stage. In addition, we found the amplifications of those genes also had significantly positive effects on other parallel proteins in the Ca^2+ signal at the protein level, such as the amplification of GMPS was positively associated with the protein levels of GFM1, OPA1, etc. (Fig. [187]3c). To assess the effects of these amplified genes, we annotated outliers for the degree of which CRISPR- or short hairpin RNA (RNAi)-mediated depletion reduced ESCC cell lines^[188]37. As a result, we found the deletion of three genes (GMPS, GFM1, and OPA1) had negative effects on the proliferation of ESCC cell lines (Fig. [189]3d and Supplementary Data [190]4a, b). In addition, the three genes (GMPS, GFM1, and OPA1) exhibited a positive correlation with PCNA and MKI67, makers of tumor cell proliferation^[191]33 (Fig. [192]3d). Furthermore, we found that the other important genes at chr3q (e.g., TP63/PIK3CA/SOX2) were also prominent in the IEN phase (Fisher’s exact test, p < 0.05, 59.1%, 70.9%, 59.1% for TP63/PIK3CA/SOX2 respectively) (Supplementary Fig. [193]4e). Specifically, the amplifications of TP63 at chr3q28 showed positive impacts on DNA replication (Supplementary Fig. [194]4f, g), further indicating the functions of chr3q gain in the transmit process from the NT phase to the IEN phase, and implying the impacts of TP63 amplification on DNA replication in ESCC progression. Taken together, the chr3q gain was a driven event in the transmit process from the NT phase to the IEN phase, leading to the activation of Ca^2+ signal and cell proliferation in the IEN phase (Fig. [195]3e). The impacts of genomic aberrations in ESCC progression Notably, TP53, highly mutated in ESCC^[196]7, was the top-ranked mutation in ESCC progression, and was prominent in the younger male ESCC patients (Fisher’s exact test, p = 0.044, 65.8%) in our cohort (Supplementary Fig. [197]4h). To explore the impacts of TP53 mutation in ESCC progression, we incorporated the alterations of significantly upregulated proteins (SUPs, Wilcoxon rank-signed test, FDR < 0.05) (Supplementary Fig. [198]4i). As well as in the TCGA cohort^[199]15, the GO enrichment disclosed that those overlapped SUPs were related to DNA replication and cell cycle (e.g., MCM3/4/5/6, CCNK, PPP1R8, etc.), and ECM signaling (e.g., ITGA2/3/5, LAMA3/5, etc.), which was also overrepresented in the tumor tissues compared to the paired non-cancerous adjacent tissues (NATs) in the Liu’s cohort^[200]26 (Supplementary Fig. [201]4j, k). Corresponding to the stages in ESCC progression, TP53 mutation displayed positive impacts on DNA replication/cell cycle in the IEN phase, and ECM signaling in the A-ESCC phase (Supplementary Fig. [202]4l). Among the top ten mutations in ESCC progression, the mutation-protein correlation analysis revealed that MACF1 mutation upregulated its counterpart protein level (t-test, FDR = 5.5E–5) (Fig. [203]3f and Supplementary Fig. [204]4m). In addition, gradually increased expression of MACF1 was detected (Kruskal–Wallis test, FDR = 6.3E–9, A-ESCC vs. NT ratio = 1.5E + 3) at the protein level in ESCC progression (Supplementary Fig. [205]4n). Generally, MACF1 is a multidomain protein that associates with microfilaments and microtubules, and binds to a complex (e.g., CTNND1, GSK3B, etc.) in Wnt signaling^[206]38. In our study, significant correlations were found between MACF1 and GSK3A (Pearson’s R = 0.46, p = 3.0E–4), and MACF1 and GSK3B (Pearson’s R = 0.28, p = 0.027) at the protein level (Fig. [207]3g and Supplementary Fig. [208]4o). In addition, consistent results were found in TCGA cohort^[209]15 at the RNA level (Supplementary Fig. [210]4p), inferring the mutation of MACF1 would activate Wnt signaling. To investigate the impacts, we integrated the MACF1 positively associated proteins in our cohort and TCGA cohort, and found that those proteins participated in Wnt signaling (Fig. [211]3h). Of note, the incorporation of the proteins in Wnt signaling affected by MACF1 mutation, were highly expressed in the A-ESCC phase (Kruskal–Wallis test, FDR < 0.05, A-ESCC vs. NT ratio ≥ 2), such as GSK3A, GSK3B, CSNK1A, CSNK2B, DVL3, etc. (Fig. [212]3i). Furthermore, the CRISPR-mediated depletion of the genes (e.g., CSNK2B, CSNK1A, etc.) reduced the proliferation of ESCC cell lines (Fig. [213]3i). At the phosphoprotein level, the overrepresented phosphorylation of CTNNB1 (T551, S191) and CTNND1 (S346, Y865), were observed in the A-ESCC phase (Kruskal–Wallis test, FDR < 0.05, A-ESCC vs. NT ratio ≥ 2) (Supplementary Fig. [214]4q). Therefore, the mutation of MACF1 upregulated the protein level of MACF1, which bound to the Wnt complex and activated Wnt signaling in the A-ESCC phase (Fig. [215]3j). Collectively, we uncovered that TP53 mutation observed in the whole ESCC stages exhibited diverse impacts during ESCC carcinogenesis, and MACF1 mutation had roles in promoting ESCC progression in the A-ESCC phase. Proteomic characterization of three phases in ESCC progression During the carcinogenesis of ESCC, three phases covered the NT phase, the IEN phase, and the A-ESCC phase. Specifically, the NT phase contained stage 1 and stage 2, the IEN phase covered stage 3 last till to stage 7, and the A-ESCC phase included the advanced stages T2 and T3 (Supplementary Fig. [216]5a). To explore the characteristics of the three phases of ESCC, we analyzed the TMB and differentially expressed proteins (DEPs) of the three phases. As a result, the comparative analysis revealed that the NT phase, with the lowest TMB, was featured by metabolic process (e.g., KLK12/13, ALOX12, etc.) and inflammatory response (e.g., IL36A, SERPINB1, etc.) (Supplementary Figs. [217]2c and [218]5c). DNA repair signature was detected in the IEN phase, indicating DNA damage was initiated. To react to the external damage, the proteins involved in the inflammatory response (e.g., IL36A, SERPINB1, etc.) were highly expressed in the NT phase. As shown in Figs. [219]2 and [220]3, the IEN phase of ESCC had higher TMB compared to the NT phase, and was featured with the chr3q gain. The IEN overrepresented proteins participated in the oncogenic-related pathways, such as cell cycle (e.g., CDK1/2, RB1, etc.), DNA repair (e.g., LIG1, PARP1, etc.), EGFR signaling pathway (e.g., EGFR, SOS1, etc.), and so on. Comparatively, the A-ESCC phase of ESCC, with the highest TMB, was characterized by the dominant pathways of Wnt signaling (e.g., WNT2B, GSK3A, etc.) and glycolysis (e.g., PGK1, ENO3, etc.). To further demonstrate the findings, we analyzed the DEPs (Wilcoxon rank-signed test, FDR < 0.05, IEN vs. NT ratio ≥ 2 or ≤ 0.5) of the phases in the main cohort (n = 786) and the validation cohort (n = 256), and found consistent findings of the characterizations of the phases both in the main cohort and validation cohort. Specifically, the primary functions of esophagus were dominant in the NT phase, such as metabolic processes including keratinization (e.g., KLK12/13, CSTA, etc.), lipid and amino acid metabolism (e.g., CYP4F22, ALOX12, etc.) (Supplementary Fig. [221]5b–d). In addition, the overrepresented proteins in the IEN phase participated in the oncogenic-related pathways, such as cell cycle (e.g., CDK1/2, RB1, etc.), DNA repair (e.g., LIG1, PARP1, etc.), EGFR signaling pathway (e.g., EGFR, SOS1, etc.), etc. (Supplementary Fig. [222]5b–d). These consistent findings in the main cohort and validation cohort indicated the benign of the NT phase, and the relative malignancy of the IEN phase, highlighting the importance of investigating the key molecular events during the carcinogenesis of esophagus. A carcinogenesis path with eight dynamic waves in ESCC progression The proteogenomics on three phases indicated a temporal correlation between genomic aberrations and proteomic alterations in ESCC progression. To further portray molecular profiles of ESCC in a time-resolved mode, we split the whole ESCC progression into 22 substages. Visualization of the abundance of the E2 proteins (n = 6885; Methods) by principal component analysis (PCA) (Methods) differentiated the proteome profiles for the 22 substages, which clearly discriminated the proteomes of the early and the advanced ESCC stages. In addition, PCA of early ESCCs displayed obvious diversity among the substages along with ESCC progression (Fig. [223]4a). These results further showed the distinct profiles of the substages in ESCC progression. Fig. 4. The temporal driver pathway waves in ESCC progression. [224]Fig. 4 [225]Open in a new tab a Principal component analysis (PCA) of the Fudan cohort. Left: PCA of all 786 ESCC samples (including NT phase, IEN phase, and A-ESCC phase); Right: PCA of 746 early ESCC samples (NT phase and IEN phase). b Heatmap analysis of the dynamic switches during the carcinogenesis of ESCC (Kruskal–Wallis test). Left: heatmap analysis of DEPs of the 22 substages in ESCC progression. Right: the driver pathway waves of 8 panels in ESCC progression. c The mutations are significantly associated with stages in ESCC progression (two-sided Fisher’s exact test). The highlighted mutations (right) are exclusively co-mutations (two-sided Fisher’s exact test). The square directs to a subset of patient samples used for WES (n = 102). p values of the co-mutations with AKAP9: ****p = 3.7E–5 (PCDHB16), ****p = 3.7E–5 (BOC), **p = 1.7E–3 (STAG2). d The number of the proteins regulated by the co-mutations and e the associated biological pathways. f Heatmap showing the impacts of the co-mutations of PCDHB16, BOC, SYNE2, BCL9L, and STAG2 (top, two-sided Fisher’s exact test), on the protein level (bottom) in ESCC progression (Kruskal–Wallis test). *p = 0.038 (PCDHB16), *p = 0.038 (BOC), *p = 0.048 (SYNE2), *p = 0.048 (BCL9L), *p = 0.048 (STAG2). The square directs to a subset of patient samples used for WES (n = 102). g The kinase-substrate interactions in ESCC progression (Kruskal–Wallis test). A total of 145 samples for phosphoproteomic profiling are used in this analysis. *p = 0.015 (PRKCD), ***p = 2.4E–4 (SRC), **p = 9.6E–3 (CDK7), ****p = 8.0E–6 (PAK1), ****p = 5.3E–9 (AKT1), ****p = 2.8E–10 (CDK1). ****p < 1.0E–4, ***p < 1.0E–3, **p < 0.01, *p < 0.05, ns. > 0.05. Source data are provided as a Source data file. To investigate specific characteristics of the substages in ESCC progression, we performed the substage-based supervised clustering analysis (Methods) with substage highly expressed proteins (Kruskal–Wallis test, FDR < 0.05, certain substage vs. other substages ratio ≥ 2) (Fig. [226]4b). Based on the enrichment analysis of the substages highly expressed proteins showing that the molecular characterization of the substages in ESCC progression was associated with the lesion invasion layers, and could be grouped into eight models (1–2_1/2_2–3_1/3_2/3_3/3_4/3_5–4_1/4_2/4_3/–5_1/5_2/5_3–6_1/6_2/6_3–7_1 /7_2/7_3–8/9) (Supplementary Fig. [227]5g). Thus, we speculated that the carcinogenesis path in ESCC progression could be linked to those eight models. To further demonstrate the speculation, we performed co-expression analysis of the substage highly expressed proteins and identified eight protein patterns, which were consistent with the results in those eight models correlated to the lesion invasion layers (Supplementary Fig. [228]5h, k). In addition, the consistent findings of the enrichment pathways were also detected in eight models based on substages and eight protein patterns based on the results of co-expression analysis. Therefore, eight panels were identified on the basis of the histopathological stages in a time-resolved mode (Fig. [229]4b and Supplementary Fig. [230]5h). Specifically, in the (sub)stage 1 (normal tissue), the primary metabolic machinery was dominant, such as lipid metabolism (e.g., HMGCS1, ALOX15B, etc.), pyruvate metabolism (e.g., CYP4F12, ACOX3, etc.), and amino acid metabolism (e.g., ANXA1, CTSB, etc.). In stage 2 (hyperplasia stage), the expression of proteins, including PPP2R5A, SERPINB3, and CSTA, involved in the immune response to external damage, were overrepresented. Enhancement of ERBB signaling and NOTCH signaling, and the related proteins (e.g., EGFR, GRB7, etc.) were detected in stage 3 (Tis stage). In stage 4 (lamina propria stage), insulin signaling and IGF signaling-related proteins were overrepresented (e.g., SOD1/2, PPP3CC, etc.). Cell cycle/TLR signaling (e.g., MCM2/3, TLR8, etc.) and DNA repair/mTOR signaling (e.g., MSH3, MTOR, etc.) were the dominant pathways in stage 5 (muscularis mucosa stage), and the stage 6 (sm a stage), respectively. In stage 7 (sm b stage), PI3K-AKT signaling and EMT signaling (e.g., AKT2/3, COL2A1, etc.) were dominant, of which EMT signaling is involved in tumor-initiation and motility^[231]39. Glycolysis and Wnt signaling (e.g., PGK1, WNT2B/7B, etc.) related proteins were overrepresented in stages 8 (T2 stage) and 9 (T3 stage). PGK1, an important glycolytic enzyme, though not mutated at the gene level, was highly increased at the protein level, especially in the T2 and T3 stages of ESCC, which provided further evidence for the crucial role of glycolysis, and suggested the potential function of PGK1 in ESCC progression. In the validation cohort, 256 samples covered 6 of 9 histopathological stages, in which stages 6 (submucosal invasion cancer stage a), 8 (T2), and 9 (T3), were not included. Substage-based supervised clustering analysis identified six proteomic patterns (V1–V6), covering six of eight patterns in the main cohort (Supplementary Fig. [232]5i). In addition, the stages in the proteomic panels were consistent with those in the main cohort. To further demonstrate the result at the molecular level, we analyzed the DEPs of the stages (Kruskal–Wallis test, FDR < 0.05), and found that the molecular characterizations of the six patterns in the validation cohort were consistent with those in the main cohort (Supplementary Fig. [233]5j). Specifically, the V1 (stage 1, normal tissues) proteins were associated with lipid and protein metabolism (e.g., ACOX1, ALOX12, etc.) (Supplementary Fig. [234]5j, k). In the V2 (stage 2, hyperplasia stage), inflammatory response (e.g., COL6A5, SERPINB3, etc.) was dominant. NOTCH signaling (e.g., ADAM10, CREBBP, etc.) and insulin signaling (e.g., IRS2, PPP1CA, etc.) were the dominant pathways in the V3 (stage 3, Tis stage) and V4 (stage 4, lamina propria stage), respectively. At the muscularis mucosa stage (stage 5, V5), cell cycle and TLR signaling (e.g., MCM2, TLR2, etc.) were dominant. The V6 (stage 7, submucosal invasion cancer stage b) proteins participated in EMT signaling and PI3K-AKT-mTOR signaling (e.g., MMP1/2, PIK3CB, etc.) were consistent with the molecular features of the submucosal invasion cancer stage in the main cohort. Therefore, the molecular characterizations of the proteomic patterns in the validation cohort were consistent with those in the main cohort (Supplementary Fig. [235]5k). At the gene level, we observed the stage-associated mutations including PCDHB16 (Fisher’s exact test, p = 0.038), BOC (Fisher’s exact test, p = 0.038), SYNE2 (Fisher’s exact test, p = 0.048), BCL9L (Fisher’s exact test, p = 0.048), STAG2 (Fisher’s exact test, p = 0.048), which were detected in stage 6 and last till to the stage 9 (Fig. [236]4c). In addition, the mutations of PCDHB16, BOC, and STAG2 were co-occurrent with AKAP9 mutation, which exhibited positive impacts on glycolysis. The recurrent co-occurrence of genomic events helps to dissect the genomic complexity underlying tumor progression^[237]40, thus enabling us to explore the functional impacts of the co-occurrence mutations in ESCC progression. Furthermore, the multi-dimensional omics data provided an excellent chance to explore the relationships between the genome and the proteome in the time-resolved ESCC progression. We thus performed enrichment pathway analysis using the DEPs (n = 608), which were the overlapped between the differentially upregulated proteins (n = 1748, Mut vs. WT ratio ≥ 2) and highly expressed phosphoproteins (n = 2818, submucosa stage to T3 stage) (Fig. [238]4d). The results showed that those proteins participated in DNA repair, cell proliferation, and glycolysis (Fig. [239]4e). Specifically, DNA repair (e.g., LIG1 S66, PARP1 T361, etc.) and glycolysis (e.g., GPI Y564, PGAM1 S118, etc.) related proteins were overrepresented in the submucosa stages (stages 6 and 7) and in the advanced stages (stages 8 and 9) of ESCC, respectively, at the protein and phosphoprotein levels (Kruskal–Wallis test, FDR < 0.05, certain stage vs. other stages ratio ≥ 2) (Fig. [240]4f, g). To explore the kinase-substrate interactions in ESCC progression, we integrated the overrepresented kinases in the co-occurrence mutations group, and performed the kinase-substrate analysis based on phosphoprotein data. As a result, we found the kinases-substrate regulation was notably consistent with the dynamic waves in ESCC progression. Specifically, the stage 6 overrepresented kinases (e.g., PRKCD, SRC, CDK7) (Kruskal–Wallis test, FDR < 0.05, stage 6 vs. other stages ratio ≥ 2), showed positive regulation with the DNA repair-related phosphoproteins, such as BCLAF1 S161, EP300 T885, POLR2A T1898, etc. The T2 and T3 stages overrepresented kinases (e.g., PAK1, AKT1, CDK1) (Kruskal–Wallis test, FDR < 0.05, T2 and T3 stages vs. other stages ratio ≥ 2), displayed positive regulation on glycolysis related phosphoproteins, evidenced by the overrepresented phosphorylation of PGAM1 S14 and S118, PFKFB3 S22, LDHA T95, etc. (Fig. [241]4g). These results suggested the potential therapy of kinase inhibitor (e.g., XMD11-85h, Dasatinib, CGP-60474/Seliciclib, etc.) in ESCC progression. Taken together, a carcinogenesis path with eight dynamic waves in ESCC progression was revealed on the basis of the consistency among the genomic aberrations, proteomic alterations, and phosphoproteomic actions: metabolism (e.g., ANXA1 and CTSB) – DNA damage (e.g., PPP2R5A and SERPINB3) – cell proliferation (e.g., EGFR and EGF) – lesion invasion (e.g., GRB2 and PRKCB) – cell cycle (e.g., MCM2/3 and FGFR1OP) – cell differentiation (e.g., MSH3 and MTOR) – tumor metastasis (e.g., MMP2 and AKT2/3) – esophageal carcinogenesis (e.g., PGK1 and CTNNB1) (Supplementary Fig. [242]5k). Furthermore, these results also defined the substages-specific molecular characteristics and uncovered the potential candidates for ESCC malignancy. Proteome clusters of ESCC progression Consensus clustering (Methods) identified two major proteomic clusters which showed association with the classification of the (sub)stages in ESCC progression: Cluster 1 (C1, n = 314) included the stage 1, and 2, and Cluster 2 (C2, n = 472) contained the rest of the stages, including the T2 and T3 stages (Fig. [243]5a, Supplementary Fig. [244]6a, and Supplementary Data [245]5a). Comparative analysis of the later-stage samples (especially the IEN phase samples) of C1 and C2 with the clinic features disclosed that the later samples in the C1 were prominent in the younger ESCC patients (<50 years old, Fisher’s exact test, p = 0.029, 15.0%) (Supplementary Fig. [246]6b), indicating the potential impacts of ages in ESCC progression. In addition, the two clusters were associated with 3 phases in ESCC progression, in which the C2 contained the most samples in the IEN phase (95.3%) and the A-ESCC phase (100%), reflecting more malignancy of the C2 (Supplementary Fig. [247]6c). This distinct two-stage separation suggested that the irreversible fundamental proteome alterations were detected as early as in the Tis stage (stage 3). Van der Schaaf et al. pointed out that the Tis stage was associated with worse survival^[248]41, further indicating the malignancy of Tis stage in ESCC progression. Fig. 5. Proteomic clusters and the impacts of AKAP9 mutation in ESCC progression. [249]Fig. 5 [250]Open in a new tab a Consensus clustering analysis of 786 samples (two-sided Fisher’s exact test). Left: the percentages of the two clusters in 22 substages; Right: 786 samples were classified into two clusters based on proteomic patterns. *p = 0.029 (Age), ****p < 2.2E–16 (Phases), ****p < 2.2E–16 (Substages). b Volcano analysis of DEPs (left) in the two clusters and their associated biological pathways (right) in the two clusters (two-sided Student’s t-test). Biological pathways were analyzed from the Reactome database. C1: the Cluster 1. C2: Cluster 2. c Venn diagram depicting the number of the genes both detected in the genome and proteome in C2. The right shows the significant C2 mutations with mutation frequency over 10%. d Heatmap showing the impacts of AKAP9 mutation on the protein level of AKAP9 (two-sided Student’s t-test, BH-adjusted **p = 8.4E–3). e Scatterplot showing the relationship between log[10] PRKACA and log[10] AKAP9 expression at the protein level (two-sided Pearson’s correlation test, mean ± SD). f GSEA plot (KEGG gene sets) for glycolysis in AKAP9 mutation and WT comparison. g Heatmap depicting the impacts of AKAP9 mutation on glycolysis in ESCC progression (two-sided Student’s t-test, BH-adjusted *p < 0.05). The square directs to a subset of patient samples used for WES (n = 102). h Scatterplots showing the relationship between log[10] G6PD (left)/HK1 (right) and log[10] GPI expression at the protein level (two-sided Pearson’s correlation test, mean ± SD). i A brief summary of the impacts of AKAP9 mutation. ****p < 1.0E–4, ***p < 1.0E–3, **p < 0.01, *p < 0.05, ns. > 0.05. Source data are provided as a Source data file. The significance analysis of microarray (SAM)^[251]42 analysis (Methods) was performed to investigate the characteristics of the two clusters at the protein level, which identified 2922 DEPs between C1 and C2 (t-test, FDR < 0.05, C2 vs. C1 ratio ≥ 2 or ≤ 0.5), including 168 and 2754 proteins overrepresented in the C1 and C2, respectively (Supplementary Fig. [252]6d and Supplementary Data [253]5b). The results showed that the oncogenic pathways-related proteins, including ATM, CDK1, EGFR, CASP3, VCAN, etc., were significantly overrepresented in the C2. On the contrary, the overrepresented proteins of the C1, including KLK12, ACOX3, ALOX15B, and ANXA1, involved in keratinization and inflammatory response, were associated with the primary biological function of normal esophagus (Fig. [254]5b and Supplementary Fig. [255]6g). To validate the molecular characterization of two proteomic clusters, we performed PCA of the 256 samples in the validation cohort, which were also classified into two clusters. Consistent with the findings in the main cohort, the C1 included stages 1 and 2, and the C2 contained the rest of the stages in the validation cohort (Supplementary Fig. [256]6e). In addition, the proteomic clusters in the validation cohort also showed a positive association with the phases in ESCC progression, and specifically, the C1 (Fisher’s exact test, p < 1.0E–4, 89.5%) and C2 (Fisher’s exact test, p < 1.0E–4, 64.4%) consisted most of the samples in the NT phases and the IEN phase, respectively. Furthermore, based on the SAM analysis, the C1 proteins in the validation cohort participated in the primary biological function of normal esophagus, such as keratinization (e.g., KLK12/13, CSTA, etc.), fatty acid metabolism (e.g., ACOX3, ALOX12, etc.), and amino acid metabolism (e.g., MGLL, CYP4F12, etc.) (Supplementary Fig. [257]6f, g). The oncogenic pathways-related proteins were overrepresented in the C2 both in the main cohort and validation cohort, including cell cycle (e.g., CDK1, SKP1, etc.), DNA repair (e.g., DDB1, XPC, etc.), glycolysis (e.g., ENO2, HK3, etc.), and so on. The specific biomarkers of normal esophagus annotated from Human Proteome Atlas (HPA, [258]https://www.proteinatlas.org), including GBP6, TGM1, TGM3, and S100A14, were overrepresented in the C1, which gradually decreased from the normal stage (stage 1) to advanced stages (stages 8 and 9) of ESCC (Kruskal–Wallis test, FDR < 2.2E–16, stage 9 vs. stage 1 ratio ≥ 2) (Supplementary Fig. [259]6h, i). Taken together, these results further confirmed the dysregulation of metabolism and oncogenic pathways in ESCC progression. The AKAP9 mutation enhanced glycolysis in the A-ESCC phase Incorporation of genomic aberrations (n = 40, mutation frequency > 10%) and the C2 overrepresented proteome (n = 3385, C2 vs. C1 ratio ≥ 2) revealed that seven mutations (e.g., AKAP9, MACF1, EP400, etc.) were exclusively detected in the C2 with mutation frequency over 10% (Fig. [260]5c). Furthermore, the mutation of AKAP9 was associated with poor prognosis outcomes of ESCC patients (log-rank test, p = 0.029) indicating the potential functions of AKAP9 mutation in ESCC progression (Supplementary Fig. [261]6j). AKAP9, one of the A-kinase anchoring proteins (AKAPs), binds to the regulatory subunit of AMP-dependent protein kinase (PKA) and achieves the activation of PKA^[262]43, which regulates multiple signaling cascade^[263]44, such as glucose metabolism including glycolysis^[264]45. In our cohort, we found the mutation of AKAP9 upregulated the protein level of AKAP9 (t-test, AKAP9 Mut vs. WT ratio = 6.58, FDR = 8.4E–3) (Fig. [265]5d and Supplementary Fig. [266]6k). Moreover, AKAP9 showed a significantly positive correlation with PRKACA, one of the PKA catalytic subunits^[267]46, at the protein level (Pearson’s R = 0.37, p = 0.011) (Fig. [268]5e). Gene set enrichment analysis (GSEA) demonstrated that the AKAP9 mutation positive-correlated proteins were converged on glycolysis (normalized enrichment score (NES) = 2.37, FDR = 0) (e.g., HK1, GPI, PGK1, etc.), which were gradually increased in ESCC progression (Kruskal–Wallis test, FDR < 0.05, A-ESCC vs. NT ratio ≥ 2) (Fig. [269]5f, g). Consistently, the expression of AKAP9 was significantly overrepresented in the C2 in the validation cohort (t-test, FDR = 3.1E–3, C2 vs. C1 ratio = 3.29) (Supplementary Fig. [270]6l). We integrated AKAP9 positively associated proteins (n = 1344), and found those proteins were involved in glucose metabolism, such as glycolysis, TCA cycle, etc. (Supplementary Fig. [271]6m). These consistent findings further indicated the impacts of AKAP9 mutation on glycolysis in ESCC progression. Allosteric regulation refers to the process where the effect of binding of a ligand at one site of a protein is transmitted to another, often distant, functional site, and thus regulates biological processes including glucose metabolism^[272]47. As well as HK1 and GPI, G6PD, as one of the activators in the transformation process from glucose-6-phosphate to fructose-6-phosphate^[273]48, showed a significant association with GPI (Pearson’s R = 0.33, p = 1.7E–3), further demonstrating the activation of glycolysis (Fig. [274]5h and Supplementary Fig. [275]6n). At the phosphoprotein level, we also detected the increased phosphorylation of glycolysis (e.g., ENO1 S272, PGAM1 S189, etc.) in ESCC progression (Kruskal–Wallis test, FDR < 0.05, A-ESCC vs. NT ratio ≥ 2) (Supplementary Fig. [276]6o). Taken together, the mutation of AKAP9, activated PKA and enhanced the energy formation, elevating the process of glucose metabolism, especially glycolysis, in the A-ESCC phase (Fig. [277]5i). Thus, our study provided valuable insights into the proteomic characteristics with AKAP9 mutation and revealed the impacts on glycolysis in ESCC progression at the multi-omics level. Personalized trajectory revealed six major carcinogenesis tracks of early ESCC The diversity and tumoral heterogeneity of ESCC remain challenging to decide precise clinical strategies for different ESCC patients who have diverse featured carcinogenesis tracks. To this end, we used trajectory inference methods^[278]49 (Methods) to trace the carcinogenesis lineages of early ESCC in the cohort. As a result, six major tracks were classified (13.2%, 7.0%, 7.9%, 43.9%, 19.3%, and 8.7% of patients, respectively) (Fig. [279]6a, b and Supplementary Data [280]6a). The track proteins were determined by the expression trend along with ESCC progression (Kruskal–Wallis test, FDR < 0.05, stage 9 vs. stage 1 ratio ≥ 2), and the dominant pathways of six major tracks were annotated as follows: (1) track 1 (T1, n = 15), biomaterial synthesis; (2) track 2 (T2, n = 8, non-drinking/smoking track), ECM signaling; (3) track 3 (T3, n = 9, female track), cell cycle; (4) track 4 (T4, n = 50, mainstream track (mainstream population of ESCC patients)), DNA repair; (5) track 5 (T5, n = 22, older track), glucose metabolism; and (6) track 6 (T6, n = 10, drinking/smoking track), immune response (Fig. [281]6b and Supplementary Data [282]6b, c). The proteins, of which the expression was gradually decreased in six tracks (Kruskal–Wallis test, FDR < 0.05, stage 9 vs. stage 1 ratio ≤ 0.5), were involved in the primary biology of normal esophagus, such as epithelial cell differentiation (e.g., EVPL and AHNAK2) and keratinization (e.g., FLG, KLK12/13 and SPRP3) (Supplementary Fig. [283]7a). Fig. 6. Personalized trajectory reveals six major carcinogenesis tracks of the early ESCCs. [284]Fig. 6 [285]Open in a new tab a The trajectory of 746 samples (top) and 114 early ESCC cases were grouped into 9 (bottom). b Sankey diagram analysis of 114 early ESCC cases (top, main cohort) and 49 early ESCC cases (bottom, validation cohort). c Venn diagram showing the track mutations (top) and the CAGs (bottom) (two-sided Fisher’s exact test). CAGs: cancer-associated genes. The overlapped mutations are shown in the box. d The CAG-associated track mutations in the early ESCCs. The co-mutations are highlighted on the left (two-sided Fisher’s exact test), and the mutation frequency is shown on the right. The square directed to a subset of patient samples used for WES (n = 68) in early ESCCs. Co-mutations: *p = 0.032 (TP53 and EPAS1), *p = 0.032 (TP53 and EPHA3), ****p = 9.3E–6 (EPAS1 and EPHA3), *p = 2.2E–7 (STAG2 and USP6), ****p = 9.3E–6 (USP6 and AKAP9), ****p = 3.2E–5 (STAG2 and AKAP9). e GSEA plot (KEGG gene sets) for ECM signaling in EPAS1 mutation and WT comparison. f Venn diagram depicting the number of the overlapped proteins enhanced by the mutation of EPAS1 and T2 enhanced phosphoprotein (top), and the associated biological pathways (bottom). SUPs: the significantly upregulated proteins. g Heatmap showing the represented protein in the cell–cell adhesion positive associated with EPAS1 mutation (two-sided Fisher’s exact test). The square directs to a subset of patient samples used for WES (n = 68) in early ESCCs. Co-mutations: *p = 0.032 (TP53 and EPAS1), *p = 0.032 (TP53 and EPHA3), ****p = 9.3E–6 (EPAS1 and EPHA3). h Heatmap showing the phosphorylation of the phosphoproteins in cell–cell adhesion (Kruskal–Wallis test). The square directs to a subset of patient samples used for phosphoproteome (n = 119) in early ESCCs. ****p < 1.0E–4, ***p < 1.0E–3, **p < 0.01, *p < 0.05, ns. > 0.05. Source data are provided as a Source data file. Furthermore, different tracks were closely associated with various clinical features of early-stage ESCC patients, which improved our understanding of tumor heterogeneity. For example, track 3, featured with cell cycle, had the highest proportion of female patients (Fisher’s exact test, p = 7.7E–14, 29.2%) (Supplementary Fig. [286]7c), significantly greater than the natural proportion of female patients in ESCC (no more than 10%)^[287]50. Track 6, featured as immune response, had the highest proportion of patients with drinking/smoking habits (Fisher’s exact test, p = 2.3E–16, 18.1%), which was associated with chronic inflammatory and regulated oxidative stress in various cancer types^[288]51, revealing the unique track in the patients with drinking/smoking habits. To validate the features of the six major tracks, the trajectory inference methods were also applied to 49 early-stage ESCC patients in the validation cohort. As a result, we found all the patients in the validation cohort were also classified into tracks 1–6, of which the molecular characterization was similar to those in track 1 to track 6 from 114 early-stage ESCC patients in the main cohort (Fig. [289]6b, Supplementary Fig. [290]7b, d, and Supplementary Data [291]6d–f). Specifically, the ESCC patients with drinking/smoking habits were prominent in track 6 of the validation cohort (Fisher’s exact test, p = 5.1E–9, 18.6%) (Supplementary Fig. [292]7c). To explore the features of the ESCC patients with drinking/smoking habits, we integrated the gradually increased proteins, which participated in the immune response pathways, including interleukins signaling, antigen processing and presentation, etc. (Supplementary Fig. [293]7e). The consistent findings in the characterizations of track 6 were observed in the main and validation cohorts, further validating the findings of the personalized tracks. At the gene level, we identified 150 differential track mutations, 5 of which were also covered in CAGs with mutation frequency over 10%, including TP53, EPAS1, EPHA3, STAG2, and USP6 (Fig. [294]6c and Supplementary Data [295]6g). Furthermore, we found the mutations of STAG2 (Fisher’s exact test, p = 1.1E–3, track 2 vs. track 4 vs. track 5 vs track 6 = 0% vs. 0% vs. 30.4% vs. 0%) and USP6 (Fisher’s exact test, p = 3.1E–3, track 2 vs. track 4 vs. track 5 vs track 6 = 0% vs. 0% vs. 26.1% vs. 0%) were prominent in track 5, which were all co-occurrence with AKAP9 mutation (Fisher’s exact test, p = 3.2E–5 for STAG2 and 9.3E–6 for USP6), suggesting the roles of glycolysis in the ESCC patients in track 5 (Fig. [296]6d). In addition, GSEA showed the positive impacts of the mutations of STAG2 (NES = 2.10, FDR = 7.2E–4), and USP6 (NES = 2.04, FDR = 0) on pentose phosphate pathway (Supplementary Fig. [297]7f), which were the evidence of glucose metabolism characteristics in track 5. Notably, the mutations of TP53 (Fisher’s exact test, p = 7.2E–3, track 2 vs. track 4 vs. track 5 vs track 6 = 100% vs. 73% vs. 57% vs. 27%), EPAS1 (Fisher’s exact test, p = 3.2E–5, track 2 vs. track 4 vs. track 5 vs track 6 = 71% vs. 0% vs. 4% vs. 0%), and EPHA3 (Fisher’s exact test, p = 2.2E–7, track 2 vs. track 4 vs. track 5 vs track 6 = 86% vs. 0% vs. 0% vs. 0%) were prominent in track 2. As shown in Supplementary Fig. [298]4l, TP53 mutation displayed positive impacts on ECM signaling, demonstrating the characteristics of track 2. GSEA displayed the positive impacts of the mutations of EPAS1 (NES = 1.99, FDR = 0), EPHA3 (NES = 1.83, FDR = 1.6E–3) on ECM signaling (Fig. [299]6e and Supplementary Fig. [300]7g). Furthermore, EPAS1 mutation was co-occurrence with the mutations of TP53 (Fisher’s exact test, p = 0.032) and EPHA3 (Fisher’s exact test, p = 9.3E–6). To elucidate the impacts of EPAS1 genomic aberrations on proteomic alterations and phosphoproteomic actions, we integrated the molecules (n = 126) overlapped in the EPAS1 mutation SUPs and track 2 overrepresented phosphoproteins. As a result, we found those overlapped molecules were involved in cell–cell adhesion (e.g., ITGA5, EGFR, etc.) (Fig. [301]6f, g). Compared with other tracks, the phosphorylation of the cell–cell adhesion proteins (ITGA5 S127 and S128, EGFR S991, etc.) was overrepresented in track 2 (Fig. [302]6h). Together, our study revealed six major carcinogenesis tracks, and found the track specific mutations had positive impacts on the track carcinogenesis lineages of ESCC. PGK1, aberrant glycolytic enzyme, is a potential therapeutic target Abnormal glycolytic metabolism was observed in the whole process of ESCC carcinogenesis at the multi-omics level, which dramatically increased throughout carcinogenesis (Fig. [303]7a). PGK1, the first ATP-generating enzyme in glycolysis, gradually increased in ESCC progression in all 6 tracks at the protein level and identified as the nominated drug-targetable protein in ESCC progression (Supplementary Fig. [304]8a). In addition, we performed Cox regression analysis to assess the prognostic value of PGK1 expression, which was negatively correlated with the OS of ESCC in the TCGA dataset (log-rank test, p = 7.8E–3) (Fig. [305]7b and Supplementary Data [306]7a). To further cross-validate these results, our dataset confirmed that the expression of PGK1 was gradually increased in ESCC at the protein and phosphoprotein levels (Kruskal–Wallis test, FDR = 1.3E–18 for proteome and 4.2E–3 for phosphoproteome, stage 9 vs. stage 1 ratio = 2.30 for proteome and 2.49 for phosphoproteome) (Fig. [307]7c), as demonstrated by immunohistochemistry of ESCC FFPE slides in which PGK1 was gradually increased in the process from T0 stage (normal tissue) to the Tis stage, SM2 stage, and advanced stage (Fig. [308]7d). In addition, the elevated protein of PGK1 was identified in more than 75% (476/672) samples for proteome and 60% (75/125) for phosphoproteome from the stage 2 to stage 9 (n = 672 for proteome and 125 for phosphoproteome) (Fig. [309]7a, c). The only identified motif (sP) of PGK1 was ubiquitously (125/145) detected, and the expression of PGK1 S203 was also gradually increased with ESCC progression (Kruskal–Wallis test, FDR = 0.011, stage 9 vs. stage 1 ratio = 2.44) (Fig. [310]7e and Supplementary Fig. [311]8b). Fig. 7. Aberrant glycolytic metabolism in ESCC and alterations in the activities of its key enzyme, PGK1. [312]Fig. 7 [313]Open in a new tab a Aberrant glycolysis in ESCC progression at the multi-omics level (Kruskal–Wallis test, BH-adjusted *p < 0.05). A total of 786 samples for proteomic profiling and 145 samples for phosphoproteomic profiling are used. b Highly expressed PGK1 is negatively correlated to prognosis (two-sided log-rank test). c Boxplots showing the increased expression (log[10]-transformed Intensity) of PGK1 in ESCC at the protein (left) and phosphoprotein (right) levels (Kruskal–Wallis test). Boxplots show median (central line), upper and lower quartiles (box limits), 1.5× interquartile range (whiskers). A total of 786 samples and 145 samples were used for proteome and phosphoproteome, respectively. Proteome: n (stage 1) = 114, n (stage 2) = 206, n (stage 3) = 259, n (stage 4) = 86, n (stage 5) = 32, n (stage 6) = 17, n (stage 7) = 32, n (stage 8) = 16, n (stage 9) = 24 biologically independent samples examined. Phosphoproteome: n (stage 1) = 20, n (stage 2) = 37, n (stage 3) = 31, n (stage 4) = 14, n (stage 5) = 5, n (stage 6) = 3, n (stage 7) = 9, n (stage 8) = 10, n (stage 9) = 16 biologically independent samples examined. d Immunohistochemistry analysis of PGK1 expression in normal (T0), Tis (T1), SM2 (T1), and advanced stage (T2/T3) tissues. The zone with the dotted lines and red arrow represents PGK1 positive staining. The scale bar indicates 50 µm. e Analysis of the serine motif of PGK1 (sP). The top shows the sequence and phosphorylated sites of PGK1 (S203). The bottom presents that PGK1 S203 is detected at almost samples in ESCC (125/145) and the kinases associated with the motif of PGK1 S203 (“sP”). f The expression of the kinases and the substrates in ESCC progression at the phosphoprotein level. The square directs to a subset of patient samples used for phosphoproteome. g Volcano plot displaying the ERK2-substrates (top) and CDK2-substrates (bottom) regulation results (two-sided Wilcoxon rank-sum test). The red marks the overrepresented substrates (left) and phosphorylations (right) in the kinases highly expressed group. h Histogram showing the Z-score and FDR of the KSEA results. A total of 145 samples for phosphoproteomic profiling are used in the analysis. i The SCNAs of CDK2 have positive effects on PGK1 expression (two-sided Wilcoxon signed-rank test). j The impacts of the SCNAs of CDK2 (middle) on the substrates expression of the kinases (bottom), associated with the PGK1 motif (sP). The square directs to a subset of patient samples used for WES (n = 102). ****p < 1.0E–4, ***p < 1.0E–3, **p < 0.01, *p < 0.05, ns. > 0.05. Source data are provided as a Source data file. To investigate the proteome-phosphoproteome regulation between the motif of PGK1 and the kinases, we applied motif extraction algorithm^[314]52 to MS phosphorylation dataset from 145 samples. The results revealed the association of ERK1/2, CDKs, and GSK-3 with the motif of PGK1 (sP)^[315]53, whose downstream substrates and the related corresponding phosphorylations, including MAP1S (S759), FOXK1 (S420), MAP2K1 (T386), etc., were increased in ESCC progression at the protein and phosphoprotein levels (Kruskal–Wallis test, FDR < 0.05, stage 9 vs. stage 1 ratio ≥ 2) (Fig. [316]7f). In addition, we found that PGK1 and the phosphorylation (S203) were overrepresented in the kinases highly expressed group (two-sided Wilcoxon rank-sum test, FDR < 0.05, highly vs. lowly expressed group ratio ≥ 2), especially in ERK2 and CDK2 highly expressed group (Fig. [317]7g), indicating the potential functions of ERK2 and CDK2 in the activation of PGK1 (S203) in ESCC progression. Our previous study has found ERK1/2 could phosphorylate PGK1 S203 and result in the mitochondrial translocation of PGK1^[318]54. In this study, we performed the kinase-substrate enrichment analysis (KSEA) of the phosphoproteome of PGK1 (S203). The results of the score and FDR showed that ERK2 was the top-rank one kinase to activate PGK1 in ESCC progression (Fig. [319]7h). Furthermore, we found that the SCNAs of CDK2 had positive impacts on the expression of PGK1 (Wilcoxon rank-sum test, FDR = 2.8E–3) (Fig. [320]7i), and the substrates (e.g., GTF2F1, PAK4, etc.) of kinases (Fig. [321]7j). These findings indicated that the SCNA of CDK2 and ERK1/2 synergistically induced the total activity of PGK1 through increasing PGK1 expression both at the protein and phosphoprotein levels. Thus, we proposed that PGK1 S203 could be implicated in identifying the potential therapeutic target to manage ESCC. Next, we investigated the roles of PGK1 in regulating glucose and serine metabolism. Overexpression of PGK1 in KYSE150 cells increased the levels of glycolytic-citrate cycle flux metabolites, including 3-PG, pyruvate, and lactate in glycolysis, and citrate, succinate, and fumarate in citrate cycle (Supplementary Fig. [322]8c and Supplementary Data [323]7b). In addition, the levels of serine and glycine were also increased in PGK1-overexpressing cells. Conversely, the knockdown of PGK1 decreased the concentration of metabolites in glycolysis and citrate cycle, as well as serine and glycine (Supplementary Fig. [324]8d and Supplementary Data [325]7c). Furthermore, the overexpression of ERK2 led to increased Ser-phosphorylation level, but not Thr- or Tyr- phosphorylation levels of PGK1 in KYSE150, KYSE70, ECA109, and TE-8 cell lines (Fig. [326]8a). On the contrary, ERK2 could not increase the Ser-phosphorylation level of PGK1 S203 mutant (S203A) (Fig. [327]8a), indicating that ERK2 phosphorylated PGK1 S203 in ESCC cells. Moreover, the increased Ser-phosphorylation level of PGK1 led to mitochondrial translocation of PGK1 (Fig. [328]8b), which increased the phosphorylation level of PDHK1 at T338 (Fig. [329]8c), and decreased pyruvate dehydrogenase (PDH) activity in ERK2 overexpressing cells (Fig. [330]8d and Supplementary Data [331]7d). It was also observed that the metabolites of glycolysis and serine metabolism were further increased, while citrate cycle metabolites were decreased in ERK2 overexpression cells (t-test, p < 0.05) (Supplementary Fig. [332]8c), suggesting that the overexpression of ERK2 could shut down the pyruvate dehydrogenase complex. Moreover, through Seahorse assay, we found overexpression of PGK1 decreased oxygen consumption rate (OCR) and ATP production, and increased extracellular acidification rate (ECAR) (Fig. [333]8e, f and Supplementary Data [334]7e). In contrast, knockdown of PGK1 increased OCR and ATP production, and decreased ECAR (Fig. [335]8e, f and Supplementary Data [336]7f). In addition, increased expression of ERK2 further decreased OCR and ATP production, and increased ECAR, in PGK1-overexpression cells (Fig. [337]8e, f). Collectively, these results indicated that the increased expression and phosphorylation levels of PGK1 synergistically enhanced glycolysis and serine metabolism. Accordingly, we confirmed that co-overexpression of PGK1 and ERK2 in the above four kinds of ESCC cell lines (KYSE150, KYSE70, ECA109, and TE-8) promoted their proliferation most profoundly, compared to cells overexpressing either PGK1 or ERK2 (t-test, p < 1.0E–4) (Fig. [338]8g and Supplementary Fig. [339]8e, and Supplementary Data [340]7g). In contrast, the knockdown of PGK1 slowed down the cell proliferation, which was further inhibited by double-knock down of PGK1 and ERK2 in those four kinds of ESCC cell lines (t-test, p < 1.0E–3) (Supplementary Fig. [341]8f, g and Supplementary Data [342]7h). Furthermore, unlike PGK1, the overexpression of other glycolytic enzymes, including GAPDH and PGM1, which catalyzed the last and the next step reaction of PGK1, respectively, did not show pro-proliferation effects in KYSE150 cells (t-test, p < 0.05) (Supplementary Fig. [343]8h and Supplementary Data [344]7i). Taken together, these results indicated that the activation of glycolytic enzyme PGK1 was associated with the proliferation of ESCC cells. Fig. 8. PGK1 reprograms glucose metabolism and contributes to ESCC progression. [345]Fig. 8 [346]Open in a new tab a Pan Serine/Threonine/Tyrosine-phosphorylation levels of PGK1 in KYSE150 cells, KYSE70 cells, ECA109 cells, and TE-8 cells. b PGK1 level in mitochondria and cytosol fraction of in KYSE150 cells, KYSE70 cells, ECA109 cells, and TE-8 cells. c The impacts of PGK1 and/or ERK2 on PDHK1 T338 phosphorylation levels in KYSE150 cells, KYSE70 cells, ECA109 cells, and TE-8 cells. d The impacts of PGK1 and/or ERK2 on PDH activity in KYSE150 cells (n = 36) and ECA109 cells (n = 36) (two-sided Student’s t-test, mean ± SD). KYSE150: p = 0.088, *p = 0.031, ****p = 7.7E–6, p = 0.81, p = 0.66 from left to right. ECA109: *p = 0.043, p = 0.052, ****p = 1.3E–7, p = 0.78, p = 0.058 from left to right. e The impacts of overexpressed and knockdown PGK1 and ERK2 on OCR and ATP production (two-sided Student’s t-test, mean ± SD). Twenty-four cell samples were used in the analysis. Top: ****p = 1.6E–8 (PGK1), ****p = 2.6E–12 (PGK1 + ERK2). Bottom: ***p = 7.7E–4. OCR: oxygen consumption rate. f The impacts of overexpressed and knockdown PGK1 and ERK2 on ECAR (two-sided Student’s t-test, mean ± SD). Sixteen cell samples are used in the analysis. Top: **p = 1.2E–3 (PGK1), ****p = 1.0E–11 (PGK1 + ERK2). Bottom: ***p = 3.9E–7. ECAR: extracellular acidification rate. g The impacts of PGK1-overexpression (OE) and/or ERK2-OE on cell proliferation in KYSE150 cells, KYSE70 cells, ECA109 cells, and TE-8 cell (two-sided Student’s t-test, mean ± SD). A total of 320 cell samples were used in the analysis. KYSE150: ****p = 1.5E–7 (PGK1), ****p = 1.6E–5 (ERK2), ****p = 2.9E–9 (PGK1 + ERK2). KYSE70: *p = 0.014 (PGK1), **p = 4.5E–3 (ERK2), ****p = 6.8E–5 (PGK1 + ERK2). ECA109: ****p = 5.4E–7 (PGK1), ****p = 3.2E–6 (ERK2), ****p = 2.7E–9 (PGK1 + ERK2). TE-8: **p = 2.1E–3 (PGK1), ***p = 5.0E–4 (ERK2), ***p = 3.2E–4 (PGK1 + ERK2). h Gemcitabine inhibits cell proliferation (n = 30, two-sided Student’s t-test, **p = 4.8E–3, mean ± SD). i The impacts of PGK1-OE (left) and PGK1 knockdown (right) on the weight of the xenografts in the KYSE150 cells, ECA109 cells, and TE-8 cells (two-sided Student’s t-test, mean ± SD). A total of 130 samples are used in the analysis. Left: KYSE150: ****p = 5.9E–8 (Control and PGK1-OE), ****p = 4.4E–9 (PGK1-OE and PGK1-OE-inhibitor), p = 0.17 (Control and PGK1-OE-inhibitor). ECA109: ****p = 2.1E–8 (Control and PGK1-OE), ****p = 4.4E–9 (PGK1-OE and PGK1-OE-inhibitor), p = 0.17 (Control and PGK1-OE-inhibitor). TE-8: ****p = 1.3E–7 (Control and PGK1-OE), ****p = 4.0E–8 (PGK1-OE and PGK1-OE-inhibitor), p = 1.7E–3 (Control and PGK1-OE-inhibitor). Right: ****p = 8.6E–8 (KYSE150), ****p = 3.4E–7 (ECA109), ****p = 7.6E–8 (TE-8). ****p < 1.0E–4, ***p < 1.0E–3, **p < 0.01, *p < 0.05, ns. > 0.05. Source data are provided as a Source data file. It has been reported that pyrimidine deoxynucleoside analog diphosphates (for example, gemcitabine) could be used as PGK1 inhibitors because L-nucleoside analog diphosphates were selectively phosphorylated by PGK1^[347]55. Therefore, we tested the potential of gemcitabine to inhibit ESCC tumor growth. Firstly, we validated the significant inhibitory effects of gemcitabine on PGK1 (t-test, p = 4.8E–3) with an IC50 of 16.3 nM (Fig. [348]8h and Supplementary Fig. [349]8i, and Supplementary Data [350]7j) using an in vitro enzymatic assay. Secondly, treating the cultured cells with 50 nmol/L gemcitabine significantly decreased the glycolytic flux (such as pyruvate (t-test, p = 1.6E–3) and lactate (t-test, p = 0.014)) and cell proliferation (Supplementary Fig. [351]8j and Supplementary Data [352]7k, l). Thirdly, the xenograft growth-promoting ability of PGK1 overexpression was abolished in gemcitabine treated mice bearing either KYSE150 cells (t-test, p = 4.4E–9), ECA109 cells (t -test, p = 2.2E–9), or TE-8 cells (t-test, p = 4.0E–8), which was consistent with the effects of PGK1 knockdown (t-test, p (KYSE150) = 8.6E–8, p (ECA109) = 3.4E–7, p (TE-8) = 7.6E–8) (Fig. [353]8i and Supplementary Fig. [354]8k, and Supplementary Data [355]7m, n). In addition, we found that increased levels of PGK1 phosphorylation, but not PGK1 protein levels, were associated with tumor weight in xenograft TE-8 cells (Supplementary Fig. [356]8l, m and Supplementary Data [357]7o). Overall, these observations suggested that enhanced PGK1 expression was associated with early ESCC development. Discussion ESCC is one of the most common malignancies, with a relatively low overall 5-year survival rate (less than 30%). Even though the whole genome sequences of ESCC patients have been obtained, the tumor heterogeneity and lack of understanding of the molecular mechanisms in ESCC progression impose many challenging unmet clinical needs in ESCC. It was speculated that tracking the occurrence and development of early ESCC could provide direct evidence of cancer-driving pathways and molecules in each stage. Depending on the substage-based model, our study detected precise temporal molecular switches promoting the progression of ESCC at the multi-omics level. The sequence of canonical cancer pathways was also disclosed, which involved ERBB, NOTCH, IGF, cell cycle, DNA repair, PI3K-AKT, mTOR, glycolysis, and Wnt signaling. The co-occurrence mutations of BOC, AKAP9, and PCDHB16, detected as early as in stage 6 and last till the T2 and T3 stages, had positive impacts on DNA repair in stage 6, and glycolysis in the T2 and T3 stages, respectively. To further validate the findings and results in our cohort, we collected another 256 samples as an independent validation cohort from 49 early-stage ESCC patients. The number of stages in the validation cohort was comparable with the main cohort. Comparative analysis showed consistent findings in a time-resolved mode in ESCC progression, and presented that the gradual decrease of keratinization and lipid metabolism revealed that the loss of normal esophagus was an important event in initiating early ESCC. The biomarkers of ESCC tissues, such as ACTA2, TAGLN, POSTN, PSAP, and THBS1^[358]20,[359]21, were also detected and significantly increased during ESCC progression (Kruskal–Wallis test, FDR < 2.2E–16, stage 9 vs. stage 1 ratio ≥ 2) (Supplementary Fig. [360]5e, f). These findings provided a temporal dimension and trans-omics dimension in understanding the ESCC progression. It is generally agreed that carcinogenesis is a chronic process involving several genes and pathways in different stages. The significant genomic aberrations were then translated to proteomic alterations in ESCC progression. Liu et al. also showed that several significantly mutated genes were shared in the earlier and advanced stage of ESCC^[361]7. Thus, we hypothesized that the mutations were cumulative in ESCC progression, and the key mutation/events which were found in the advanced stages might exist in the earlier stages. In our cohort, we observed that the number of mutations gradually cumulated during ESCC carcinogenesis. The dual-peak of the neo-mutations detected at the Tis stage and the T2 stage, perfectly matched their corresponding pathological phenotypes observed in the clinic, delivering that the mutations were not based on the principle of the linear accumulation model, but surged in certain histopathological stages during the carcinogenesis process. The mutation-surging wave at the advanced stage of ESCC (T2 and T3 stages) in the Fudan cohort indicated that the sudden rise of the mutations at specific stages was regular and would determine the carcinogenesis trend. The neo-mutation peaks in ESCC progression promoted to divide the carcinogenesis of ESCC into three phases (NT, IEN, A-ESCC). Compared to the IEN and A-ESCC phases, the lowest TMB was observed in the NT phase, in which TP53 mutation and DNA repair signature were detected. Biological function of the normal esophagus and inflammatory response were the dominant pathways in the NT phase. The Ca^2+ signal impacts nearly all aspects of cellular life^[362]56, and is implicated in a variety of processes that are important in tumor progression, such as proliferation and invasiveness^[363]57. In our cohort, we found the gain of chr3q was characterized in the IEN phase, in which the cis-effects genes related to the Ca^2+ signal and showed positive impacts on DNA replication. Interestingly, the amplification of TP63 at chr3q28, prominent in the IEN phase, showed positive impacts on DNA replication, implying the combined effects of Ca^2+ signal and TP63 amplification on DNA replication in the IEN phase and further indicating the functions of chr3q gain in the transmit process from the NT phase to the IEN phase and on DNA replication in ESCC progression. The highest TMB was detected in the A-ESCC phase, in which the mutation of MACF1, a large crosslinker that contributes to cell integrity and cell differentiation^[364]58, activated Wnt signaling at the multi-omics level. Interestingly, based on the two proteomic clusters (C1 and C2) which were associated with the three phases in ESCC progression, the C2 (relatively malignant cluster) prominent mutation of AKAP9 upregulated the expression-level of AKAP9 and thus activated PKA, improving the transfer of ATP to ADP and enhancing glycolysis at the protein and phosphoprotein levels (Supplementary Fig. [365]9). These results had explored that all these pathways followed a very precise temporal order during the carcinogenesis progress of ESCC. Except for the diverse lifestyles of the countries^[366]2,[367]3, drinking/smoking habit, gender, and ages are key environmental risks of ESCC^[368]1, while the molecular mechanism is yet unknown. We applied the NMF algorithm to analyze the mutational signatures of the Fudan cohort and other ESCC cohorts, including the TCGA cohort, Moody’s cohort, etc. The integrated findings revealed that the SBS16 signature, which was associated with the ESCC patients with a drinking habit and OLFM4 mutation exhibited positive impacts on CDKs activation and thus enhancing DNA replication evidenced by the related markers, indicating the potential medicative of Dinaciclib for drinking ESCC patients. In addition, the APOBEC signature was the dominant signature in the non-drinking/smoking ESCC patients, and the significant mutation of DCTN2 upregulated the protein level of DCTN2, and displayed positive impacts on RUVBL1 and thus activated DNA replication, implying the potential effects of CB-6644 in the non-drinking/smoking ESCC patients. These results revealed the functions of DNA replication in the IEN phase, whereas the molecular mechanism and clinic strategy were diverse in the ESCC patients with the habits of drinking and non-drinking/smoking. Furthermore, 746 samples from 114 early ESCC patients in the Fudan cohort allowed us to trace the carcinogenesis lineages of early-stage ESCCs, resulting in six tracks closely related to the clinical feature, including gender, age, and risk habits of drinking/smoking. For example, more female ESCC patients were observed in track 3 (Supplementary Fig. [369]9). To further validate the results in our cohort, we collected another 256 samples as an independent validation cohort from 49 early-stage ESCC patient, which were then classified into tracks 1–6 with similar molecular characterization. A large-scale, population-based cohort study has shown that drinking/smoking promotes ESCC carcinogenesis^[370]5. Our study disclosed the carcinogenesis lineage of ESCC patients with drinking/smoking habits (track 6). Integration with the findings in the validation cohort revealed that the immune response pathways, including interleukins signaling, antigen processing and presentation, etc., were the dominant pathways in the ESCC patients with drinking/smoking habits, which was consistent with features of track 6, both in the main cohort and validation cohort, further validating the findings of the personalized tracks. In brief, in addition to the driver waves, this study provided 6 carcinogenesis tracks as references for diverse ESCC clinical therapies.