Abstract

Colorectal cancer (CRC) progression is driven by complex metabolic alterations, including aberrant N‐glycosylation patterns that critically influence tumor development. However, the metabolic and functional roles of N‐glycosylation in CRC remain poorly understood. Herein, comprehensive proteomic and N‐linked intact glycoproteomics analyses are performed on 45 CRC tumors and matched normal adjacent tissues (NATs), identifying 7125 intact N‐glycopeptides from 704 glycoproteins. Through analysis of glycoform expression profiles and structural characteristics, a glycosylation site–protein function association network is constructed to uncover metabolic dysregulation driven by N‐glycosylation in CRC. Moreover, an arithmetic model that integrates N‐glycan expression patterns is developed; it effectively distinguishes tumors from NATs, reflecting metabolic reprogramming in cancer. These findings identify Chloride Channel Accessory 1 (CLCA1) and Olfactomedin 4 (OLFM4) as potential metabolic biomarkers for CRC diagnosis. Immunohistochemistry and Cox regression analyses validate the prognostic power of these markers. Notably, the critical role of site‐specific N‐glycosylation at N196 of adipocyte plasma membrane‐associated protein (APMAP), a key player in tumor metabolism and CRC progression, is highlighted, providing a potential target for therapeutic intervention. These findings offer valuable insights into the metabolic roles of N‐glycosylation in CRC, advancing biomarker discovery, enhancing metabolic‐based diagnostic precision, and improving personalized treatment strategies targeting cancer metabolism.

Keywords: clinical relevance, colorectal cancer, functions, glycoproteomics, intact N‐glycopeptides

The comprehensive proteomic and N‐glycoproteomic analyses of 45 colorectal cancer tissues with matched normal adjacent tissues identified 7125 intact N‐glycopeptides from 704 glycoproteins.
A glycosylation site‐protein function network revealing metabolic dysregulation is constructed, and a model differentiating tumors from normal tissues is developed. CLCA1 and OLFM4 are validated as diagnostic biomarkers, while site‐specific glycosylation at APMAP‐N196 emerges as a therapeutic target.

1. Introduction

Colorectal cancer (CRC) progression involves multifaceted genetic/epigenetic alterations, posing significant challenges to global health.^[1, 2] Aberrant N‐glycosylation is increasingly recognized as pivotal in CRC pathogenesis,^[3, 4] yet key knowledge gaps persist. First, the biological functions of complete N‐glycan structures and their modification sites remain unelucidated.^[5, 6, 7] Second, accurately and comprehensively quantifying the impact of N‐glycosylation on protein state changes continues to be a significant challenge.^[8, 9, 10, 11] Finally, due to the extensive structural diversity and dynamic nature of N‐glycans, accurately pinpointing their specific functions and regulatory mechanisms has remained challenging, limiting our understanding of N‐glycosylation in cancer.^[12] Recent advances in N‐glycoproteomics, particularly in enrichment techniques and spectral interpretation algorithms,^[13, 14] now enable precise characterization of pathologically relevant N‐glycosylation in CRC.
N‐glycosylation not only encodes a wealth of information beyond the primary protein sequence but also exhibits site‐specific alterations that can significantly affect the efficacy of cancer therapy targets. For example, the N‐glycosylation states of Programmed death‐ligand 1 (PD‐L1)/Programmed Cell Death Protein‐1 (PD‐1) and Epidermal growth factor receptor (EGFR) are known to affect the outcomes of treatments targeting these molecules.^[15, 16] Notably, N‐glycosylation patterns themselves are emerging as CRC diagnostic/prognostic biomarkers, as their disease‐specific alterations often precede morphological changes.^[17, 18, 19] This analytical power uniquely reveals microenvironmental dysregulation independent of protein abundance/complexity, with particular value for early‐stage biomarker discovery.^[20]

Therefore, to gain a deeper understanding of the relationship between N‐glycosylation dysregulation and CRC progression, and to identify potential therapeutic targets and biomarkers for CRC, we performed comprehensive proteomic and N‐linked intact glycoproteomics analyses on 45 CRC tumors and their matched normal adjacent tissues (NATs). In total, 7125 intact N‐glycopeptides (IGPs) corresponding to 704 glycoproteins were identified. We systematically investigated the potential functions associated with the cellular localization and expression profiles of different glycoforms. By integrating the structural characteristics and distribution differences of these glycoforms, we constructed comprehensive glycosylation site‐protein function association networks to better understand the dysregulation of N‐glycosylation in CRC. Furthermore, we developed an arithmetic model to integrate and characterize the complex global changes in N‐glycans in CRC through asynchronous overexpression of sialylated and mannosylated glycoforms.
Additionally, a logistic algorithm combining the 8 glycoproteins with the most abundant N‐glycans distinguished tumor samples from NATs well, with an area under the receiver operating characteristic curve (ROC AUC) of 0.979. Further analysis using random forest and logistic regression identified Chloride Channel Accessory 1 (CLCA1) and Olfactomedin 4 (OLFM4) as potential CRC biomarkers, achieving AUCs of 1 and 0.969, respectively. Immunohistochemistry (IHC) and multivariate Cox regression analysis confirmed the model's ability to predict patient prognosis, highlighting the potential of glycoproteins as diagnostic features for colorectal cancer. Notably, our study found that N‐glycosylation at the N196 site of adipocyte plasma membrane‐associated protein (APMAP) is significantly reduced in CRC tissues. Functional assays demonstrated that this reduction promotes CRC progression, highlighting APMAP‐N196 N‐glycosylation as a critical bridge in CRC progression. These findings thus pave the way for significant advancements in cancer diagnosis, prognosis, and personalized therapy.

In summary, our study using N‐linked intact glycoproteomics provides valuable perspectives and tools for elucidating the molecular mechanisms of colorectal cancer. These findings offer a scientific basis for better understanding the regulatory mechanisms and specific functions of N‐glycosylation in CRC and provide new avenues for cancer diagnosis and treatment.

2. Results

2.1. Glycoproteomic Landscape of CRC

To elucidate the role of N‐glycosylation in CRC, we performed in‐depth proteomic and N‐glycoproteomics analyses of 45 collected CRC tumors and NATs (Figure 1A). All tissues were lysed to extract proteins, followed by trypsin digestion.
The resulting peptide mixtures were subjected either to proteomic analysis or, after enrichment of IGPs using ZIC‐hydrophilic interaction liquid chromatography (ZIC‐HILIC), to N‐glycosylation analysis. The mass spectrometry data were then analyzed for global proteomics and N‐glycoproteomics using MaxQuant and pGlyco3, respectively. Through these processes, we quantified 7567 proteins and 7125 unique IGPs belonging to 704 proteins (Figure 1B, Tables S2 and S3, Supporting Information). Subsequently, we integrated the omics results with the relevant clinical information of the samples (Table S1, Supporting Information). Principal component analysis (PCA) showed that NATs clustered together and separated distinctly from tumor samples in both proteomics (Figure 1C) and N‐glycoproteomics (Figure 1D). Compared to proteomics, the N‐glycoproteomics profiles exhibited greater variability among samples (Figure 1C,D and Figure S1A,B, Supporting Information), indicating higher heterogeneity and potentially offering more refined avenues for precision molecular therapy. Additionally, Pearson correlation analysis was used to compare our omics data with public data, including The Cancer Genome Atlas (TCGA) transcriptomic and Clinical Proteomic Tumor Analysis Consortium (CPTAC) proteomic data (Figure S1C–H, Supporting Information). The results supported a positive correlation between transcriptomics and proteomics while showing a weak negative correlation between N‐glycoproteomics and the other omics layers.

Figure 1. Overview of N‐glycoproteins in the colorectal cancer cohort. (A) Workflow for N‐glycoproteomics sample preparation and subsequent MS analysis. (B) Number of identified glycopeptides (red dots) and proteins (grey dots). (C) Principal component analysis (PCA) of proteomics data. Blue dots represent tumors, red dots represent NATs. (D) PCA of glycoproteomics data.
Blue dots represent tumors, red dots represent NATs. (E) GO enrichment analysis of glycoproteins. Different colors represent different pathways. (F) Number of glycosites and N‐glycans for each glycoprotein. Different sizes indicate the maximum number of IGPs per N‐glycosite. (G) Glycosites and glycoforms of CEACAM5. Colors represent the N‐glycans of each site, with green indicating glycans with 3–11 hexoses and 2 HexNAc.

To further investigate the subcellular compartments or macromolecular complexes where the detected glycoproteins in our study may function, we performed a Gene Ontology Cellular Component (GO CC) analysis. The results revealed that collagen‐containing extracellular matrix (ECM) (adjusted p = 1.47E‐67), vacuolar lumen (adjusted p = 5.99E‐29), and primary lysosome (adjusted p = 9.86E‐24) were most significantly enriched (Figure 1E). Previous reports have indicated that dysregulation of extracellular proteins is closely associated with the occurrence and progression of colorectal cancer,^[21, 22] and in‐depth studies of ECM protein composition and structure provide new insights for cancer treatment.^[23] However, most of these ECM proteins have multiple N‐linked glycosylation sites and complex N‐glycans, which greatly complicates the study of N‐glycosylation in tumorigenesis. In our data, 57.4% of N‐linked glycosylation sites were found to possess multiple N‐glycan structures, and 30.7% of glycoproteins had multiple glycosylation sites (Figure 1F and Figure S1I–K, Supporting Information). For example, Carcinoembryonic antigen‐related Cell Adhesion Molecule 5 (CEACAM5), a known prognostic marker for CRC,^[24] carries up to 319 N‐glycosylation forms across 9 N‐linked glycosylation sites and 99 N‐glycans in our N‐glycoproteomics data (Figure 1G).
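Summary figures such as the share of glycosites carrying multiple N‐glycans, or the share of glycoproteins with multiple glycosites, can be derived from a flat list of identified IGPs. The following Python sketch shows one way to compute them. The `(protein, site, glycan)` tuple layout, the function name `summarize_igps`, and the toy records are assumptions made for illustration only; they are not taken from the study's pipeline or data.

```python
from collections import defaultdict

def summarize_igps(igps):
    """Summarize site and glycan multiplicity.

    igps: iterable of (protein, site, glycan) tuples, one per identified
    intact glycopeptide (hypothetical layout, assumed for this sketch).
    """
    glycans_per_site = defaultdict(set)   # (protein, site) -> distinct glycans
    sites_per_protein = defaultdict(set)  # protein -> distinct glycosites
    for protein, site, glycan in igps:
        glycans_per_site[(protein, site)].add(glycan)
        sites_per_protein[protein].add(site)

    n_sites = len(glycans_per_site)
    n_proteins = len(sites_per_protein)
    multi_glycan_sites = sum(1 for g in glycans_per_site.values() if len(g) > 1)
    multi_site_proteins = sum(1 for s in sites_per_protein.values() if len(s) > 1)
    return {
        "n_sites": n_sites,
        "n_proteins": n_proteins,
        # Guard against empty input to avoid division by zero.
        "frac_sites_multi_glycan": multi_glycan_sites / n_sites if n_sites else 0.0,
        "frac_proteins_multi_site": multi_site_proteins / n_proteins if n_proteins else 0.0,
    }

# Toy records with made-up protein names, sites, and glycan compositions.
toy = [
    ("P1", 104, "H5N2"),
    ("P1", 104, "H6N2"),
    ("P1", 197, "H5N4F1"),
    ("P2", 55, "H5N2"),
]
stats = summarize_igps(toy)
```

In this toy input, one of three sites carries more than one glycan and one of two proteins has more than one site; applying the same counting to a full IGP table would yield percentages of the kind reported here.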
Overall, N‐glycoproteomics significantly expands the diversity and quantity of membrane and secreted proteins in the proteome, making them potential sources for enhanced tumorigenic capacity and providing numerous targets for cancer treatment.

2.2. A Potential Interaction Mode Relies on Different Glycoforms

To explore the impact of the complex structure of N‐glycosylation on its functions, we defined five glycoforms based on monosaccharide composition: sialylated (Sia, A), fucosylated (Fuc, F), sialylated‐fucosylated (FA), high mannose (Hm, H), and high HexNAc (Hn, N) (Figure 2A). By grouping glycoforms according to the presence or absence of sialic acid, fucose, or extended mannose residues, we simplified the complex N‐glycan landscape. N‐glycoproteomics analysis showed different proportions of N‐glycoforms, with Hm (30%) and Fuc (38%) forms dominating in CRC N‐glycoproteomics (Figure S2A, Supporting Information). The Hm group contained the largest number of glycoproteins (Figure S2A, Supporting Information), while N‐glycosylation sites in the Fuc group were more clustered on the same proteins (Figure 2B). Most observed non‐Hm glycoforms were associated with lower mannose content (Figure S2B, Supporting Information), representing a low N‐glycan skeleton, particularly for Fuc. Fucosylation refers to the terminal modification of the peptide proximal GlcNAc moiety within the Hn pentasaccharide core,^[20] and IGPs were largely restricted to Asn‐X‐Ser/Thr/Cys (X not equal to Pro) consensus sequences^[25] (Figure S2C, Supporting Information). Generally, human N‐glycosylation pathways involve at least 173 glycosyltransferases,^[26] with their alternative expression contributing to the generation of diverse glycoforms.
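The Asn‐X‐Ser/Thr/Cys (X not equal to Pro) consensus sequence mentioned above can be located in a protein sequence with a short pattern match. The Python sketch below is a minimal illustration of that rule, not the search procedure used by pGlyco3 or by this study; the function name `find_sequons` and the example peptide are hypothetical.

```python
import re

# Asn followed by any residue except Pro, then Ser/Thr/Cys.
# A lookahead is used so that overlapping sequons (e.g. "NNSC") are all found.
SEQUON = re.compile(r"N(?=[^P][STC])")

def find_sequons(sequence: str) -> list[int]:
    """Return 1-based positions of Asn residues within N-X-S/T/C (X != Pro)."""
    return [match.start() + 1 for match in SEQUON.finditer(sequence.upper())]

# Hypothetical peptide: N2 (NGT), N10 (NNS), and N11 (NSC) match; N6 (NPS) does not.
positions = find_sequons("MNGTANPSQNNSC")
```

A sequon match only marks a candidate site; whether a given Asn is actually glycosylated, and with which glycans, still has to come from the glycoproteomic evidence.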
To investigate whether there are differences among various glycoform IGP motifs, we compared the amino acids surrounding the consensus sequences in the human proteome and found a notable preference for acidic amino acids (e.g., Asp and Glu) in Sia (A; +4, +6, +7, −1, −5) and FA (+4, +6, +7), while polar amino acids such as Tyr or Thr tended to be located in Hm and Fuc (−1, −5) (Figure 2C). For Hn, positively charged amino acids were generally found at positions (−2, −3, −4, −5), and hydrophobic amino acids mainly at positions (−1, +1, +3, +4, +5).

Figure 2. Annotation and analysis of differential intact glycopeptide clusters. (A) Definition of five glycoforms based on the monosaccharide composition of N‐glycans. S: Sialylated IGP, F: fucosylated IGP, FA: sialylated and fucosylated IGP, H: high mannose IGP, N: high HexNAc IGP. (B) Proportions of singly and multiply N‐glycosylated proteins, with colors indicating the number of each IGP per protein. (C) Motif composition of each glycoform. Different colors represent different classifications. (D) Summary of UniProt database localization categories for each glycoform. Red saturation indicates the number of N‐glycans. (E) Bar plot showing the UniProt localization categories for receptor (blue) and ligand (red) proteins detected in our data. (F) Number of each monosaccharide composition in receptor (blue) and ligand (red) proteins. (G) Heatmap displaying the number of each glycoform per glycoprotein. Each column represents a glycoprotein, with colors indicating glycoform counts. Row names indicate each glycoform's name and its proportions in receptor (top) and ligand (bottom) proteins. (H) A potential mutual attraction/binding mode based on different glycoforms. Ligands are represented in grey. Mannose, N‐acetylglucosamine, galactose, fucose, and sialic acid are represented by green circles, blue squares, yellow circles, red triangles, and red diamonds, respectively.
Carbon, nitrogen, and oxygen atoms are represented by black circles, blue circles, and red circles, respectively.

By analyzing the subcellular localization preferences of different