ABSTRACT Colorectal cancer (CRC) exhibits substantial intertumoral heterogeneity, largely attributable to multiple tumor stem‐like cell populations, whose molecular identities and clinical significance remain incompletely defined. This study delineates tumor‐intrinsic stem‐like cell diversity and its prognostic implications through single‐cell transcriptomic profiling of 171,906 tumor epithelial cells (n = 152), integrated with bulk transcriptomic (n = 1389) and genomic (n = 1077) datasets. Functional validation was conducted via in vitro assays and multiplex immunofluorescence. A previously unrecognized lysosome‐associated transmembrane protein 4B‐positive (LAPTM4B^+) stem‐like cell cluster was identified, distinct from the classical leucine‐rich repeat‐containing G‐protein coupled receptor 5‐positive (LGR5^+) population. LAPTM4B^+ cells exhibited MYC pathway activation and 8q chromosomal gains, with preferential enrichment in microsatellite‐stable, POLE wild‐type, and left‐sided tumors. Stratification based on LAPTM4B^+/LGR5^+ stem‐like cell ratios defined four CRC stem‐like subtypes (CSS), with CSS2 (LAPTM4B^+‐dominant) associated with the poorest prognosis (HR = 2.31, p < 0.001). The combined expression of LAPTM4B and LGR5 demonstrated superior predictive power for CRC progression compared to either marker alone (AUC = 0.820 vs. 0.715/0.699), underscoring the synergistic influence of distinct stem‐like cell populations on patient outcomes. These findings provide novel insights into CRC heterogeneity and cooperative interactions among diverse stem‐like populations shaping disease outcomes. 
Keywords: colorectal cancer, leucine‐rich repeat‐containing G‐protein coupled receptor 5 (LGR5), lysosome‐associated transmembrane protein 4B (LAPTM4B), stem‐like cells

This study identified a novel population of LAPTM4B^+ tumor stem‐like cells in addition to the classical LGR5^+ tumor stem‐like cells in colorectal cancer (CRC). The combination of LGR5^+ and LAPTM4B^+ stem‐like cells enables a more refined stratification of CRC, offering potential insights for targeted therapeutic strategies.

1. Introduction

Colorectal cancer (CRC) represents a formidable challenge in oncology, with surgical resection serving as the primary therapeutic strategy. However, despite advancements in surgical techniques, over 30% of patients develop metastases within a few years postsurgery [1]. CRC is characterized by profound heterogeneity [2], primarily reflected in aberrant cellular differentiation. Poorly differentiated tumor cells frequently acquire stem cell‐like properties [3], a major driver of intratumoral heterogeneity [4]. Accumulating evidence underscores the pivotal role of these stem‐like tumor cells in tumor initiation, progression, and metastasis [5, 6]. Thus, delineating intratumoral heterogeneity at the cellular level, particularly within stem‐like tumor cell populations, holds promise for refining CRC treatment strategies.

Over the past decade, investigations into CRC stem‐like cells have identified key markers and pathways. Seminal studies established leucine‐rich repeat‐containing G‐protein coupled receptor 5 (LGR5) as a canonical intestinal stem‐like cell marker [7], while subsequent research revealed heterogeneous stem‐like subsets expressing alternative surface markers such as CD133, CD44, and CD166 [8].
However, a reliance on surface marker‐based classification risks overlooking stem‐like populations defined by intracellular regulators. For instance, nuclear factors like ASCL2 drive stemness via WNT pathway activation, highlighting the limitations of surface marker‐centric approaches in capturing the full spectrum of stem‐like plasticity [9].

Single‐cell RNA sequencing (scRNA‐seq) has transformed the ability to resolve tumor heterogeneity by enabling unbiased transcriptional profiling, circumventing the constraints of surface marker‐dependent methods, and providing precision in dissecting tumor complexity [10]. Recent scRNA‐seq studies have identified enrichment of stem‐like malignant cells [11], tumor‐associated macrophages [12], and stromal cells [13] within metastatic sites, implicating these populations in the metastatic process. However, these studies remain constrained by limited cohort sizes and sample numbers, restricting comprehensive exploration of biological heterogeneity and its clinical relevance. Furthermore, the absence of genomic data impedes the integration of single‐cell transcriptomic alterations with underlying genetic variations.

To overcome these challenges, this study integrates scRNA‐seq with bulk RNA‐seq and whole‐exome sequencing (WES), identifying a previously unrecognized lysosome‐associated transmembrane protein 4B‐positive (LAPTM4B^+) stem‐like population distinct from classical leucine‐rich repeat‐containing G‐protein coupled receptor 5‐positive (LGR5^+) cells. LAPTM4B^+ subsets exhibited 8q chromosomal gains and were enriched in microsatellite‐stable (MSS), DNA polymerase epsilon catalytic subunit A (POLE) wild‐type, and left‐sided CRCs. Bulk transcriptomic analyses across five independent cohorts classified CRC subtypes based on LAPTM4B^+/LGR5^+ stem‐like signatures, linking these subgroups to survival outcomes, transcriptional programs, and genomic profiles.
Patients with predominant LAPTM4B^+ stem‐like populations exhibited the highest risk of disease recurrence. Moreover, combined quantification of LAPTM4B and LGR5 expression demonstrated superior predictive power for tumor progression compared to individual markers, underscoring the synergistic influence of heterogeneous stem‐like cell populations on patient outcomes. Collectively, these findings provide critical insights into CRC stem‐like cell heterogeneity and identify distinct stem‐like subpopulations that shape disease progression. This study advances the understanding of tumor heterogeneity and offers potential avenues for therapeutic stratification.

2. Results

2.1. Single‐Cell Transcriptomic Landscape of Colorectal Cancer

WES and bulk RNA‐seq were performed on surgical resection samples from 148 patients with primary CRC. Additionally, scRNA‐seq was conducted on tumor tissues from 27 patients with CRC. The comprehensive analysis workflow is depicted in Figure 1A.

FIGURE 1. Establishment of the CRC single‐cell atlas. (A) A schematic flowchart delineates the sample collection process for the FAHNU‐SC and FAHNU‐BULK cohorts, specifically curated for CRC. (B) Demographic and clinical characteristics, including sex, age, microsatellite status, tumor stage, and anatomical tumor location, were compiled for 152 patients with CRC. (C) A UMAP projection visualizes 1,001,621 cells from these patients, classified into nine major cell types. (D, E) Distinct UMAP plots further delineate the clustering of nonimmune cells and immune cells within the dataset. (F) A heatmap quantifies the distribution bias of each cell subset between tumor tissue and normal adjacent tissue (NAT) based on the observed‐to‐expected ratio (Ro/e). (G) A UMAP plot highlights copy number variation (CNV) scores across all nonimmune cells. (H) A box plot presents genomic instability scores of nonimmune cells, reflecting chromosomal aberration levels.
To further elucidate CRC heterogeneity, scRNA‐seq data from the FAHNU‐SC cohort (n = 27) were integrated with datasets from four independent scRNA‐seq studies, yielding a combined dataset of 152 patients with primary CRC, encompassing both tumor (n = 152) and adjacent normal tissues (n = 73). Clinical information, including age, pathological stage, tumor location, and microsatellite status, was available for most patients (Figure 1B and Table S1). Integration of single‐cell datasets was performed using scArches, followed by quality control and batch correction, resulting in a final dataset of 1,001,621 cells. Coarse cell type annotation identified nine major cell types: NK cells, T cells, B cells, plasma cells, myeloid cells, epithelial cells, stromal cells, Schwann cells, and mast cells (Figures 1C and S1B). Nonimmune populations were further classified into epithelial and stromal subtypes. The epithelial compartment included goblet cells, Bestrophin‐4 (BEST4) cells, absorptive enterocytes (AEs), enterocytes, tuft cells, and tumor epithelial (Epi T) cells, while stromal cells were categorized into fibroblasts, endothelial cells, and smooth muscle cells (SMCs, Figures 1D and S1A). Myeloid lineage immune cells were further subdivided into neutrophils, plasmacytoid dendritic cells (pDCs), and the mononuclear phagocyte system (MPS, Figures 1E and S1A).

Cell type proportions were analyzed across different datasets and tissue types. Among nonimmune cells, Schwann cells, AEs, tuft cells, enterocytes, and BEST4 cells were predominantly enriched in adjacent normal tissues, whereas Epi T cells were almost exclusively confined to tumor tissues. Within the immune compartment, neutrophils and pDCs, both of myeloid origin, were enriched in tumor tissues (Figures 1F and S1C).
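The tissue distribution bias in Figure 1F is reported as an observed‐to‐expected ratio (Ro/e). The following is a minimal Python sketch of one common way to compute such a ratio. It assumes the expected counts are derived from the row and column totals of a subset‐by‐tissue contingency table, as in a chi‐square test; the text does not specify the exact procedure used. The function name `ro_e` and the toy counts are illustrative and are not taken from the study.

```python
import numpy as np

def ro_e(counts):
    """Observed-to-expected ratio for a contingency table.

    counts: 2-D array, rows = cell subsets, columns = tissue types
            (e.g. tumor, NAT). Expected counts assume independence of
            subset and tissue (row total * column total / grand total).
    """
    counts = np.asarray(counts, dtype=float)
    row_tot = counts.sum(axis=1, keepdims=True)
    col_tot = counts.sum(axis=0, keepdims=True)
    expected = row_tot * col_tot / counts.sum()
    return counts / expected

# Hypothetical counts: two subsets (rows) across tumor and NAT (columns).
toy = np.array([[90, 10],
                [20, 80]])
ratios = ro_e(toy)
print(np.round(ratios, 3))
```

Under this convention, a value above 1 means a subset is observed in a tissue more often than expected by chance, and a value below 1 means it is depleted there.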
Consistent with previous findings, microsatellite instability (MSI) tumors exhibited a higher abundance of immune cells within tumor tissues, whereas MSS tumors showed a greater prevalence of stromal cells (Figure S1D). Regarding tumor location, left‐sided CRC demonstrated a higher proportion of stromal cells (Figure S1E). Among epithelial subsets, Epi T cells exhibited the highest copy number variation (CNV) scores (Figure 1G) and the greatest genomic instability scores (GISs; Figures 1H and S1F), reinforcing their malignant potential.

2.2. Epi T Cell Annotation

To delineate epithelial tumor cell subsets, 171,906 Epi T cells were classified into nine distinct clusters (S1–S9). Among these, S1 represented the most prevalent population, whereas S5 was the least abundant (Figure 2A). In single‐cell transcriptomics, the number of genes expressed per cell often correlates with cellular differentiation potential [14]. Notably, S1 and S9 subsets exhibited the highest gene counts (Figure 2B), along with elevated GISs (Figure 2C). Marker gene analysis corroborated these findings, revealing that S1 cells highly expressed LGR5, a well‐established marker of intestinal cancer stem‐like cells. The remaining subsets displayed distinct molecular signatures: S2 cells were characterized by high PCNA expression, indicative of proliferative activity; S3 cells were marked by NME2; S4 cells by CLEC1A; S5 cells exhibited dual epithelial and immune properties, with HLA‐DRA and HLA‐DRB1 expression; S6 cells expressed mucin family markers MUC1, MUC2, and MUC4; S7 cells were defined by LCK; S8 cells were characterized by KRT7 and AXAN1; and S9 cells exhibited high expression of TFAP2C and LAPTM4B (Figure 2D).

FIGURE 2. Features of tumor stem‐like cell subpopulations.
(A) UMAP dimensionality reduction of 171,906 epithelial tumor (Epi T) cells stratified into nine subsets (S1–S9), with a bar chart illustrating their relative proportions. (B) A box plot displays gene count distributions across S1–S9 subsets. (C) Another box plot compares genomic instability scores among S1–S9 subsets. (D) A heatmap showcases marker genes specific to each subpopulation. (E) A correlation heatmap, derived from Spearman's analysis, assesses gene expression similarities among the nine subpopulations, with red indicating positive correlation and blue denoting negative correlation. (F) A violin plot highlights the shared high expression of specific genes within the S1 and S9 subsets. (G, H) Density plots depict AUCell scores for CRC stem‐like traits and CytoTRACE2 scores across the nine Epi T subsets. (I) RNA velocity streamlines embedded in the UMAP projection visualize transcriptional dynamics among the S1–S9 subpopulations. (J) A heatmap outlines enriched signaling pathways within the nine subsets. (K) Violin plots illustrate the expression patterns of LGR5 and MYC within tumor stem‐like cells in the S1 and S9 subsets.

Expression profile similarity analysis revealed that S1 and S9 shared the most comparable transcriptional features (Figure 2E). Several oncogenes, including CEACAM5 and CEACAM6, were upregulated in both subsets, alongside stemness‐associated genes such as MEX3A [15], ASCL2 [15], and PTPRO [11] (Figure 2F). Stemness scoring via AUCell further confirmed that S1 and S9 subsets exhibited the highest stemness‐related gene signatures (Figure 2G and Table S2). Additionally, CytoTRACE2 predictions indicated that these subsets harbored the greatest differentiation potential (Figure 2H), and RNA velocity analysis suggested that differentiation trajectories originate from S1 and S9 cells toward other tumor epithelial subsets (Figure 2I).
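The subset similarity comparison in Figure 2E rests on Spearman correlation between expression profiles. As a hedged illustration only, the sketch below computes a Spearman correlation matrix between per‐subset profiles using NumPy. It assumes each subset is summarized by one vector of mean gene expression, which the text does not state; the helper names `average_ranks` and `spearman_matrix` and the toy values are hypothetical.

```python
import numpy as np

def average_ranks(x):
    """Rank values from 1..n, assigning tied values their average rank."""
    x = np.asarray(x, dtype=float)
    order = np.argsort(x, kind="mergesort")
    ranks = np.empty(len(x), dtype=float)
    ranks[order] = np.arange(1, len(x) + 1)
    for v in np.unique(x):
        mask = x == v
        ranks[mask] = ranks[mask].mean()
    return ranks

def spearman_matrix(profiles):
    """Spearman correlation between rows (subsets) of a subsets x genes array."""
    ranked = np.vstack([average_ranks(row) for row in profiles])
    # Pearson correlation of the ranks equals Spearman correlation.
    return np.corrcoef(ranked)

# Hypothetical mean-expression profiles for three subsets over four genes.
profiles = np.array([[1.0, 2.0, 3.0, 4.0],
                     [2.0, 4.0, 6.0, 9.0],
                     [4.0, 3.0, 2.0, 1.0]])
rho = spearman_matrix(profiles)
print(np.round(rho, 2))
```

Because Spearman correlation depends only on gene rank order within each profile, it is insensitive to monotone rescaling of expression values, which is one reason it is often preferred for comparing cluster‐level profiles.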
Previous research has implicated MYC and WNT pathway hyperactivation in CRC stem‐like cells [11]. AUCell pathway analysis across all subsets demonstrated that both S1 and S9 cells exhibited elevated WNT and MYC pathway activity. However, WNT overactivation was more pronounced in S1, whereas S9 cells predominantly exhibited MYC activation (Figure 2J). Both LGR5 and MYC, markers of stem‐like characteristics, were highly expressed in these subsets, with S1 showing stronger LGR5 expression and S9 displaying higher MYC expression (Figure 2K). Metabolic profiling revealed that S1 and S9 cells share similar metabolic characteristics, both exhibiting heightened metabolic activity (Figure S2). These results indicate the presence of two highly similar yet distinct stem‐like cell populations in CRC, each characterized by unique activation pathways and functional properties.

Cell–cell interaction analysis demonstrated that S9 cells engaged in slightly more interactions than S1 cells and exhibited active communication with stromal cells (Figure S3A–C). Compared to S1 cells, S9 cells specifically participated in endothelial cell‐mediated CD39 signaling input and generated oncogenic LIFR signaling as an output (Figure S3D) [16].

2.3. LAPTM4B as a Marker Gene of the S9 Cell Subsets

Further characterization of these stem‐like populations showed that both S1 and S9 subsets were predominantly enriched in MSS patients, with S9 cells being almost exclusively present in this subgroup (Figures 3A and S4A). Spatially, S9 cells, along with the S3 cell population, were primarily localized in left‐sided CRC, whereas S6, S7, and S8 cells were predominantly found in right‐sided tumors. S1 cells, in contrast, exhibited no distinct spatial preference (Figures 3B,C and S4A).

FIGURE 3. LAPTM4B as a marker gene for the S9 stem‐like cell subsets.
(A) UMAP projection of epithelial tumor (Epi T) cells differentiates MSI (pink) from MSS (gray) tumors (left panel), while the right line graph quantifies the distribution biases of each cell subgroup based on MSI/MSS status. (B) Distribution patterns of Epi T subgroups according to tumor location. (C) Pie charts illustrating the proportional composition of various Epi T subsets across different tumor sites. (D) CNV profiles of S1–S9 subsets. (E) Heatmap summarizing copy number alterations across all 22 chromosomes in the S1–S9 subsets. (F) Venn diagram depicting the overlap between highly expressed genes in the S9 subset and genes located in the 8q chromosomal region (logFC > 2, FDR < 0.01). (G) UMAP projection mapping LAPTM4B expression in tumor epithelial cells. (H) Kaplan–Meier survival analysis stratified by LAPTM4B expression, with DFS assessed via the log‐rank test to evaluate prognostic significance (n = 985). (I) Integration of the CRC single‐cell atlas with spatial transcriptomic data. (J) Spatial transcriptomic visualization of LAPTM4B and MYC expression in CRC.

CNV profiling, inferred using inferCNV with five normal epithelial cell types as references, revealed that S1 and S9 subsets shared the most